Sentiment Analysis: Project Overview


Scoring process

Generate sentiment scores for earnings conference calls based on the keywords mentioned in each paragraph.

When a keyword appears in the conference call transcript, the paragraph containing it is pulled out and scored using the sentiment model.
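The extraction step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes paragraphs are separated by blank lines, and the function names `find_keyword_paragraphs` and `score_paragraphs` and the `score_fn` callable (a stand-in for the sentiment model) are hypothetical.

```python
import re
from typing import Callable, Iterable


def find_keyword_paragraphs(transcript: str, keywords: Iterable[str]) -> list[tuple[str, str]]:
    """Return (keyword, paragraph) pairs for every paragraph that mentions a keyword."""
    keyword_list = list(keywords)
    # Assumption: paragraphs in the transcript are separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", transcript) if p.strip()]
    hits = []
    for paragraph in paragraphs:
        for keyword in keyword_list:
            # Whole-word, case-insensitive match so "rate" does not match "corporate".
            pattern = rf"\b{re.escape(keyword)}\b"
            if re.search(pattern, paragraph, re.IGNORECASE):
                hits.append((keyword, paragraph))
    return hits


def score_paragraphs(
    hits: list[tuple[str, str]], score_fn: Callable[[str], float]
) -> list[tuple[str, str, float]]:
    """Attach a score to each hit; score_fn stands in for the sentiment model."""
    return [(keyword, paragraph, score_fn(paragraph)) for keyword, paragraph in hits]
```

A paragraph that mentions two keywords is returned once per keyword, so it can be counted under each topic.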

Note: These scores and paragraphs need to be stored in a detail table with the following fields:

Yahoo ticker

Document type

Period

Date of call

Paragraph

Score

The scoring program needs to be adjusted when it moves to the VM so that it writes each score directly to the detail table as soon as it is generated.
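As a rough sketch of writing each score to the detail table right after it is generated, the example below uses SQLite purely for illustration; the notes do not say which database the VM will use. The table name `sentiment_detail`, the column names, and the helper functions are assumptions.

```python
import sqlite3


def create_detail_table(conn: sqlite3.Connection) -> None:
    """Create the detail table if it does not exist (column names are assumed)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS sentiment_detail (
            yahoo_ticker  TEXT NOT NULL,
            document_type TEXT NOT NULL,
            period        TEXT NOT NULL,
            call_date     TEXT NOT NULL,
            paragraph     TEXT NOT NULL,
            score         REAL NOT NULL
        )
        """
    )


def write_score(conn, yahoo_ticker, document_type, period, call_date, paragraph, score):
    """Insert one scored paragraph and commit immediately."""
    conn.execute(
        "INSERT INTO sentiment_detail VALUES (?, ?, ?, ?, ?, ?)",
        (yahoo_ticker, document_type, period, call_date, paragraph, score),
    )
    conn.commit()
```

Committing after every row keeps partial results if a long scoring run is interrupted, at the cost of some write speed.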

Once the scores are generated, the data needs to be summarized into a separate table that can drive the graphs.

The process should read through all the scores for that particular conference call and compute average scores for the summary table, which has the following fields:

Yahoo ticker

Document type

Period

Macro

Average score: sector trend

Number of mentions: sector trend

Weighted average score by importance: sector trend

Average score: financial metric

Number of mentions: financial metric

Weighted average score by importance: financial metric

Average score: regulation

Number of mentions: regulation

Weighted average score: regulation

Total score average

Total score weighted
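One way the summary fields above could be computed for a single conference call is sketched below. It assumes each detail row carries a classification (sector trend, financial metric, regulation) and an importance weight taken from the Topics table, and that the weighted average is the sum of weight times score divided by the sum of weights. The function names and the input layout are hypothetical.

```python
from collections import defaultdict


def _stats(items):
    """Average, mention count, and weighted average for (score, weight) pairs."""
    count = len(items)
    if count == 0:
        return {"average": None, "count": 0, "weighted_average": None}
    total_weight = sum(weight for _, weight in items)
    weighted_sum = sum(score * weight for score, weight in items)
    return {
        "average": sum(score for score, _ in items) / count,
        "count": count,
        # Assumed definition: sum(weight * score) / sum(weight).
        "weighted_average": weighted_sum / total_weight if total_weight else None,
    }


def summarize_call(rows):
    """rows: iterable of (classification, score, weight) for one conference call."""
    groups = defaultdict(list)
    for classification, score, weight in rows:
        groups[classification].append((score, weight))
    summary = {name: _stats(items) for name, items in groups.items()}
    # Totals are computed over every scored paragraph, not over the group averages.
    summary["total"] = _stats([item for items in groups.values() for item in items])
    return summary
```

Each entry of the result maps onto one group of summary columns, with `"total"` covering the total average and total weighted scores.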

Plotting process

Two main line graphs need to be developed with options.

The first is company sentiment, selected by Yahoo ticker, beginning period, and ending period.

The options are average score, weighted average score, and number of times topics are mentioned.

The second is a sector comparison, with weighted score or average score as the options.

A front end is needed to pull data from the table based on the selection and plot it with Matplotlib.
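A possible shape for the company sentiment graph is sketched below. It assumes summary rows are dictionaries with `yahoo_ticker`, `period`, and one key per metric, and that period labels such as "2024Q1" sort correctly as strings. The function names are hypothetical, and Matplotlib is imported inside the plotting function so the selection logic can be used without it.

```python
def select_series(summary_rows, ticker, start_period, end_period, metric):
    """Return (periods, values) for one ticker between two periods, inclusive."""
    selected = sorted(
        (
            row
            for row in summary_rows
            if row["yahoo_ticker"] == ticker
            and start_period <= row["period"] <= end_period
        ),
        key=lambda row: row["period"],
    )
    return [row["period"] for row in selected], [row[metric] for row in selected]


def plot_company_sentiment(periods, values, ticker, metric, path):
    """Draw a line graph of one metric over time and save it to path."""
    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, suitable for a server
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(periods, values, marker="o")
    ax.set_xlabel("Period")
    ax.set_ylabel(metric)
    ax.set_title(f"{ticker} sentiment")
    fig.savefig(path)
    plt.close(fig)
```

The `metric` argument would carry the user's option (average score, weighted average score, or number of mentions), so one function can serve all three choices.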

Support Tables

We need to be able to load Yahoo tickers into the CompanyInfo table so we can classify stocks into sectors and collect daily prices.

Company

Yahoo Ticker

Long Name

Industry / Sector

The front-end program needs to be able to load data from Flask HTML screens.

The back-end load program needs to be able to load the list of companies using the Yahoo ticker as the main key.

Topics table

Topics

keywords

Sector

Classifications

Weight

A front-end program needs to be developed to maintain the keywords to search through.

A back-end program needs to be developed to load data into the table using the keyword as the main key.
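A back-end load keyed on the keyword might look like the sketch below. SQLite is used only for illustration, and the table name `topics`, the column names, and the choice to lowercase keywords before storing them are assumptions rather than decisions recorded in these notes.

```python
import sqlite3


def create_topics_table(conn: sqlite3.Connection) -> None:
    """Create the Topics table with the keyword as the primary key."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS topics (
            keyword        TEXT PRIMARY KEY,
            topic          TEXT NOT NULL,
            sector         TEXT,
            classification TEXT NOT NULL,
            weight         REAL NOT NULL
        )
        """
    )


def load_keyword(conn, keyword, topic, sector, classification, weight):
    """Insert a keyword, or overwrite the existing row with the same keyword."""
    conn.execute(
        "INSERT OR REPLACE INTO topics "
        "(keyword, topic, sector, classification, weight) VALUES (?, ?, ?, ?, ?)",
        # Assumption: keywords are stored lowercase so lookups are case-insensitive.
        (keyword.lower(), topic, sector, classification, weight),
    )
    conn.commit()
```

Because the keyword is the primary key, reloading the same keyword updates its weight and classification instead of creating a duplicate row.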

Loading process

All scored companies need to be loaded into the sentiment detail table via a load program, making sure the other key values (Yahoo ticker, document type, period, date of call) can be added at the time of load.

After the transcripts are loaded, they need to be scored and summarized into the summary table.

Flask App (Search field) -> Company Ticker, Timeframe, etc...

Accomplishment:

Default template for Flask App has been generated. Working on implementing interconnectedness between Flask App and other parts of the project.

Bloomberg Terminal (for now using GCP Bucket/local) -> returns the earnings call data requested from the Flask App

Accomplishment:

Calls have been loaded manually instead of with an automated approach (bad) because the call scraper is a paid service.

Datastore GCP (Stores the earnings calls in the company's respective kind)

Accomplishment:

The code that stores the earnings calls on Datastore runs locally. Working on implementing the flow so that the code works on Cloud Shell.

FinBERT Sentiment Model (.pkl file) resides on the cloud and is run for the requested calls

Accomplishment:

Currently working out of the cloud platform for individual users, but it runs slower on the cloud due to limited computation power (RAM).

Datastore GCP (Stores the computed sentiment in a temporary kind that is populated dynamically on each run of the Flask App)

Accomplishment:

Working on the first part of the Datastore implementation.

Matplotlib -> Accesses data from the GCP Datastore and runs for the specified data, which is sent back to the Flask App for display

Accomplishment:

Graphs are being generated with code. Working on dynamic implementation.

