Project overview
Scoring process
Generate sentiment scores for earnings conference calls based on the keywords mentioned in each paragraph.
When a keyword appears in the conference call transcript, the paragraph is pulled out and scored using the sentiment model that was created.
Note: These scores and paragraphs need to be stored in a detail table with:
Yahoo Ticker,
document type,
period,
date of call,
paragraph,
score.
The scoring program needs to be adjusted when it moves onto the VM so that it writes directly to the table after each score is generated.
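The scoring step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual scoring program: `DetailRow`, `score_transcript`, and the `score_fn` callback are hypothetical names, paragraphs are assumed to be separated by blank lines, and the `keyword` field is an extra column added here for illustration (the detail table described above does not list it). In the real pipeline `score_fn` would wrap the sentiment model.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DetailRow:
    """One row of the detail table (field names are assumptions)."""
    yahoo_ticker: str
    document_type: str
    period: str
    date_of_call: str
    keyword: str      # extra column, assumed here to tie the row to a topic
    paragraph: str
    score: float


def score_transcript(
    transcript: str,
    keywords: Iterable[str],
    score_fn: Callable[[str], float],
    yahoo_ticker: str,
    document_type: str,
    period: str,
    date_of_call: str,
) -> List[DetailRow]:
    """Pull out every paragraph that mentions a keyword and score it."""
    keyword_list = list(keywords)
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", transcript) if p.strip()]
    rows: List[DetailRow] = []
    for paragraph in paragraphs:
        for keyword in keyword_list:
            # Whole-word, case-insensitive match on the keyword.
            pattern = r"\b" + re.escape(keyword) + r"\b"
            if re.search(pattern, paragraph, re.IGNORECASE):
                rows.append(
                    DetailRow(
                        yahoo_ticker=yahoo_ticker,
                        document_type=document_type,
                        period=period,
                        date_of_call=date_of_call,
                        keyword=keyword,
                        paragraph=paragraph,
                        score=score_fn(paragraph),
                    )
                )
    return rows
```

A paragraph that mentions two keywords produces two rows in this sketch, so each mention can later be counted per topic. Writing the rows to the table would replace the returned list once the program runs on the VM.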
Once the scores are generated, the data needs to be summarized into a separate table which can drive graphs.
The process should read through all the scores for that particular conference call and compute average scores, with these fields:
Yahoo Ticker,
document type,
period,
Macro
Average score Sector trend
Number of times mentioned sector trend
Weighted average score sector trend by importance
Average score Financial metric
Number of times mentioned Financial metric
Weighted average financial metric by importance
Average score Regulation
Number of times mentioned regulation
Weighted average regulation score
Total score average
Total score weighted
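The summary fields above could be computed along these lines. This is a sketch under stated assumptions: the function names are hypothetical, each scored paragraph is assumed to carry its classification (for example "Sector trend", "Financial metric", "Regulation") and the keyword that triggered it, and "weighted by importance" is assumed to mean sum(weight * score) / sum(weight) using the Weight column of the Topics table.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def _stats(pairs: List[Tuple[float, float]]) -> Dict[str, Optional[float]]:
    """Average, mention count, and weighted average for (score, weight) pairs."""
    count = len(pairs)
    if count == 0:
        return {"average": None, "count": 0, "weighted_average": None}
    total_weight = sum(weight for _, weight in pairs)
    average = sum(score for score, _ in pairs) / count
    weighted = (
        sum(score * weight for score, weight in pairs) / total_weight
        if total_weight
        else None
    )
    return {"average": average, "count": count, "weighted_average": weighted}


def summarize_call(
    scored: Iterable[Tuple[str, str, float]],
    weights: Dict[str, float],
) -> Dict[str, Dict[str, Optional[float]]]:
    """Summarize one conference call.

    scored  -- (classification, keyword, score) for each scored paragraph
    weights -- keyword -> importance weight (missing keywords default to 1.0)
    """
    groups: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for classification, keyword, score in scored:
        groups[classification].append((score, weights.get(keyword, 1.0)))

    summary = {name: _stats(pairs) for name, pairs in groups.items()}
    # The "total" entry pools every scored paragraph in the call.
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    summary["total"] = _stats(all_pairs)
    return summary
```

The result holds one entry per classification plus a `total` entry, which maps onto the average, count, and weighted columns of the summary table.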
Plotting process
Two main line graphs need to be developed with options.
The first is company sentiment, driven by Yahoo ticker, beginning period, and ending period.
Options are average score, weighted average score, and number of times topics were mentioned.
The second is a sector comparison, with weighted or plain average as options.
A front end is needed to pull data from the table based on the selection and plot it with Matplotlib.
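As a rough sketch of the first graph, the selection and plotting could look like this. The names `company_series` and `plot_company_sentiment` are hypothetical, summary rows are assumed to be dictionaries, and periods are assumed to be strings such as "2023Q1" that sort in chronological order. Matplotlib is imported inside the plotting function so the selection logic can be used without it.

```python
from typing import Dict, List, Sequence, Tuple


def company_series(
    summary_rows: Sequence[Dict],
    yahoo_ticker: str,
    begin_period: str,
    end_period: str,
    metric: str,
) -> Tuple[List[str], List[float]]:
    """Select one company's rows in a period range and return (periods, values)."""
    selected = [
        row
        for row in summary_rows
        if row["yahoo_ticker"] == yahoo_ticker
        and begin_period <= row["period"] <= end_period
    ]
    selected.sort(key=lambda row: row["period"])
    return [row["period"] for row in selected], [row[metric] for row in selected]


def plot_company_sentiment(
    periods: List[str], values: List[float], title: str, out_path: str
) -> None:
    """Draw a line graph of the selected metric and save it to a file."""
    import matplotlib

    matplotlib.use("Agg")  # render off-screen, e.g. on a server without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(periods, values, marker="o")
    ax.set_xlabel("Period")
    ax.set_ylabel("Score")
    ax.set_title(title)
    fig.savefig(out_path)
    plt.close(fig)
```

The `metric` argument selects which option is plotted, for example an average, weighted average, or mention-count column. The sector comparison could reuse the same shape by filtering on sector instead of ticker and drawing one line per sector.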
Support Tables
We need to be able to load Yahoo tickers into the CompanyInfo table so we can classify stocks into sectors and collect daily prices.
Company
Yahoo Ticker
Long Name
Industry / Sector
The front end program needs to be able to load data from Flask HTML screens.
The back end load program needs to be able to load a list of companies with the Yahoo ticker as the main key.
Topics table
Topics
keywords
Sector
Classifications
Weight
A front end program needs to be developed to maintain the keywords to search through.
A back end program needs to be developed to load data to the table with the keyword as the main key.
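A back end load keyed on the keyword might be sketched as below. This uses Python's built-in `sqlite3` purely as a local stand-in for whatever store the project ends up using; the table layout, column names, and function names are assumptions based on the Topics table fields listed above.

```python
import sqlite3
from typing import List


def create_topics_table(conn: sqlite3.Connection) -> None:
    """Create the topics table with the keyword as the primary key."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS topics (
            keyword        TEXT PRIMARY KEY,
            topic          TEXT NOT NULL,
            sector         TEXT,
            classification TEXT,
            weight         REAL NOT NULL DEFAULT 1.0
        )
        """
    )
    conn.commit()


def upsert_keyword(
    conn: sqlite3.Connection,
    keyword: str,
    topic: str,
    sector: str,
    classification: str,
    weight: float,
) -> None:
    """Insert a keyword, or overwrite the existing row with the same keyword."""
    conn.execute(
        "INSERT OR REPLACE INTO topics "
        "(keyword, topic, sector, classification, weight) VALUES (?, ?, ?, ?, ?)",
        # Keywords are lower-cased so 'Tariff' and 'tariff' share one row.
        (keyword.lower(), topic, sector, classification, weight),
    )
    conn.commit()


def list_keywords(conn: sqlite3.Connection) -> List[str]:
    """Return all keywords to search for, in alphabetical order."""
    return [row[0] for row in conn.execute("SELECT keyword FROM topics ORDER BY keyword")]
```

Because the keyword is the primary key, reloading the same keyword from the maintenance screen updates its weight and classification rather than creating a duplicate.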
Loading process
All scored companies need to be loaded to the sentiment detail table via the load program, making sure the other key values (Yahoo Ticker, document type, period, date of call) can be added at the time of load.
After the transcripts are loaded, they need to be scored and summarized into the summary table.
Flask App (Search field) -> Company Ticker, Timeframe, etc.
Accomplishment:
A default template for the Flask App has been generated. Working on connecting the Flask App to the other parts of the project.
Bloomberg Terminal (for now using GCP Bucket/local) -> returns the earnings call data requested from the Flask App
Accomplishment:
Calls have been loaded manually rather than through an automated approach (not ideal), because the call scraper is a paid service.
Datastore GCP (stores the earnings calls in the company's respective kind)
Accomplishment:
The code that stores the earnings calls on Datastore runs locally. Working on implementing the flow so that the code also works on Cloud Shell.
FinBERT Sentiment Model (.pkl file) resides on the cloud and is run for the requested calls
Accomplishment:
Currently works on the cloud platform for individual users, but runs slower there due to limited computing resources (RAM).
Datastore GCP (stores the computed sentiment in a temporary kind, which is populated dynamically on each run of the Flask App)
Accomplishment:
Working on the first part of the Datastore implementation.
Matplotlib -> accesses data from the GCP Datastore and generates graphs for the specified data, which are sent back to the Flask App for display
Accomplishment:
Graphs are being generated with code. Working on dynamic implementation.