Sentiment Analysis: March 2024

Saturday, March 30, 2024

Notes 3-30-24

The items below are the development work that needs to be completed.

Please provide individual status updates for the 4/2/24 meeting.

Also, we need a working model running on the VM (scoring model) and the front end (App Engine project). The working model does not need to be complete; it only needs to include what is developed and ready for the analysts.

Company Info Table	Flask process to store data in Company Info
Topics Table	Flask process to store data in Topics
Graph Input	Flask to drive Matplotlib graphs
Graph Average Scores	Graph to plot average scores for one company over periods selected
Graph Weighted Scores	Graph to plot weighted scores for one company over periods selected
Graph number of classifications	Graph to plot number of occurrences per classification for one company over periods selected
Graph to plot average scores for sector	Graph to plot the average scores for each company in a selected sector for a selected period
Graph to plot weighted average scores for sector	Graph to plot the weighted average scores for each company in a selected sector for a selected period
Sentiment scoring	Program to take a conference call, identify each paragraph where a keyword appears, and store the paragraph, the keyword and a sentiment score for that paragraph
Sentiment Detail	Program to take a scored conference call and summarize each classification by average score, weighted score and number of times a keyword is mentioned
Load conference calls	Flask to upload CC transcripts and execute scoring module
Validate Bank data	Need to review scores and check to make sure they are accurate
Load Topics	Load all keywords, classifications, sector and importance into table
Load Companies	Batch program to load companies to Company Info table custom key
Load existing scores	Need a load program to load existing scored data into the sentiment detail table. Program needs to add Ticker
Linux	https://cloud.google.com/python/docs/getting-started/getting-started-on-compute-engine
	Prepare for Python
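
The "Sentiment scoring" item in the list above could be sketched roughly as follows. This is only a minimal sketch under assumptions: paragraphs are taken to be separated by blank lines, keyword matching is a simple case-insensitive substring check, and `score_fn` is a hypothetical stand-in for whatever scoring model ends up running on the VM. The function and field names are illustrative, not from the project.

```python
import re
from typing import Callable, Dict, List


def score_paragraphs(
    transcript: str,
    keywords: List[str],
    score_fn: Callable[[str], float],  # hypothetical: the real scoring model goes here
) -> List[Dict[str, object]]:
    """Return one detail record per (paragraph, keyword) match."""
    records: List[Dict[str, object]] = []
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", transcript) if p.strip()]
    for paragraph in paragraphs:
        lowered = paragraph.lower()
        score = None  # score a paragraph at most once, and only if a keyword matches
        for keyword in keywords:
            # Simple substring match, so "Revenue" also matches "revenues".
            if keyword.lower() in lowered:
                if score is None:
                    score = score_fn(paragraph)
                records.append(
                    {"keyword": keyword, "paragraph": paragraph, "score": score}
                )
    return records
```

A paragraph that mentions two keywords produces two detail records with the same score. Whether that is the desired behavior is a decision for the developers.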

Tuesday, March 26, 2024

Notes 3-26

Periodic update of training models

Update existing training models

New training sources? Financial terminology

Customized training for specific terminology

Bank data should be in summary tables

Graphs

Key deliverables => Drill-> stock -> period -> -> classification

spot check (eyeball bank sentiment data to see if scoring makes sense)
compare to Google sentiment API without training
tools to score data

Program to load CC from Google storage bucket, generate sentiment detail, and then run it through the sentiment summary module

Tuesday, March 19, 2024

Monday, March 18, 2024

Sentiment Summary

Let's take one record in the AHT LN file that has been scored.

CC_AHT LN_Q22022_6_14_2022

From the analysts

We have divided all words into three categories, namely ‘Very Important’, ‘Important’ and ‘Less so important’; please refer to the attachment and the color coding. The category classification is to allow the student to ascribe a “weight” to each of the key words. For example, a key word that has been assigned as ‘Very Important’ should have more weight/importance in the sentiment score than a key word that is ‘Less so important’.

Here’s an illustration of the weight for each category.

Very Important -> 2x

Important -> 1x

Less so Important -> 0.5x

Using these weights, the sentiment model score should be more “robust”.
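
As a minimal sketch, the weights above could be held in a lookup table and applied to a paragraph score. The names `IMPORTANCE_WEIGHTS` and `weighted_score` are illustrative assumptions, not existing project code.

```python
# Multipliers taken from the analysts' illustration above.
IMPORTANCE_WEIGHTS = {
    "very important": 2.0,
    "important": 1.0,
    "less so important": 0.5,
}


def weighted_score(score: float, importance: str) -> float:
    """Multiply a paragraph's sentiment score by its keyword's importance weight."""
    # Normalize case and whitespace so "Less so Important" and
    # "Less so important" map to the same entry.
    key = " ".join(importance.lower().split())
    if key not in IMPORTANCE_WEIGHTS:
        raise ValueError(f"Unknown importance level: {importance!r}")
    return score * IMPORTANCE_WEIGHTS[key]
```

Raising an error on an unknown importance level is one possible choice; defaulting to 1x would be another.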

What we have to figure out programmatically is how we want to handle the summary file. It is hard to spec this out in detail without putting restrictions on the developers. Usually I give developers the freedom to decide the best way to handle it.

One method is:

If we have a scored paragraph

Key Words/Topics	Sector	Key Word Category	Importance
Inflation	All	Macro	Important
Interest rate	All	Macro	Important
Raw Materials inflation	Industrials	Macro	Less so important
Volume	All	Sector trend	Less so important
Revenue	All	Financial metric - All	Very Important
Earnings	All	Financial metric - All	Very Important
Earnings per Share / EPS	All	Financial metric - All	Very Important

So the paragraph

Key Word Category	Keyword	Paragraph	Sentiment Score
Financial metric - All	Revenue	So let's begin with highlights on Slide 3. Our business continues to perform very well, and experienced accelerated momentum throughout the year, demonstrating the strong levels of demand so clearly present. The strength delivered a record performance driven principally by a 23% increase in North America revenues which led to PBT of $1.8 billion at a 40% increase in earnings per share.	0.9398140907

has a score of about 0.94. Because Revenue is Very Important, the weighted score for this paragraph would be about 1.88, i.e. 2x the score.

When we do the summary and calculate all the scores, we would use the ratios above to get the weighted scores.

Yahoo Ticker	WM
document type	CC
period	Q123
Average score Macro	0.8
Number of times mentioned macro in document	10
Weighted average score of macro by importance		Calculated by the weighted value of the keywords found
Average score Sector trend	0.7
Number of times mentioned sector trend	12
Weighted average score sector trend by importance		Calculated by the weighted value of the keywords found
Average score Financial metric	0.5
Number of times mentioned Financial metric	11
Weighted average financial metric by importance		Calculated by the weighted value of the keywords found
Average score Regulation	0.9
Number of times mentioned regulation	8
Weighted average regulation score		Calculated by the weighted value of the keywords found
Total score average	Average unweighted scores
Total score weighted	Average weighted scores


Read through all detail records for an individual CC, keep counters for each keyword category, and calculate the average score per category
Check each keyword's importance to see how it should be weighted so we can calculate the weighted score
Increment counters as you scan through all the records in the one call
When all detail records are counted, write the summary record
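
The steps above might look something like the following sketch. It rests on several assumptions: detail records are dicts with hypothetical `category`, `keyword` and `score` fields, the weighted average for a category is the mean of (weight x score) over its records, and the totals are taken over all detail records rather than over the category averages. Dividing by the sum of the weights instead would be an equally reasonable reading of "weighted average", so treat this as one option.

```python
from collections import defaultdict
from typing import Dict, List

# Multipliers from the analysts' illustration (keys lower-cased for matching).
WEIGHTS = {"very important": 2.0, "important": 1.0, "less so important": 0.5}


def summarize_call(
    detail_records: List[dict],
    keyword_importance: Dict[str, str],
) -> Dict[str, dict]:
    """Build one summary per keyword category, plus overall totals, for one call."""
    counts = defaultdict(int)
    score_sums = defaultdict(float)
    weighted_sums = defaultdict(float)

    # Scan every detail record once, incrementing the per-category counters.
    for rec in detail_records:
        category = rec["category"]
        importance = keyword_importance[rec["keyword"]]
        weight = WEIGHTS[importance.strip().lower()]
        counts[category] += 1
        score_sums[category] += rec["score"]
        weighted_sums[category] += weight * rec["score"]

    summary: Dict[str, dict] = {}
    for category, n in counts.items():
        summary[category] = {
            "mentions": n,
            "average": score_sums[category] / n,
            "weighted_average": weighted_sums[category] / n,
        }

    # Totals across all detail records (assumption, see lead-in).
    total = sum(counts.values())
    if total:
        summary["_total"] = {
            "mentions": total,
            "average": sum(score_sums.values()) / total,
            "weighted_average": sum(weighted_sums.values()) / total,
        }
    return summary
```

The returned dict can then be flattened into the summary record layout shown earlier (ticker, document type and period would be added by the caller).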


The example below is one detail record, followed by its entry in the topics table.

Key Word Category	Keyword	Paragraph	Sentiment Score
Financial metric - All	Revenue	So let's begin with highlights on Slide 3. Our business continues to perform very well, and experienced accelerated momentum throughout the year, demonstrating the strong levels of demand so clearly present. The strength delivered a record performance driven principally by a 23% increase in North America revenues which led to PBT of $1.8 billion at a 40% increase in earnings per share.	0.9398140907

Key Words/Topics	Sector	Key Word Category	Importance
Revenue	All	Financial metric - All	Very Important


Revenue is Very Important, so we weight its score at 2x the value
Very Important -> 2x
