Saturday, March 30, 2024

Notes 3-30-24

 


The items below are the development that needs to be completed.

Please provide individual status updates for 4/2/24 meeting

Also, need a working model running on VM (scoring model) and front end (App Engine project). The working model does not need to be complete; it only needs what is developed and ready for the analysts.


Company Info Table: Flask process to store data in Company Info
Topics Table: Flask process to store data in Topics
Graph Input: Flask to drive Matplotlib graphs
Graph Average Scores: Graph to plot average scores for one company over periods selected
Graph Weighted Scores: Graph to plot weighted scores for one company over periods selected
Graph number of classifications: Graph to plot number of occurrences per classification for one company over periods selected
Graph to plot average scores for sector: Graph to plot the average scores for each company in a selected sector for a selected period
Graph to plot weighted average scores for sector: Graph to plot the weighted average scores for each company in a selected sector for a selected period
Sentiment scoring: Program to take a conference call, identify each paragraph where a keyword appears, and store the paragraph, keyword and a sentiment score for that paragraph
Sentiment Detail: Program to take a scored conference call and summarize each classification by average score, weighted score and number of times keyword is mentioned
Load conference calls: Flask to upload CC transcripts and execute scoring module
Validate Bank data: Need to review scores and check to make sure they are accurate
Load Topics: Load all keywords, classifications, sector and importance into table
Load Companies: Batch program to load companies to Company Info table custom key
Load existing scores: Need load program to load existing scored data into sentiment detail table. Program needs to add Ticker
Linux: https://cloud.google.com/python/docs/getting-started/getting-started-on-compute-engine
Prepare for Python
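
For the sentiment scoring task in the list above, the paragraph/keyword step might look roughly like the sketch below. This is only an illustration under assumptions: paragraphs are taken to be separated by blank lines, the function name find_keyword_paragraphs and the record field names are made up for this example, and score_fn is a placeholder for whatever sentiment model ends up being used. Keywords are matched at the start of a word, ignoring case, so "Revenue" also matches "revenues".

```python
import re


def find_keyword_paragraphs(transcript, keywords, score_fn):
    """Return one record per (paragraph, keyword) match.

    score_fn is a placeholder for the real sentiment model: it takes the
    paragraph text and returns a numeric score.
    """
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", transcript) if p.strip()]

    # Match at a word start only, so plural forms such as "revenues" count.
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw), re.IGNORECASE) for kw in keywords
    }

    records = []
    for paragraph in paragraphs:
        matched = [kw for kw, pattern in patterns.items() if pattern.search(paragraph)]
        if not matched:
            continue
        score = score_fn(paragraph)  # score the paragraph once, reuse per keyword
        for kw in matched:
            records.append({"keyword": kw, "paragraph": paragraph, "score": score})
    return records
```

A paragraph that mentions two keywords produces two detail records with the same score, which keeps the later per-classification counts simple.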

Tuesday, March 26, 2024

Notes 3-26

 Periodic update of training models


Update existing training models

New training sources? Financial terminology

Customized training for specific terminology

Bank data should be in summary tables

Graphs

Key deliverables => Drill-> stock -> period -> -> classification

  • spot check (eyeball bank sentiment data to see if scoring makes sense)
  • compare to Google sentiment API without training
  • tools to score data

Program to load CC from Google storage bucket, generate sentiment detail, and then run through the sentiment summary module


Tuesday, March 19, 2024

NOTES 3-19-24

 Categories


Presentation


Need to have the 6 banks loaded.


Here are the tickers of large US banks: JPM US, C US, BAC US, GS US, MS US, WFC US


Store them on Google cloud storage bucket


Using the naming convention


TYPE - CC, YAHOO TICKER - JPM, PERIOD - Q123, DATE OF CALL 051023


CC_JPM_Q123_051023
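
A small helper could build and check names in this convention. The sketch below is an assumption-laden illustration: the function names are invented, the call date is assumed to be MMDDYY (so 051023 is read as May 10, 2023), and the period is assumed to be Q plus quarter plus two-digit year.

```python
import re
from datetime import date

# TYPE_TICKER_PERIOD_DATE, e.g. CC_JPM_Q123_051023
_NAME_RE = re.compile(
    r"^(?P<doc_type>[A-Z]+)_(?P<ticker>[A-Z]+)_(?P<period>Q[1-4]\d{2})_(?P<call_date>\d{6})$"
)


def build_name(doc_type, ticker, period, call_date):
    """Build an object name such as CC_JPM_Q123_051023."""
    # Assumption: the date part is month, day, two-digit year.
    return f"{doc_type}_{ticker}_{period}_{call_date:%m%d%y}"


def parse_name(name):
    """Split a name into its four parts, or raise ValueError if it does not fit."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return match.groupdict()
```

Rejecting names that do not fit the pattern at upload time would catch mislabeled transcripts before they reach the scoring module.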



NEED 48 CONFERENCE CALLS DONE 8 PERIODS AND 6 BANKS


GRAPHS FOR EACH COMPANY


SCORE FOR 8 PERIODS

WEIGHTED SCORE FOR 8 PERIODS

NUMBER OF TIMES EACH CATEGORY OCCURS IN CC FOR 8 PERIODS


SECTOR COMPARISON


AVERAGE SCORE OF ALL 6 COMPANIES FOR 8 PERIODS

WEIGHTED AVERAGE SCORE FOR EACH COMPANY FOR 8 PERIODS


FRONT END


NEED MENU FOR 4 SECTIONS OF CODE


  1. COMPANY INFO - CREATE, READ, UPDATE AND DELETE FOR ALL INFORMATION IN THE COMPANY INFO TABLE

  2. TOPIC INFO - CREATE, READ, UPDATE, DELETE FOR ALL INFORMATION IN THE TOPIC INFO TABLE

  3. INPUT SCREEN TO DRIVE GRAPHS - DESIGN HOW THIS SHOULD BE DRIVEN

COMPANY GRAPH OR SECTOR GRAPH

ONCE DECLARED, FOR COMPANY WE HAVE

TICKER

PERIODS

TYPE OF GRAPH, SCORE, WEIGHTED, NUMBER OF TIMES


FOR SECTOR

SECTOR NAME

PERIODS

AVERAGE SCORES 

WEIGHTED SCORE


  4. UPLOAD PROCESS ? WAIT ON THIS
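
One possible way to drive the graph input screen (item 3) is to collect the choices into a single request object and validate it before any graph is drawn. The sketch below is only a design suggestion: the class name GraphRequest, the field names, and the short graph-type labels are all invented for illustration.

```python
from dataclasses import dataclass

# Graph types named in the notes, under hypothetical short labels.
COMPANY_GRAPHS = {"score", "weighted", "count"}  # score, weighted, number of times
SECTOR_GRAPHS = {"average", "weighted"}          # average scores, weighted score


@dataclass(frozen=True)
class GraphRequest:
    scope: str       # "company" or "sector"
    name: str        # ticker for a company graph, sector name for a sector graph
    periods: tuple   # e.g. ("Q123", "Q223")
    graph_type: str

    def __post_init__(self):
        allowed = {"company": COMPANY_GRAPHS, "sector": SECTOR_GRAPHS}.get(self.scope)
        if allowed is None:
            raise ValueError(f"scope must be 'company' or 'sector', got {self.scope!r}")
        if self.graph_type not in allowed:
            raise ValueError(f"{self.graph_type!r} is not a {self.scope} graph type")
        if not self.name or not self.periods:
            raise ValueError("a name and at least one period are required")
```

With this shape, the form handler only needs to build one GraphRequest and pass it to whichever plotting routine matches the scope.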


DATA LOADS.


LOADER PROGRAM TO PUT SCORES THAT ALREADY ARE IN SPREADSHEETS AND LOAD INTO SENTIMENT DETAIL


NEED ALL DATA LOADED FOR COMPANY INFO, EITHER FRONT END OR LOADER

NEED ALL DATA LOADED FOR TOPICS INFO, EITHER FRONT END OR LOADER


SENTIMENT SUMMARY PROGRAM - NEED TO SCAN A CALL AND CALCULATE ALL FIELDS NEEDED TO DRIVE GRAPHS AND STORE IN SENTIMENT SUMMARY TABLE.


AI / ML


NEED TO HAVE PROGRAMS RUN IN PRODUCTION ON VM

MUST BE ABLE TO STORE DATA IN SENTIMENT DETAIL

NEED TO START BUILDING MODEL SO IT CAN LEARN FROM DATA WE SCORE


Monday, March 18, 2024

Sentiment Summary

Let's take 1 record in the AHT LN file that has been scored.


CC_AHT LN_Q22022_6_14_2022



From the analysts


We have divided all words into three categories, namely ‘Very Important’, ‘Important’ and ‘Less so important’ - please refer to attachment and the color coding. The category classification is to allow the student to ascribe a “weight” to each of the key words. For example, a key word that has been assigned as ‘Very Important’ should have more weight/importance in the sentiment score than a key word that is ‘Less so important’.

Here’s an illustration of the weight for each category.

Very Important -> 2x

Important -> 1x

Less so Important -> 0.5x

Using these weights, the sentiment model score should be more “robust”.
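
As a minimal sketch, the three weights can live in one lookup table. The names IMPORTANCE_WEIGHTS and weighted_score are made up for this example, and multiplying the raw score by the weight is an assumption taken from the "2x / 1x / 0.5x" illustration.

```python
# Weights from the analysts' illustration, keyed by lower-cased importance label.
IMPORTANCE_WEIGHTS = {
    "very important": 2.0,
    "important": 1.0,
    "less so important": 0.5,
}


def weighted_score(score, importance):
    """Multiply a paragraph's sentiment score by its keyword's importance weight."""
    key = importance.strip().lower()
    if key not in IMPORTANCE_WEIGHTS:
        raise ValueError(f"unknown importance label: {importance!r}")
    return score * IMPORTANCE_WEIGHTS[key]
```

Normalizing the label before the lookup means "Very Important" and "very important" are treated the same, which helps if the Topics table is loaded from a spreadsheet with inconsistent capitalization.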


What we have to figure out programmatically is how we want to handle the summary file. It is hard to specifically spec out without putting restrictions on the developers. Usually I give developers freedom to decide which is the best way to handle.


One method is:


If we have a scored paragraph


Key Words/Topics | Sector | Key Word Category | Importance
Inflation | All | Macro | Important
Interest rate | All | Macro | Important
Raw Materials inflation | Industrials | Macro | Less so important
Volume | All | Sector trend | Less so important
Revenue | All | Financial metric - All | Very Important
Earnings | All | Financial metric - All | Very Important
Earnings per Share / EPS | All | Financial metric - All | Very Important


So the paragraph



Key Word Category: Financial metric - All
Keyword: Revenue
Paragraph: So let's begin with highlights on Slide 3. Our business continues to perform very well, and experienced accelerated momentum throughout the year, demonstrating the strong levels of demand so clearly present. The strength delivered a record performance driven principally by a 23% increase in North America revenues which led to PBT of $1.8 billion at a 40% increase in earnings per share.
Sentiment Score: 0.9398140907


This paragraph has a score of 0.94 (0.9398 rounded). Because Revenue is Very Important, the weighted score would be 2x that, or about 1.88.


When we do the summary and calculate all the scores, we would use the ratios above to get the weighted scores.



Yahoo Ticker: WM
document type: CC
period: Q123

Average score Macro: 0.8
Number of times mentioned macro in document: 10
Weighted average score of macro by importance: calculated by the weighted value of the keywords found

Average score Sector trend: 0.7
Number of times mentioned sector trend: 12
Weighted average score sector trend by importance: calculated by the weighted value of the keywords found

Average score Financial metric: 0.5
Number of times mentioned Financial metric: 11
Weighted average financial metric by importance: calculated by the weighted value of the keywords found

Average score Regulation: 0.9
Number of times mentioned regulation: 8
Weighted average regulation score: calculated by the weighted value of the keywords found

Total score average: average of the unweighted scores
Total score weighted: average of the weighted scores














Read through all detail records for an individual CC, put counters in place for each category, and calculate the average score for the category.

Check the keyword for how it should be weighted so we can calculate the weighted score.

Increment counters as you scan through all the records in the one call.

When all detail records are counted, write the summary record.
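
The scan described above could be sketched as follows. Several things here are assumptions, not decisions from the notes: the function and field names are invented, the weighted average is taken as the mean of (score x weight) per category, and the totals are averaged over all detail records rather than over the category averages. A normalized weighted average (dividing by the sum of weights) would be a reasonable alternative.

```python
from collections import defaultdict

WEIGHTS = {"Very Important": 2.0, "Important": 1.0, "Less so important": 0.5}


def summarize_call(detail_records, topic_importance):
    """Summarize one call's detail records per category.

    detail_records: dicts with "category", "keyword" and "score".
    topic_importance: maps keyword -> importance label from the Topics table.
    """
    counts = defaultdict(int)
    score_sums = defaultdict(float)
    weighted_sums = defaultdict(float)

    # One pass over the call: bump the counters for each record's category.
    for rec in detail_records:
        category = rec["category"]
        weight = WEIGHTS[topic_importance[rec["keyword"]]]
        counts[category] += 1
        score_sums[category] += rec["score"]
        weighted_sums[category] += rec["score"] * weight

    categories = {
        category: {
            "count": n,
            "average": score_sums[category] / n,
            "weighted_average": weighted_sums[category] / n,
        }
        for category, n in counts.items()
    }

    total_count = sum(counts.values())
    total = None
    if total_count:
        total = {
            "count": total_count,
            "average": sum(score_sums.values()) / total_count,
            "weighted_average": sum(weighted_sums.values()) / total_count,
        }
    return {"categories": categories, "total": total}
```

The returned dictionary holds everything one summary record needs, so writing it to the sentiment summary table can stay a separate step.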















Example below is one detail record:

Key Word Category: Financial metric - All
Keyword: Revenue
Paragraph: So let's begin with highlights on Slide 3. Our business continues to perform very well, and experienced accelerated momentum throughout the year, demonstrating the strong levels of demand so clearly present. The strength delivered a record performance driven principally by a 23% increase in North America revenues which led to PBT of $1.8 billion at a 40% increase in earnings per share.
Sentiment Score: 0.9398140907

Key Words/Topics | Sector | Key Word Category | Importance
Revenue | All | Financial metric - All | Very Important

Revenue is Very Important, so we weight it at 2x the value.

Very Important -> 2x















 

Notes 3-18-24

https://uconn-sa.blogspot.com/  We were able to launch an app engine program from our compute engine instance.   I'd like to get all wo...