Let's take one record in the AHT LN file that has been scored.
CC_AHT LN_Q22022_6_14_2022
From the analysts
We have divided all words into three categories, namely ‘Very Important’, ‘Important’ and ‘Less so important’ - please refer to the attachment and the color coding. The category classification allows the student to ascribe a “weight” to each of the key words. For example, a key word that has been assigned ‘Very Important’ should carry more weight in the sentiment score than a key word that is ‘Less so important’.
Here’s an illustration of the weight for each category.
Very Important -> 2x
Important -> 1x
Less so Important -> 0.5x
Using these weights, the sentiment model score should be more “robust”.
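The weights above can be expressed as a simple lookup. This is only a minimal sketch: the names `IMPORTANCE_WEIGHTS`, `importance_weight` and `weighted_score` are made up for illustration, and the case-insensitive matching is an assumption added because the labels are not capitalized consistently in the notes.

```python
# Multipliers taken from the analysts' illustration above.
# All names here are hypothetical, chosen for this sketch only.
IMPORTANCE_WEIGHTS = {
    "very important": 2.0,
    "important": 1.0,
    "less so important": 0.5,
}


def importance_weight(importance: str) -> float:
    """Return the multiplier for an importance label (case-insensitive)."""
    return IMPORTANCE_WEIGHTS[importance.strip().lower()]


def weighted_score(score: float, importance: str) -> float:
    """Scale a raw sentiment score by the weight of its importance label."""
    return score * importance_weight(importance)
```

An unknown label raises a `KeyError` here, which may or may not be what the developers want; a default weight of 1x would be another reasonable choice.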
What we have to figure out programmatically is how we want to handle the summary file. It is hard to spec this out precisely without putting restrictions on the developers. Usually I give developers the freedom to decide the best way to handle it.
One method is:
Suppose we have a scored paragraph, and the keyword list looks like this:
Key Words/Topics | Sector | Key Word Category | Importance |
Inflation | All | Macro | Important |
Interest rate | All | Macro | Important |
Raw Materials inflation | Industrials | Macro | Less so important |
Volume | All | Sector trend | Less so important |
Revenue | All | Financial metric - All | Very Important |
Earnings | All | Financial metric - All | Very Important |
Earnings per Share / EPS | All | Financial metric - All | Very Important |
So take this scored paragraph:
Key Word Category | Keyword | Paragraph | Sentiment Score |
Financial metric - All | Revenue | So let's begin with highlights on Slide 3. Our business continues to perform very well, and experienced accelerated momentum throughout the year, demonstrating the strong levels of demand so clearly present. The strength delivered a record performance driven principally by a 23% increase in North America revenues which led to PBT of $1.8 billion at a 40% increase in earnings per share. | 0.9398140907 |
It has a score of about 0.94. Because Revenue is Very Important, its weight is 2x, so the weighted score for this record would be about 1.88.
When we do the summary and calculate all the scores, we would use the ratios above to get the weighted values.
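As a rough sketch of that lookup, the snippet below maps each keyword from the list to its importance and applies the multiplier to the one detail record. The dictionary and function names are hypothetical, and keying the lookup on the lower-cased keyword is an assumption; the real keyword file may need a different match rule (for example by sector).

```python
# Keyword -> importance, copied from the keyword list above.
# Names and the lower-case matching rule are assumptions for this sketch.
KEYWORD_IMPORTANCE = {
    "inflation": "Important",
    "interest rate": "Important",
    "raw materials inflation": "Less so important",
    "volume": "Less so important",
    "revenue": "Very Important",
    "earnings": "Very Important",
    "earnings per share / eps": "Very Important",
}

WEIGHTS = {"Very Important": 2.0, "Important": 1.0, "Less so important": 0.5}


def weight_for_keyword(keyword: str) -> float:
    """Look up the importance of a keyword and return its multiplier."""
    importance = KEYWORD_IMPORTANCE[keyword.strip().lower()]
    return WEIGHTS[importance]


# The detail record from the example (paragraph text omitted for brevity).
record = {
    "category": "Financial metric - All",
    "keyword": "Revenue",
    "score": 0.9398140907,
}

weighted = record["score"] * weight_for_keyword(record["keyword"])
print(round(weighted, 4))  # 1.8796
```

Using the full score gives roughly 1.88; truncating the score to .93 before multiplying gives 1.86.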
Yahoo Ticker | WM |
document type | CC |
period | Q123 |
Average score Macro | 0.8 |
Number of times mentioned macro in document | 10 |
Weighted average score of macro by importance | Calculated by the weighted value of the keywords found |
Average score Sector trend | 0.7 |
Number of times mentioned sector trend | 12 |
Weighted average score sector trend by importance | Calculated by the weighted value of the keywords found |
Average score Financial metric | 0.5 |
Number of times mentioned Financial metric | 11 |
Weighted average financial metric by importance | Calculated by the weighted value of the keywords found |
Average score Regulation | 0.9 |
Number of times mentioned regulation | 8 |
Weighted average regulation score | Calculated by the weighted value of the keywords found |
Total score average | Average unweighted scores |
Total score weighted | Average weighted scores |
Read through all detail records for an individual CC, keep counters for each keyword category, and calculate the average score for each category.
Check each keyword to see how it should be weighted so we can calculate the weighted score.
Increment the counters as you scan through all the records in the one call.
When all detail records are counted, write the summary record.
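One possible way to code the scan is sketched below. It is not a spec: the function name, the record layout (category, keyword, score) and the output keys are all hypothetical. It also makes two assumptions the notes leave open: the weighted average is the sum of weight x score divided by the sum of weights (so it stays in the same range as the raw scores), and the totals are computed over all detail records rather than as an average of the category averages.

```python
from collections import defaultdict

WEIGHTS = {"Very Important": 2.0, "Important": 1.0, "Less so important": 0.5}


def summarize_call(records, keyword_importance):
    """Build a per-category summary for one call.

    records: iterable of (category, keyword, score) detail records.
    keyword_importance: dict mapping keyword -> importance label.
    """
    mentions = defaultdict(int)        # detail records per category
    score_sum = defaultdict(float)     # sum of raw scores
    weighted_sum = defaultdict(float)  # sum of weight * score
    weight_sum = defaultdict(float)    # sum of weights

    # Scan every detail record once, incrementing the counters.
    for category, keyword, score in records:
        weight = WEIGHTS[keyword_importance[keyword]]
        mentions[category] += 1
        score_sum[category] += score
        weighted_sum[category] += weight * score
        weight_sum[category] += weight

    if not mentions:
        return {}  # no detail records, nothing to summarize

    summary = {}
    for category in mentions:
        summary[category] = {
            "mentions": mentions[category],
            "average": score_sum[category] / mentions[category],
            # Assumption: normalize by the sum of weights.
            "weighted_average": weighted_sum[category] / weight_sum[category],
        }

    # Assumption: totals are taken over all records, not over categories.
    summary["total"] = {
        "average": sum(score_sum.values()) / sum(mentions.values()),
        "weighted_average": sum(weighted_sum.values()) / sum(weight_sum.values()),
    }
    return summary
```

If the analysts instead want the weighted score to scale up with importance (as in the 2x example for Revenue), dividing `weighted_sum` by `mentions` rather than by `weight_sum` would give that behavior; which definition is intended should be confirmed with them.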
The example below is one detail record:
Key Word Category | Keyword | Paragraph | Sentiment Score |
Financial metric - All | Revenue | So let's begin with highlights on Slide 3. Our business continues to perform very well, and experienced accelerated momentum throughout the year, demonstrating the strong levels of demand so clearly present. The strength delivered a record performance driven principally by a 23% increase in North America revenues which led to PBT of $1.8 billion at a 40% increase in earnings per share. | 0.9398140907 |
The matching keyword entry:
Key Words/Topics | Sector | Key Word Category | Importance |
Revenue | All | Financial metric - All | Very Important |
Revenue is Very Important, so we weight it at 2x the value:
Very Important -> 2x