Monday, November 13, 2023

Take-aways-11-10-23

 Rauni


I found the code I was looking at, and it uses matplotlib to create the 3D graph. I've attached the code and a png of the resulting graph to this email. My question: what would the sponsor want us to plot? Because the linear regression plot I had shown them last Friday was plotting average sentiment vs the stock price for the quarter, which didn't look too great. One change I made that greatly improved this graph was instead graphing the average sentiment vs the change in stock price between quarters, which I've also attached.


Now my question is what are the sets I should be plotting on the multiple regression plot? We have the various categories, which we get the sentiment average from, but what would plotting that look like? We have 4 categories: Macro, Sector Trend, Regulation and Financial Metric. Each has its own sentiment score per quarter. So possible points of data are category sentiment, average quarter sentiment, stock price, and change of stock price.


Over 8 quarters the 3 variables are:


  1. Stock change in px (in local currency), eg. WSP

  2. Sentiment of firms transcripts (WSP earnings reports)

  3. Sentiment of sector (engineering and design for WSP) by averaging sentiment of firms in the sector

Here's what I'm looking to complete for the week:


1. Run sentiment on streamlined keywords list

2. Make multiple regression plot using the guidelines discussed on Friday

3. Modify how code takes input and output, label quarters with year (I.E instead of Q1, Q2, etc.. It's Q3 2021, Q4 2021, Q1 2022, etc.)


Charlton

Upload Excel sentiment results files to DataStore

  • data management


Continue data documentation

  • Data table for terms




Saturday, November 11, 2023

Notes 11/11/23

 I have analyzed regression equation for WSP share price. I am of the opinion that Regression analysis won’t help us in arriving the weights as the co-efficient of independent variables can be negative at times. For example in the following regression equation, we can see co-efficient for financial sentiment is -0.057 and Sector sentiment is -0.019. I have tried different multi-regression equations by slicing and dicing the variables and ended-up with negative co-efficient. I spoke to Patrick on this, we are of the opinion that we can hold-on to multi-regression to save students time and effort.  

We can proceed with simple average/sum of sentiment score for the project. Please help me pass-on the message.

I have given Python code and excel spreadsheet. Students can run the code using the above spread sheet if they want to take a look into it.  Let me know if you have any thoughts on this.

Regression equation for WSP share price

Share Price = 2.2869 + (0.3070) * Macro sentiment + (-0.0576) * Financial sentiment + (-0.0109) * Sector sentiment

Python Code:
import pandas as pd
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# Function to load data, perform regression, plot the results, and return the equation
def regress_and_plot(file_path):
    # Load the data
    data = pd.read_excel(file_path, index_col=0).T  # Transpose to have features as columns
    
    # Calculate percentage changes
    data_pct_change = data.pct_change().dropna() * 100  # convert to percentage
    
    # Independent variables (sentiment scores)
    X = data_pct_change.iloc[:, :-1]  # Assuming the last column is the share price
    
    # Dependent variable (share price percentage change)
    y = data_pct_change.iloc[:, -1]  # Assuming the last column is the share price
    
    # Initialize and fit the model
    model = LinearRegression()
    model.fit(X, y)
    
    # Generate predictions
    predictions = model.predict(X)
    
    # Plotting the actual versus predicted values
    plt.figure(figsize=(10, 6))
    plt.scatter(range(len(y)), y, color='blue', label='Actual Percentage Change')
    plt.plot(range(len(y)), predictions, color='red', linestyle='--', label='Predicted Percentage Change')
    
    # Add titles and labels
    plt.title('Regression Line for Share Price Percentage Change')
    plt.xlabel('Time')
    plt.ylabel('Percentage Change in Share Price')
    plt.legend()
    plt.grid(True)
    plt.show()
    
    # Construct the regression equation
    equation = f"Share Price = {model.intercept_:.4f}"
    for coef, name in zip(model.coef_, X.columns):
        equation += f" + ({coef:.4f}) * {name}"
    
    return equation

# Path to the uploaded Excel file
file_path = 'C:/WSP sentimentV2.xlsx' #two months lag in share price from respective quarter to account for earnings call date

# Call the function to perform regression, plot the graph, and get the equation
regression_equation = regress_and_plot(file_path)
print(regression_equation)

Saturday, November 4, 2023

Notes 10/25

Good start need a more detailed breakdown of how this will be delivered.

Need to store paragraphs in datastore with scored and magnitudes.
Then need to loop thru to total in classifications.
Keep writing down the steps needed to accomplish task.
e.g. Pull down data
Store in file
search for words
pull out paragraph
run sentiment on paragraph.
More detailed better specs.
Everything needs to run on the cloud.


Extract Conference Calls from Bloomberg -> Haris 
Average scores per keyword category & run it on multiple conference calls-> Rauni
Analyze the output for each and find trends -> Charlton
Add instructions in README.md -> Haris
Update Slideshow -> Charlton

Notes 3-18-25

https://uconn-sa.blogspot.com/  We were able to launch an app engine program from our compute engine instance.   I'd like to get all wo...