Specifications/Requirements/Notes:
Gets the latest 10-K filings from Tesla and Apple.
Why we need this: to run sentiment analysis on how a company talks about itself, we need to extract text from its SEC filings. The problem is that the SEC extractor API requires URLs, and we cannot hard-code them because every new filing gets a new URL.
This code gets the most recent filings along with a bunch of info about each one, including a link to where the report is stored on the SEC website!
Questions and Breaking Down into Smaller Subtasks
In addition to tickers (like TSLA), SEC also assigns companies a CIK number. Ideally, we stick to only one form of identification.
Research whether we can query the SEC Query API using tickers instead of CIKs.
Can we get a larger list of tickers / work with other teams to see which tickers we need to collect?
Build a map between ticker and CIK.
Code for accessing it from the master table.
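The ticker-to-CIK map could be sketched roughly as below. This is only an illustration: TICKER_TO_CIK and build_10k_query are hypothetical names, the two CIKs are the ones used in the query later in this post, and the payload simply mirrors that query. A real map would be loaded from the master table rather than typed in.

```python
# Hypothetical lookup table; the two CIKs come from the query used in this post.
TICKER_TO_CIK = {"AAPL": "320193", "TSLA": "1318605"}


def build_10k_query(tickers, size=20):
    """Build a Query API payload for the latest 10-K filings of the given tickers."""
    # Raises KeyError if a ticker is missing from the map.
    ciks = [TICKER_TO_CIK[ticker.upper()] for ticker in tickers]
    cik_clause = " OR ".join(ciks)
    return {
        "query": {
            "query_string": {
                "query": f'cik:({cik_clause}) AND formType:"10-K"'
            }
        },
        "from": "0",
        "size": str(size),
        "sort": [{"filedAt": {"order": "desc"}}],
    }


query = build_10k_query(["AAPL", "TSLA"])
```

With this in place, the rest of the code only ever deals with tickers, and the CIKs stay in one spot.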
I got a key from https://sec-api.io/register but I will be limited to 100 queries a month.
Figure out if we can get a key with unlimited queries / see if we can use the API without an API key.
Professor said this limit is okay.
How can we get the code to run on a regular basis?
Every week or something like that.
Google Scheduler / Cron
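A quick sanity check that a weekly schedule fits under the 100-queries-a-month limit. This is a rough sketch: the function names are made up for illustration, and it assumes one run sends one query (the code below covers both companies in a single query).

```python
MONTHLY_QUERY_LIMIT = 100  # limit on the key mentioned above


def worst_case_monthly_queries(runs_per_week, queries_per_run):
    # A calendar month contains at most 5 occurrences of any given weekday,
    # so 5 weeks is a safe upper bound.
    return 5 * runs_per_week * queries_per_run


def fits_budget(runs_per_week, queries_per_run, limit=MONTHLY_QUERY_LIMIT):
    return worst_case_monthly_queries(runs_per_week, queries_per_run) <= limit


weekly_ok = fits_budget(runs_per_week=1, queries_per_run=1)
```

One run a week with one query per run uses at most 5 queries a month, so there is plenty of room even if we add more tickers later.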
How to get a CSV file stored into Entity?
Talk to other teams
Data Store
Problem is already solved; look at the blog.
Photos
I downloaded the CSV file generated by the code and it looks like this on my Trio Office Software:
VERSION 1 - gives an error - look at Version 2 for the correct code
import csv
# pip install with this if sec is giving error (without the quotes): 'pip install sec-api'
from sec_api import QueryApi

myQuery = QueryApi(api_key="key")

query = {
    "query": {
        "query_string": {
            "query": "cik:(320193 OR 1318605) AND formType:\"10-K\""
        }
    },
    "from": "0",
    "size": "5",
    "sort": [{"filedAt": {"order": "desc"}}]
}

queryResponse = myQuery.get_filings(query)
filings = queryResponse["filings"]

# needed to look at documentation very helpful for next part
# https://sec-api.io/docs
field_names = ["id", "accessionNo", "companyName", "companyNameLong", "ticker",
               "cik", "filedAt", "items", "formType", "periodOfReport",
               "linkToHtml", "linkToFilingDetails", "linkToTxt", "description",
               "documentFormatFiles", "dataFiles", "seriesAndClassesContractsInformation",
               "linkToXbrl", "entities"]

with open('recent_10k_filings.csv', 'w', encoding='UTF8') as file:
    writer = csv.DictWriter(file, fieldnames=field_names)
    writer.writeheader()
    writer.writerows(filings)
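I did not write down exactly which error Version 1 gave. One plausible cause, assuming the API response contains keys that are not in field_names, is that csv.DictWriter raises a ValueError by default when a row has fields missing from the header. The small demo below uses a made-up record (the "extraField" key is hypothetical) to show that behaviour and how extrasaction='ignore' avoids it.

```python
import csv
import io

# Made-up record standing in for one filing; "extraField" is hypothetical.
rows = [{"id": "1", "companyName": "Example Co", "extraField": "not in the header"}]
field_names = ["id", "companyName"]


def write_rows(extrasaction):
    """Write the rows to an in-memory CSV and return the text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names, extrasaction=extrasaction)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


try:
    write_rows("raise")  # "raise" is the DictWriter default
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)  # message names the unexpected field

# With "ignore", keys that are not in field_names are silently dropped.
lenient_output = write_rows("ignore")
```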
VERSION 2
import csv
# If the sec_api import fails, install the package first: pip install sec-api
from sec_api import QueryApi

myQuery = QueryApi(api_key="YOUR KEY HERE")

# Latest 10-K filings for Apple (CIK 320193) and Tesla (CIK 1318605), newest first
query = {
    "query": {
        "query_string": {
            "query": "cik:(320193 OR 1318605) AND formType:\"10-K\""
        }
    },
    "from": "0",
    "size": "20",
    "sort": [{"filedAt": {"order": "desc"}}]
}

queryResponse = myQuery.get_filings(query)
filings = queryResponse["filings"]

# The documentation was very helpful for this part:
# https://sec-api.io/docs
field_names = ["id", "accessionNo", "companyName", "linkToHtml"]

# newline='' is what the csv module recommends so rows are not double-spaced on Windows
with open('recent_10k_filings.csv', 'w', newline='', encoding='UTF8') as file:
    # extrasaction='ignore' skips any keys in a filing that are not in field_names
    writer = csv.DictWriter(file, fieldnames=field_names, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(filings)
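Once the CSV exists, the links can be read back and handed to whatever step calls the extractor API. A minimal sketch, assuming the file has the four columns written by Version 2; read_filing_urls is a hypothetical helper name, not part of sec-api.

```python
import csv


def read_filing_urls(csv_path="recent_10k_filings.csv"):
    """Return (companyName, linkToHtml) pairs from the CSV, skipping rows with no link."""
    with open(csv_path, newline="", encoding="UTF8") as file:
        return [
            (row["companyName"], row["linkToHtml"])
            for row in csv.DictReader(file)
            if row["linkToHtml"]
        ]
```

Calling read_filing_urls() after running Version 2 gives the list of URLs without hard-coding any of them, which was the point of this whole exercise.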