Sentiment Analysis: SEC files

SEC DATA - https://github.com/uconnstamford/Extracting-Data-From-SEC

The code runs on a Compute Engine VM, in the directory home/public/sec/Extracting-Data-From-SEC

SEC Extraction code

======================================================================

import csv

from sec_api import ExtractorApi

from bs4 import BeautifulSoup

extractorApi = ExtractorApi("ENTER SEC API HERE")

filing_url = "https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/tsla-10k_20201231.htm"

section_text = extractorApi.get_section(filing_url, "1A", "text")

section_html = extractorApi.get_section(filing_url, "7", "html")

soup = BeautifulSoup(section_html, 'html.parser')

section_text_html_stripped = soup.get_text()

# newline='' lets the csv module control line endings itself

with open('sec_data.csv', mode='w', encoding='utf-8', newline='') as csv_file:

    writer = csv.writer(csv_file)

    writer.writerow(['Section', 'Content'])

    writer.writerow(['1A', section_text])

    writer.writerow(['7', section_text_html_stripped])

print("Data extracted to sec_data.csv")

=======================================================================

The first statement, import csv, loads Python's csv module, which reads and writes CSV (comma-separated values) files.

https://docs.python.org/3/library/csv.html
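
As a minimal sketch of what the csv module does for the extraction code, the example below writes two rows and reads them back. The file name example.csv and the sample row text are made up for illustration; they are not part of the original code. The point is that text containing commas and line breaks (as SEC sections do) survives the round trip, because the writer quotes such fields.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; the content field contains a comma and a line break
rows = [
    ["Section", "Content"],
    ["1A", "Risk factors, line one\nline two"],
]

# Write into a temporary directory so the sketch leaves no file behind in the project
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# newline="" is what the csv documentation recommends when opening files for csv
with open(path, mode="w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows(rows)

with open(path, mode="r", encoding="utf-8", newline="") as f:
    read_back = list(csv.reader(f))

# The rows read back match the rows written, embedded comma and newline included
print(read_back == rows)
```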

The sec_api module gives programmatic access to filings from EDGAR, the SEC's database where US public companies are required to file information about their performance so that shareholders can make informed investment decisions. For example, public US companies must file quarterly and annual reports with the financial results for the period: Form 10-Q for quarterly results (generally unaudited) and Form 10-K for annual results (audited).

https://www.sec.gov/edgar/searchedgar/companysearch

The statement from sec_api import ExtractorApi loads the ExtractorApi class from the sec-api package. It makes API (application programming interface) calls for SEC filing data, so the information is returned electronically.

https://pypi.org/project/sec-api/

The statement from bs4 import BeautifulSoup loads the Beautiful Soup library.

Beautiful Soup is a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree. In the code above, soup.get_text() strips the HTML tags from the Item 7 section.
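
To show what stripping HTML with get_text() looks like, here is a small sketch. The helper name strip_html and the sample HTML string are hypothetical, made up for this example. The try/except is only there so the sketch also runs where bs4 is not installed: in that case it falls back to the standard library's html.parser, which is a stand-in and not Beautiful Soup itself.

```python
try:
    from bs4 import BeautifulSoup

    def strip_html(html):
        """Return the visible text of an HTML fragment using Beautiful Soup."""
        soup = BeautifulSoup(html, "html.parser")
        # separator=" " joins text pieces with a space; strip=True trims each piece
        return soup.get_text(separator=" ", strip=True)

except ImportError:
    # Fallback (assumption: bs4 is unavailable): collect text with the stdlib parser
    from html.parser import HTMLParser

    class _TextCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.chunks = []

        def handle_data(self, data):
            # Keep only non-empty text between tags
            if data.strip():
                self.chunks.append(data.strip())

    def strip_html(html):
        """Return the visible text of an HTML fragment using html.parser."""
        collector = _TextCollector()
        collector.feed(html)
        collector.close()
        return " ".join(collector.chunks)


# Hypothetical sample, standing in for the HTML returned for a filing section
sample_html = "<div><h2>Item 7</h2><p>Revenue grew <b>31%</b> in 2020.</p></div>"
plain_text = strip_html(sample_html)
print(plain_text)
```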

The line extractorApi = ExtractorApi("ENTER SEC API HERE") creates the Extractor API client. Replace the placeholder with your own API key.

https://sec-api.io/docs/sec-filings-item-extraction-api

The Extractor API extracts any text section from 10-Q, 10-K and 8-K SEC filings, and returns the extracted content in cleaned and standardized text or HTML format. Send the URL of the filing, the section name (e.g. Item 1A) and the return data type (e.g. raw text) to the Extractor API and the extracted content is returned.

You can programmatically extract one or multiple text sections from any 10-Q, 10-K and 8-K filing. The extracted section item is returned either as clear text without HTML tags or as standardized HTML. There is no need to develop your own item extraction algorithm anymore. Amended filings, such as 10-Q/A, 10-K/A and 8-K/A as well as all 10-K form variants, such as 10-KT, 10KSB, are also supported.

Dataset size: all sections of all 10-K, 10-Q and 8-K filings including their variants filed since 1994.
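
Extracting several sections in one pass can be sketched as a small loop. The helper extract_sections and the stand-in fake_get_section below are hypothetical names invented for this example. The only assumption about the real library is the call shape already used in the code above, get_section(filing_url, item, return_type). The stand-in lets the sketch run without an API key or network access.

```python
def extract_sections(get_section, filing_url, items, return_type="text"):
    """Build [item, content] rows by calling get_section once per item."""
    return [[item, get_section(filing_url, item, return_type)] for item in items]


def fake_get_section(filing_url, item, return_type):
    """Stand-in for extractorApi.get_section; returns a label, not real filing text."""
    return f"[{return_type} of Item {item}]"


filing_url = "https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/tsla-10k_20201231.htm"

# Same two items as the extraction code above: 1A and 7
rows = extract_sections(fake_get_section, filing_url, ["1A", "7"])

for item, content in rows:
    print(item, content)
```

In real use, pass extractorApi.get_section in place of fake_get_section. The resulting rows can then be handed to writer.writerows() after the header row.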

