edgar

v0.0.0-...-5347cc3
Published: Nov 1, 2024 License: Apache-2.0 Imports: 15 Imported by: 0

README

Edgar

A crawler to get company filing data from XBRL filings. The fetcher parses through the HTML pages and extracts data based on the XBRL tags that it finds and collects it into filing data arranged by filing date.

Goals

The primary requirement of any stock analysis is the data that the company files with the SEC. Starting in 2010-11, filings were required to use XBRL, which requires the filer to tag each data point. This made identifying and classifying the data a lot easier for machines.

The goal of this package is to get publicly available filings from sec.gov and parse them into organized, queryable data points. The package is being developed to provide an interface for other packages to get filing data for a company and use that data to create insights such as valuations and trends.

Design

The package is primarily organized as a set of interfaces onto the provided functionality. The user is expected to use these interfaces to gather the data and query it as needed.

Interfaces

All interfaces needed to use this package are defined in edgar.go and described below.

FilingFetcher

This is the starting point for using this package. The package is initialized with a fetcher. The user gives the fetcher interface a ticker and filing type to set up a company folder. The interface has an additional API to initialize a company folder from a saved folder.

CompanyFolder

A user is given a company folder holding the retrieved filings for each company (ticker). The user uses the folder to get any filing information related to that company. The filings are indexed internally by filing type and date of filing. When a user of the package requests a filing, the filing is looked up in the cache and, if it is not available, retrieved from EDGAR and populated into the folder.
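The cache-then-fetch lookup described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the filingType, filing and folder types, the fetch callback and the date key format are all hypothetical stand-ins introduced to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// filingType and filing are hypothetical stand-ins for the package's
// FilingType and Filing; they exist only to keep this sketch self-contained.
type filingType string

type filing struct {
	Type    filingType
	FiledOn time.Time
}

// folder indexes filings by type and filing date, as the text describes.
type folder struct {
	cache map[filingType]map[string]*filing
	// fetch stands in for the network retrieval from EDGAR.
	fetch func(filingType, time.Time) (*filing, error)
}

// dateKey is an assumed key format for the date index.
const dateKey = "2006-01-02"

// Filing returns the cached filing if present; otherwise it fetches the
// filing and stores it in the folder before returning it.
func (f *folder) Filing(ft filingType, ts time.Time) (*filing, error) {
	key := ts.Format(dateKey)
	if cached, ok := f.cache[ft][key]; ok {
		return cached, nil
	}
	fetched, err := f.fetch(ft, ts)
	if err != nil {
		return nil, fmt.Errorf("fetching %s filed on %s: %w", ft, key, err)
	}
	if f.cache[ft] == nil {
		f.cache[ft] = make(map[string]*filing)
	}
	f.cache[ft][key] = fetched
	return fetched, nil
}

// demo requests the same filing twice and reports how many fetches ran.
func demo() int {
	calls := 0
	f := &folder{
		cache: make(map[filingType]map[string]*filing),
		fetch: func(ft filingType, ts time.Time) (*filing, error) {
			calls++
			return &filing{Type: ft, FiledOn: ts}, nil
		},
	}
	day := time.Date(2024, time.February, 2, 0, 0, 0, 0, time.UTC)
	for i := 0; i < 2; i++ {
		if _, err := f.Filing("10-K", day); err != nil {
			return -1
		}
	}
	return calls
}

func main() {
	fmt.Println("fetch calls:", demo()) // prints "fetch calls: 1"
}
```

The second request is served from the folder, so the fetch callback runs only once.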

Filing

Filing is an interface to get filing data related to a specific filing. The user uses this interface to extract required data. The Filing is retrieved from the company folder as needed. An error is returned if the data was unavailable.

Installing the Package

To install the Edgar package, run the following command:

go get github.com/hyssa-dev/edgar

Then create a fetcher in your code:

fetcher := edgar.NewFilingFetcher()

Documentation

Constants

This section is empty.

Variables

var CategoryDocs = map[string][]Document{
	"Cover": {
		{
			Category:   "Cover",
			Name:       "Cover Page",
			Type:       "Entity Info",
			Keys:       []string{"DOCUMENT", "ENTITY"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Cover",
			Name:       "Cover Page",
			Type:       "Entity Info",
			Keys:       []string{"COVER"},
			NotKeys:    []string{},
			IsRequired: false,
		},
	},
	"Financial statements": {
		{
			Category:   "Financial statements",
			Name:       "Balance sheet",
			Type:       "Assets",
			Keys:       []string{"BALANCE SHEET"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Financial statements",
			Name:       "Balance sheet",
			Type:       "Assets",
			Keys:       []string{"FINANCIAL POSITION"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Financial statements",
			Name:       "Operations statement",
			Type:       "Operations",
			Keys:       []string{"OPERATIONS"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Financial statements",
			Name:       "Income statement",
			Type:       "Income",
			Keys:       []string{"INCOME"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Financial statements",
			Name:       "Income statement",
			Type:       "Income",
			Keys:       []string{"INCOME", "EARNINGS"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Financial statements",
			Name:       "Cash flow statement",
			Type:       "Cash Flow",
			Keys:       []string{"CASH FLOWS"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Financial statements",
			Name:       "Parenthetical statement",
			Type:       "Parenthetical",
			Keys:       []string{"PARENTHETICAL"},
			NotKeys:    []string{},
			IsRequired: false,
		},
	},
	"Notes to Financial statements": {
		{
			Category:   "Notes to Financial statements",
			Name:       "Notes - Notes on EPS",
			Type:       "Notes on EPS",
			Keys:       []string{"EARNINGS", "SHARE"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Notes to Financial statements",
			Name:       "Notes - Notes on Equity",
			Type:       "Notes on Equity",
			Keys:       []string{"SHAREHOLDER", "EQUITY"},
			NotKeys:    []string{},
			IsRequired: false,
		},
		{
			Category:   "Notes to Financial statements",
			Name:       "Notes - Notes on Debt",
			Type:       "Notes on Debt",
			Keys:       []string{"DEBT"},
			NotKeys:    []string{},
			IsRequired: false,
		},
	},
	"Notes Tables": {{
		Category:   "Notes Tables",
		Name:       "Notes - Notes on EPS",
		Type:       "Tables",
		Keys:       []string{"(TABLES)"},
		NotKeys:    []string{},
		IsRequired: false,
	},
		{
			Category:   "Notes Tables",
			Name:       "Accounting Policies (Tables)",
			Type:       "PoliciesTable",
			Keys:       []string{"Accounting Policies (Tables)"},
			NotKeys:    []string{""},
			IsRequired: false,
		},
	},
	"Notes Details": {
		{
			Category:   "}Notes Details",
			Name:       "Notes Details",
			Type:       "Details",
			Keys:       []string{"(DETAILS)", "(DETAIL)"},
			NotKeys:    []string{""},
			IsRequired: false,
		},
	},

	"Accounting Policies": {
		{
			Category:   "Accounting Policies",
			Name:       "Accounting Policies",
			Type:       "Policies",
			Keys:       []string{"(POLICIES)"},
			NotKeys:    []string{""},
			IsRequired: false,
		},
	},
}
var MenuCategories = []MenuCategory{
	{
		Name: "Cover",
		Keys: []string{"cover"},
	},
	{
		Name: "Financial statements",
		Keys: []string{"financial", "statement"},
	},
	{
		Name: "Notes to Financial statements",
		Keys: []string{"note", "financial", "statement"},
	},
	{
		Name: "Notes Tables",
		Keys: []string{"table"},
	},
	{
		Name: "Notes Details",
		Keys: []string{"detail"},
	},
	{
		Name: "Accounting Policies",
		Keys: []string{"polic"},
	},
}
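
The page does not say how Keys and NotKeys are applied. One plausible reading, sketched below, is that a title matches when it contains every key and none of the excluded keys, ignoring case. The matching rule, the helper names and the category order are assumptions of this sketch, not the package's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// menuCategory mirrors the Name, Keys and NotKeys fields of MenuCategory.
type menuCategory struct {
	Name    string
	Keys    []string
	NotKeys []string
}

// matches reports whether title contains every key and none of the
// excluded keys, ignoring case. This rule is an assumption of the sketch.
func matches(title string, keys, notKeys []string) bool {
	t := strings.ToLower(title)
	for _, k := range keys {
		if !strings.Contains(t, strings.ToLower(k)) {
			return false
		}
	}
	for _, k := range notKeys {
		// Skip empty exclusions: every string contains "".
		if k != "" && strings.Contains(t, strings.ToLower(k)) {
			return false
		}
	}
	return true
}

// categorize returns the name of the first category whose keys match.
func categorize(title string, cats []menuCategory) (string, bool) {
	for _, c := range cats {
		if matches(title, c.Keys, c.NotKeys) {
			return c.Name, true
		}
	}
	return "", false
}

// sampleCategories lists a few categories, most specific first, because
// categorize stops at the first match.
func sampleCategories() []menuCategory {
	return []menuCategory{
		{Name: "Notes to Financial statements", Keys: []string{"note", "financial", "statement"}},
		{Name: "Financial statements", Keys: []string{"financial", "statement"}},
		{Name: "Accounting Policies", Keys: []string{"polic"}},
	}
}

func main() {
	for _, title := range []string{"Notes to Financial Statements", "Financial Statements"} {
		name, ok := categorize(title, sampleCategories())
		fmt.Println(title, "->", name, ok)
	}
}
```

Under this first-match rule the order of the categories matters: a title such as "Notes to Financial Statements" also contains the keys of "Financial statements", so the more specific category has to be tried first.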
var RequiredDocs = []Document{
	{
		Category:   "",
		Name:       "Operations",
		Type:       "Operations",
		Keys:       []string{},
		NotKeys:    []string{},
		IsRequired: true,
	},
	{
		Name:       "Income",
		Type:       "Income",
		IsRequired: true,
	},
	{
		Name:       "Assets",
		Type:       "Assets",
		IsRequired: true,
	},
	{
		Name:       "Cash Flow",
		Type:       "Cash Flow",
		IsRequired: true,
	},
	{
		Name:       "Entity Info",
		Type:       "Entity Info",
		IsRequired: true,
	},
}
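
The required document Types listed here (Operations, Income, Assets, Cash Flow, Entity Info) line up with the Type values used in CategoryDocs. A check for missing required documents could look like the sketch below. Matching on the Type field is an assumption, and document, missingTypes and sampleRequired are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// document mirrors the Name, Type and IsRequired fields of Document.
type document struct {
	Name       string
	Type       string
	IsRequired bool
}

// missingTypes returns the Type of every required document that has no
// document of the same Type among the found ones, in sorted order.
func missingTypes(required []document, found map[string][]document) []string {
	have := make(map[string]bool)
	for _, docs := range found {
		for _, d := range docs {
			have[d.Type] = true
		}
	}
	var missing []string
	for _, r := range required {
		if r.IsRequired && !have[r.Type] {
			missing = append(missing, r.Type)
		}
	}
	sort.Strings(missing)
	return missing
}

// sampleRequired copies the names and types from RequiredDocs.
func sampleRequired() []document {
	return []document{
		{Name: "Operations", Type: "Operations", IsRequired: true},
		{Name: "Income", Type: "Income", IsRequired: true},
		{Name: "Assets", Type: "Assets", IsRequired: true},
		{Name: "Cash Flow", Type: "Cash Flow", IsRequired: true},
		{Name: "Entity Info", Type: "Entity Info", IsRequired: true},
	}
}

func main() {
	found := map[string][]document{
		"Financial statements": {
			{Name: "Balance sheet", Type: "Assets"},
			{Name: "Income statement", Type: "Income"},
		},
	}
	fmt.Println(missingTypes(sampleRequired(), found))
}
```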
var RestrictedTags = map[string]bool{}
var XBRLTags = map[string]string{}

A map of XBRL tags to financial data types. This map contains the corresponding GAAP tag and a version of the tag without the GAAP keyword, in case the company has only filed non-GAAP tags.
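
A map that holds both forms of a tag could be populated as in the sketch below. The "us-gaap_" prefix, the tag names and the data type strings are illustrative assumptions; the page does not show the entries the package actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// gaapPrefix is an assumed spelling of the GAAP keyword in a tag.
const gaapPrefix = "us-gaap_"

// addTag registers tag under its full name and, when it carries the GAAP
// prefix, also under the name with the prefix removed.
func addTag(tags map[string]string, tag, dataType string) {
	tags[tag] = dataType
	if bare := strings.TrimPrefix(tag, gaapPrefix); bare != tag {
		tags[bare] = dataType
	}
}

// buildTags returns a small map with hypothetical entries.
func buildTags() map[string]string {
	tags := make(map[string]string)
	addTag(tags, "us-gaap_Revenues", "Revenue")
	addTag(tags, "us-gaap_NetIncomeLoss", "Net Income")
	return tags
}

func main() {
	tags := buildTags()
	// Both the prefixed and the bare tag resolve to the same data type.
	fmt.Println(tags["us-gaap_Revenues"], tags["Revenues"])
}
```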

Functions

func CikPageParser

func CikPageParser(page io.Reader) (string, error)

func FilingPageParser

func FilingPageParser(page io.Reader, _ FilingType) map[string][]Document

The filing page parser:
- The top of the page has a list of reports.
- Get all the reports (links to all the reports) and put them in an array.
- The accordion on the side of the page identifies what each report is.
- Get the text of the accordion and map the type of the report to the report.
- Create a map of the report to the report link.

func Init

func Init(param InitParams)

func ParseAllReports

func ParseAllReports(cik string, an string) []int

ParseAllReports gets all the reports filed under a given account

func ParseCikAndDocID

func ParseCikAndDocID(url string) (string, string)

func ParseFilingScale

func ParseFilingScale(z *html.Tokenizer, docType string) (map[unitEntity]unit, []string, []string)

func ParseHyperLinkTag

func ParseHyperLinkTag(z *html.Tokenizer, token html.Token) (string, string)

func ParseKeyName

func ParseKeyName(data string) string

func ParseTableData

func ParseTableData(z *html.Tokenizer, parseHref bool) (string, string)

func ParseTableHeading

func ParseTableHeading(z *html.Tokenizer) ([]string, error)

func ParseTableRow

func ParseTableRow(useCase string, z *html.Tokenizer, parseHref bool) ([]string, []string, error)

func ParseTableTitle

func ParseTableTitle(z *html.Tokenizer) []string

func QueryPageParser

func QueryPageParser(page io.Reader, docType FilingType) map[string]string

QueryPageParser parses the query page that lists the filings of a given type, for example: https://www.sec.gov/cgi-bin/browse-edgar?CIK=AAPL&owner=exclude&action=getcompany&type=10-Q&count=1&dateb=

Assumptions of the parser:
- Interactive data is available and there is a button that the user can click to reach it.
- Since it is a link, the tag will be a hyperlink with a button with the id=interactiveDataBtn.
- The actual link is the href attribute in the "a" token just before the id attribute.
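
The example query URL can be assembled from its parts with the standard net/url package, as sketched below. The queryURL helper is hypothetical; how the package itself builds this address is not shown on this page.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryURL builds the browse-edgar query address for a ticker and filing
// type, using the same parameters as the example URL in the text.
func queryURL(ticker, filingType string) string {
	q := url.Values{}
	q.Set("CIK", ticker)
	q.Set("owner", "exclude")
	q.Set("action", "getcompany")
	q.Set("type", filingType)
	q.Set("count", "1")
	q.Set("dateb", "")
	u := url.URL{
		Scheme:   "https",
		Host:     "www.sec.gov",
		Path:     "/cgi-bin/browse-edgar",
		RawQuery: q.Encode(), // Encode sorts the parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(queryURL("AAPL", "10-Q"))
}
```

The parameters come out sorted by key, so their order differs from the example URL, but the query is equivalent.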

Types

type Company

type Company struct {
	sync.Mutex
	Company string `json:"Company"`

	FilingLinks map[FilingType]map[string]string  `json:"-"`
	Reports     map[FilingType]map[string]*Report `json:"Financial Reports"`
	// contains filtered or unexported fields
}

func (*Company) AddReport

func (c *Company) AddReport(file *Report)

func (*Company) AvailableFilings

func (c *Company) AvailableFilings(filingType FilingType) []time.Time

func (*Company) CIK

func (c *Company) CIK() string

func (*Company) Filing

func (c *Company) Filing(fileType FilingType, ts time.Time) (Filing, error)

func (*Company) Filings

func (c *Company) Filings(fileType FilingType, ts ...time.Time) ([]Filing, error)

Get multiple filings in parallel
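
A parallel fetch that keeps results in request order can be sketched with sync.WaitGroup. This only illustrates the pattern; fetchAll and fakeFetch are hypothetical, and dates are plain strings here to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNoDate = errors.New("empty filing date")

// fetchAll runs fetch for every date concurrently. Results keep the order
// of the input, and the first error by position is returned.
func fetchAll(dates []string, fetch func(string) (string, error)) ([]string, error) {
	results := make([]string, len(dates))
	errs := make([]error, len(dates))
	var wg sync.WaitGroup
	for i, d := range dates {
		wg.Add(1)
		go func(i int, d string) {
			defer wg.Done()
			// Each goroutine writes only to its own index, so no lock is needed.
			results[i], errs[i] = fetch(d)
		}(i, d)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

// fakeFetch stands in for retrieving one filing from EDGAR.
func fakeFetch(date string) (string, error) {
	if date == "" {
		return "", errNoDate
	}
	return "10-K@" + date, nil
}

func main() {
	out, err := fetchAll([]string{"2022-10-28", "2023-11-03"}, fakeFetch)
	fmt.Println(out, err)
}
```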

func (*Company) SaveFolder

func (c *Company) SaveFolder(w io.Writer) error

Save the Company folder into the writer in JSON format
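
The save-and-restore round trip can be sketched with encoding/json. The folder type below is a hypothetical stand-in whose struct tags follow the style of Company, including a field tagged `json:"-"` that is left out of the saved data.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// folder is a stand-in for a company folder; its tags mirror Company.
type folder struct {
	Company string                       `json:"Company"`
	Links   map[string]string            `json:"-"` // skipped when saving
	Reports map[string]map[string]string `json:"Financial Reports"`
}

// save writes the folder to w as JSON.
func save(f *folder, w io.Writer) error {
	return json.NewEncoder(w).Encode(f)
}

// load rebuilds a folder from JSON previously written by save.
func load(r io.Reader) (*folder, error) {
	var f folder
	if err := json.NewDecoder(r).Decode(&f); err != nil {
		return nil, err
	}
	return &f, nil
}

// roundTrip saves f into a buffer and loads it back.
func roundTrip(f *folder) (*folder, error) {
	var buf bytes.Buffer
	if err := save(f, &buf); err != nil {
		return nil, err
	}
	return load(&buf)
}

func main() {
	in := &folder{
		Company: "AAPL",
		Links:   map[string]string{"2023-11-03": "https://example.invalid/filing"},
		Reports: map[string]map[string]string{"10-K": {"2023-11-03": "parsed"}},
	}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(out.Company, out.Reports["10-K"]["2023-11-03"], out.Links == nil)
}
```

Any io.Writer works as the destination, for example an *os.File, and the matching io.Reader can later be handed to CreateFolder.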

func (*Company) String

func (c *Company) String() string

func (*Company) Ticker

func (c *Company) Ticker() string

type CompanyFolder

type CompanyFolder interface {

	// Ticker gets the ticker of this company
	Ticker() string

	// CIK gets the CIK assigned to the company
	CIK() string

	// AvailableFilings gets the list of dates of available filings
	AvailableFilings(FilingType) []time.Time

	// Filing gets a filing given a filing type and date of filing.
	Filing(FilingType, time.Time) (Filing, error)

	// Filings gets a list of filings. Parallel fetch.
	Filings(FilingType, ...time.Time) ([]Filing, error)

	// SaveFolder persists the data from the company folder into a writer
	// provided by the user. This stored info can be presented back to
	// the fetcher (using CreateFolder API in fetcher) to recreate the
	// company folder with already parsed data
	SaveFolder(w io.Writer) error

	// String is a dump routine to view the contents of the folder
	String() string
}

CompanyFolder interface used to get filing information about a company

type DocValues

type DocValues struct {
	Periods  []string
	Section  []Section
	EndDates []string
	Scales   map[unitEntity]unit
}

type Document

type Document struct {
	Category   string
	Name       string
	Type       string
	Keys       []string
	NotKeys    []string
	URL        string
	Condition  string
	IsRequired bool
}

type Filing

type Filing interface {
	Ticker() string
	FiledOn() time.Time
	Type() (FilingType, error)
	ShareCount() (float64, error)
	Revenue() (float64, error)
	CostOfRevenue() (float64, error)
	GrossMargin() (float64, error)
	OperatingIncome() (float64, error)
	OperatingExpense() (float64, error)
	NetIncome() (float64, error)
	TotalEquity() (float64, error)
	ShortTermDebt() (float64, error)
	LongTermDebt() (float64, error)
	CurrentLiabilities() (float64, error)
	CurrentAssets() (float64, error)
	DeferredRevenue() (float64, error)
	RetainedEarnings() (float64, error)
	OperatingCashFlow() (float64, error)
	CapitalExpenditure() (float64, error)
	Dividend() (float64, error)
	WAShares() (float64, error)
	DividendPerShare() (float64, error)
	Interest() (float64, error)
	Cash() (float64, error)
	Securities() (float64, error)
	Goodwill() (float64, error)
	Intangibles() (float64, error)
	Assets() (float64, error)
	Liabilities() (float64, error)
	CollectedData() []string
}

Filing interface for fetching financial data from a collected filing
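
Because every accessor returns a value and an error, derived figures need to check each call. The sketch below computes a net margin through a narrow interface that lists only the two methods it needs. A value implementing the full Filing interface would satisfy it as well, since Go interfaces are satisfied structurally. The incomeSource, netMargin and fakeFiling names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// incomeSource is the subset of the Filing interface this sketch needs.
type incomeSource interface {
	Revenue() (float64, error)
	NetIncome() (float64, error)
}

var errNoRevenue = errors.New("revenue is zero")

// netMargin returns net income divided by revenue.
func netMargin(f incomeSource) (float64, error) {
	rev, err := f.Revenue()
	if err != nil {
		return 0, fmt.Errorf("revenue: %w", err)
	}
	if rev == 0 {
		return 0, errNoRevenue
	}
	ni, err := f.NetIncome()
	if err != nil {
		return 0, fmt.Errorf("net income: %w", err)
	}
	return ni / rev, nil
}

// fakeFiling is an in-memory stand-in for a retrieved filing.
type fakeFiling struct{ revenue, netIncome float64 }

func (f fakeFiling) Revenue() (float64, error)   { return f.revenue, nil }
func (f fakeFiling) NetIncome() (float64, error) { return f.netIncome, nil }

func main() {
	m, err := netMargin(fakeFiling{revenue: 200, netIncome: 50})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("net margin: %.2f\n", m) // prints "net margin: 0.25"
}
```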

type FilingFetcher

type FilingFetcher interface {

	// CompanyFolder creates a folder for the company with a list of
	// available filings. No financial data is pulled and the user
	// of the interface can selectively pull financial data into the
	// folder using the CompanyFolder interface
	CompanyFolder(string, ...FilingType) (CompanyFolder, error)

	// CreateFolder creates a company folder using a Reader
	// The user can provide a store of edgar data previously saved
	// by this package (using the SaveFolder function of the CompanyFolder)
	// This function is used to avoid reparsing edgar data and reusing
	// already parsed and stored information.
	CreateFolder(io.Reader, ...FilingType) (CompanyFolder, error)
}

FilingFetcher fetches the filing requested

func NewFilingFetcher

func NewFilingFetcher() FilingFetcher

NewFilingFetcher creates a new empty filing fetcher

type FilingType

type FilingType string

FilingType is the type definition of various filing types

const FilingType10K FilingType = "10-K"

FilingType10K is a 10-K annual filing of a company with the SEC

const FilingType10Q FilingType = "10-Q"

FilingType10Q is a 10-Q quarterly filing of a company with the SEC

type FinancialReport

type FinancialReport struct {
	Title     string                 `json:"Title"`
	DocValues map[string][]DocValues `json:"DocValue"`
	DocType   FilingType             `json:"Filing Type"`
	Entity    *entityData            `json:"Entity Information"`
	Ops       *opsData               `json:"Operational Information"`
	Bs        *bsData                `json:"Balance Sheet Information"`
	Cf        *cfData                `json:"Cash Flow Information"`
}

func FinReportParser

func FinReportParser(page io.Reader, fr *FinancialReport, docType string) (*FinancialReport, error)

func ParseMappedReports

func ParseMappedReports(typeDocs map[string][]Document, filingType FilingType) (*FinancialReport, error)

func (FinancialReport) String

func (f FinancialReport) String() string

type InitParams

type InitParams struct {
	MenuCategories []MenuCategory
	CategoryDocs   map[string][]Document
	RequiredDocs   []Document
	RestrictedTags map[string]bool
	XBRLTags       map[string]string
}

type MenuCategory

type MenuCategory struct {
	Name      string
	Keys      []string
	NotKeys   []string
	Condition string
}

type Report

type Report struct {
	Company string           `json:"Company"`
	Date    Timestamp        `json:"Report date"`
	FinData *FinancialReport `json:"Financial Data"`

	// DocumentType/Document
	Documents map[string][]Document `json:"Documents"`
}

func (*Report) Assets

func (f *Report) Assets() (float64, error)

func (*Report) CapitalExpenditure

func (f *Report) CapitalExpenditure() (float64, error)

func (*Report) Cash

func (f *Report) Cash() (float64, error)

func (*Report) CollectedData

func (f *Report) CollectedData() []string

func (*Report) CostOfRevenue

func (f *Report) CostOfRevenue() (float64, error)

func (*Report) CurrentAssets

func (f *Report) CurrentAssets() (float64, error)

func (*Report) CurrentLiabilities

func (f *Report) CurrentLiabilities() (float64, error)

func (*Report) DeferredRevenue

func (f *Report) DeferredRevenue() (float64, error)

func (*Report) Dividend

func (f *Report) Dividend() (float64, error)

func (*Report) DividendPerShare

func (f *Report) DividendPerShare() (float64, error)

func (*Report) FiledOn

func (f *Report) FiledOn() time.Time

func (*Report) Goodwill

func (f *Report) Goodwill() (float64, error)

func (*Report) GrossMargin

func (f *Report) GrossMargin() (float64, error)

func (*Report) Intangibles

func (f *Report) Intangibles() (float64, error)

func (*Report) Interest

func (f *Report) Interest() (float64, error)

func (*Report) Liabilities

func (f *Report) Liabilities() (float64, error)

func (*Report) LongTermDebt

func (f *Report) LongTermDebt() (float64, error)

func (*Report) NetIncome

func (f *Report) NetIncome() (float64, error)

func (*Report) OperatingCashFlow

func (f *Report) OperatingCashFlow() (float64, error)

func (*Report) OperatingExpense

func (f *Report) OperatingExpense() (float64, error)

func (*Report) OperatingIncome

func (f *Report) OperatingIncome() (float64, error)

func (*Report) RetainedEarnings

func (f *Report) RetainedEarnings() (float64, error)

func (*Report) Revenue

func (f *Report) Revenue() (float64, error)

func (*Report) Securities

func (f *Report) Securities() (float64, error)

func (*Report) ShareCount

func (f *Report) ShareCount() (float64, error)

func (*Report) ShortTermDebt

func (f *Report) ShortTermDebt() (float64, error)

func (Report) String

func (f Report) String() string

func (*Report) Ticker

func (f *Report) Ticker() string

func (*Report) TotalEquity

func (f *Report) TotalEquity() (float64, error)

func (*Report) Type

func (f *Report) Type() (FilingType, error)

func (*Report) WAShares

func (f *Report) WAShares() (float64, error)

type Row

type Row struct {
	Index  int
	Name   string
	Key    string
	Type   string
	Values []string
}

type Section

type Section struct {
	Name  string
	Key   string
	Index int
	Unit  int
	Rows  []Row
}

type Timestamp

type Timestamp time.Time

Timestamp is the edgar package representation of time.Time

func (Timestamp) MarshalJSON

func (t Timestamp) MarshalJSON() ([]byte, error)

MarshalJSON marshals Timestamp in a specific format for JSON marshalling

func (Timestamp) String

func (t Timestamp) String() string

func (*Timestamp) UnmarshalJSON

func (t *Timestamp) UnmarshalJSON(b []byte) error

UnmarshalJSON unmarshals Timestamp in a specific format for JSON unmarshal
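
A defined type over time.Time does not inherit time.Time's methods, which is why it needs its own JSON methods. The sketch below shows the general pattern with an assumed date-only layout. The stamp type, the layout and the roundTrip helper are hypothetical, and the format the package actually uses is not stated on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stamp mirrors the Timestamp declaration: a defined type over time.Time.
type stamp time.Time

// layout is an assumed date-only format for this sketch.
const layout = "2006-01-02"

// MarshalJSON writes the time as a JSON string in the chosen layout.
func (s stamp) MarshalJSON() ([]byte, error) {
	return json.Marshal(time.Time(s).Format(layout))
}

// UnmarshalJSON reads a JSON string in the chosen layout.
func (s *stamp) UnmarshalJSON(b []byte) error {
	var text string
	if err := json.Unmarshal(b, &text); err != nil {
		return err
	}
	t, err := time.Parse(layout, text)
	if err != nil {
		return err
	}
	*s = stamp(t)
	return nil
}

// roundTrip decodes raw JSON into a stamp and encodes it again.
func roundTrip(raw string) (string, error) {
	var s stamp
	if err := json.Unmarshal([]byte(raw), &s); err != nil {
		return "", err
	}
	out, err := json.Marshal(s)
	return string(out), err
}

func main() {
	out, err := roundTrip(`"2024-11-01"`)
	fmt.Println(out, err) // prints "2024-11-01" <nil>, with the quotes
}
```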
