processor

Published: Nov 12, 2019 License: MIT, Unlicense Imports: 25 Imported by: 0

Documentation


Constants

const (
	TString int = iota + 1
	TSlcomment
	TMlcomment
	TComplexity
)

Used by the trie structure to store the token types

const (
	SBlank             int64 = 1
	SCode              int64 = 2
	SComment           int64 = 3
	SCommentCode       int64 = 4 // Indicates comment after code
	SMulticomment      int64 = 5
	SMulticommentCode  int64 = 6 // Indicates multi comment after code
	SMulticommentBlank int64 = 7 // Indicates multi comment ended with blank afterwards
	SString            int64 = 8
	SDocString         int64 = 9
)

These constants are used as state identifiers for the code state machine

const SheBang string = "#!"

Variables

var AllowListExtensions = []string{}

AllowListExtensions is a list of extensions which are allowed to be processed

var AverageWage int64 = 56286

AverageWage is the average wage in dollars used for the COCOMO cost estimate

var ByteOrderMarks = [][]byte{
	{254, 255},
	{255, 254},
	{0, 0, 254, 255},
	{255, 254, 0, 0},
	{43, 47, 118, 56},
	{43, 47, 118, 57},
	{43, 47, 118, 43},
	{43, 47, 118, 47},
	{43, 47, 118, 56, 45},
	{247, 100, 76},
	{221, 115, 102, 115},
	{14, 254, 255},
	{251, 238, 40},
	{132, 49, 149, 51},
}

Taken from https://en.wikipedia.org/wiki/Byte_order_mark#Byte_order_marks_by_encoding

These marks indicate that we cannot count the file correctly, so we can at least warn the user.
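
As a rough illustration of how such a list can be used, the sketch below checks whether file content starts with one of the marks. The helper name `hasUnsupportedBOM` and the shortened `bomPrefixes` list are assumptions made for this example; they are not part of the package.

```go
package main

import (
	"bytes"
	"fmt"
)

// bomPrefixes is a small illustrative subset of the byte order marks listed above.
var bomPrefixes = [][]byte{
	{254, 255},       // UTF-16 (big endian)
	{255, 254},       // UTF-16 (little endian)
	{0, 0, 254, 255}, // UTF-32 (big endian)
}

// hasUnsupportedBOM is a hypothetical helper: it reports whether content
// starts with any of the given byte order marks.
func hasUnsupportedBOM(content []byte, marks [][]byte) bool {
	for _, mark := range marks {
		if bytes.HasPrefix(content, mark) {
			return true
		}
	}
	return false
}

func main() {
	utf16 := []byte{255, 254, 'h', 0, 'i', 0}
	plain := []byte("package main\n")
	fmt.Println(hasUnsupportedBOM(utf16, bomPrefixes))
	fmt.Println(hasUnsupportedBOM(plain, bomPrefixes))
}
```

A caller could log a warning for files where the check succeeds instead of silently reporting counts that may be wrong.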

var Ci = false

Ci indicates whether the tool is running inside a CI environment, in which case box drawing characters are disabled

var Cocomo = false

Cocomo toggles the COCOMO calculation

var Complexity = false

Complexity toggles complexity calculation

var Debug = false

Debug enables debug logging output

var DirFilePaths = []string{}

DirFilePaths is not set via flags but by the arguments following the flags, which name the files or directories to process

var DisableCheckBinary = false

DisableCheckBinary toggles checking for binary files using NUL bytes

var Duplicates = false

Duplicates enables duplicate file detection

var Exclude = []string{}

Exclude is a regular expression which is used to exclude files from being processed

var ExtensionToLanguage = map[string][]string{}

ExtensionToLanguage is loaded from the JSON that is in constants.go

var FileListQueueSize = runtime.NumCPU()

FileListQueueSize is the size of the queue of files found and ready to be read into memory

var FileOutput = ""

FileOutput sets the file that output should be written to

var FileProcessJobWorkers = runtime.NumCPU() * 4

FileProcessJobWorkers is the number of workers that process the file collecting stats

var FileReadContentJobQueueSize = runtime.NumCPU()

FileReadContentJobQueueSize is the size of the queue of files ready to be processed

var FileReadJobWorkers = runtime.NumCPU() * 4

FileReadJobWorkers is the number of workers that read files off disk into memory

var FileSummaryJobQueueSize = runtime.NumCPU()

FileSummaryJobQueueSize is the size of the queue used to hold processed file statistics before formatting

var FilenameToLanguage = map[string]string{}

FilenameToLanguage is similar to ExtensionToLanguage and is loaded from the JSON in constants.go

var Files = false

Files indicates if there should be file output or not when formatting

var Format = ""

Format sets the output format of the formatter

var GcFileCount = 10000

GcFileCount is the number of files to process before turning the GC back on

var GitIgnore = false

GitIgnore disables .gitignore checks

var Ignore = false

Ignore disables ignore file checks

var IgnoreMinifiedGenerate = false

IgnoreMinifiedGenerate skips printing counts for minified/generated files

var LanguageFeatures = map[string]LanguageFeature{}

LanguageFeatures contains the processed languages from processLanguageFeature

var LanguageFeaturesMutex = sync.Mutex{}

LanguageFeaturesMutex is the shared mutex used to control getting and setting of language features. It is used rather than sync.Map because it turned out to be marginally faster.

var Languages = false

Languages indicates if the command line should print out the supported languages

var LargeByteCount int64 = 1000000

LargeByteCount is the threshold in bytes for a file to be counted as a large file, based on https://github.com/pinpt/ripsrc/blob/master/ripsrc/fileinfo/fileinfo.go#L44

var LargeLineCount int64 = 40000

LargeLineCount is the threshold in lines for a file to be counted as a large file, based on https://github.com/pinpt/ripsrc/blob/master/ripsrc/fileinfo/fileinfo.go#L44

var MinifiedGenerated = false

MinifiedGenerated enables minified/generated file detection

var MinifiedGeneratedLineByteLength = 255

MinifiedGeneratedLineByteLength is the average number of bytes per line used to determine that a file is minified/generated
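
The average-line-length heuristic can be sketched as follows. The helper `looksMinified` is hypothetical, and whether the package compares with `>=` or `>` is an assumption of this example; only the default of 255 comes from the documentation above.

```go
package main

import "fmt"

// assumedLineByteLimit mirrors the documented default of 255.
const assumedLineByteLimit int64 = 255

// looksMinified is a hypothetical helper: it flags a file whose average
// line length in bytes reaches the limit. Empty files are never flagged.
func looksMinified(totalBytes, totalLines, limit int64) bool {
	if totalLines == 0 {
		return false
	}
	return totalBytes/totalLines >= limit
}

func main() {
	// 10,000 bytes on 2 lines averages 5,000 bytes per line.
	fmt.Println(looksMinified(10000, 2, assumedLineByteLimit))
	// 800 bytes on 20 lines averages 40 bytes per line.
	fmt.Println(looksMinified(800, 20, assumedLineByteLimit))
}
```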

var More = false

More enables wider output with more information in formatter

var NoLarge = false

NoLarge, if set to true, ignores files over a certain number of lines or bytes

var PathDenyList = []string{}

PathDenyList sets the paths that should be skipped

var ShebangLookup = map[string][]string{}

ShebangLookup contains the shebang lookups and is loaded from the JSON in constants.go

var SortBy = ""

SortBy sets which column output in formatter should be sorted by

var Trace = false

Trace enables trace logging output which is extremely verbose

var Verbose = false

Verbose enables verbose logging output

var Version = "2.10.1"

Version is the version of the application

Functions

func ConfigureGc added in v1.10.0

func ConfigureGc()

ConfigureGc needs to be set outside of ProcessConstants because it should only be enabled in command line mode; see https://github.com/boyter/scc/issues/32

func ConfigureLazy

func ConfigureLazy(lazy bool)

ConfigureLazy is a simple setter used to turn on lazy loading. It is used only by the command line.

func CountStats added in v1.4.0

func CountStats(fileJob *FileJob)

CountStats will process the fileJob. If the file contains anything, even just a newline, its line count should be >= 1. If the file has a size of 0, its line count should be 0. Newlines belong to the line they started on, so a file of \n means only 1 line. This is the 'hot' path for the application and needs to be as fast as possible.
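
The line-counting rules stated above can be checked with a small stand-alone sketch. The function `countLines` is a hypothetical illustration of those rules only; it is not the package's implementation and does no code/comment/blank classification.

```go
package main

import (
	"bytes"
	"fmt"
)

// countLines is a hypothetical helper that applies the documented rules:
// an empty file has 0 lines, and a newline belongs to the line it ends.
func countLines(content []byte) int64 {
	if len(content) == 0 {
		return 0
	}
	lines := int64(bytes.Count(content, []byte{'\n'}))
	// A final line without a trailing newline still counts as a line.
	if content[len(content)-1] != '\n' {
		lines++
	}
	return lines
}

func main() {
	for _, s := range []string{"", "\n", "a", "a\nb", "a\nb\n"} {
		fmt.Printf("%q has %d line(s)\n", s, countLines([]byte(s)))
	}
}
```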

func DetectLanguage

func DetectLanguage(name string) ([]string, string)

DetectLanguage detects a language based on the filename. It returns the possible languages and the extension.

func DetectSheBang

func DetectSheBang(content string) (string, error)

DetectSheBang, given some content, attempts to determine if it has a #! that maps to a known language and returns the language
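
To make the idea concrete, here is a minimal sketch of shebang-based detection. The function `detectSheBang`, the table `interpreterToLanguage` and the handling of `env` are assumptions for illustration; the package's own lookup data comes from its JSON and may behave differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// interpreterToLanguage is a hypothetical lookup table for this example.
var interpreterToLanguage = map[string]string{
	"python":  "Python",
	"python3": "Python",
	"bash":    "Shell",
	"sh":      "Shell",
}

// detectSheBang is a hypothetical helper: it inspects the first line of
// content and maps the interpreter named after "#!" to a language.
func detectSheBang(content string) (string, error) {
	if !strings.HasPrefix(content, "#!") {
		return "", errors.New("no shebang found")
	}
	// Only the first line is relevant.
	line := content
	if i := strings.IndexByte(content, '\n'); i != -1 {
		line = content[:i]
	}
	fields := strings.Fields(strings.TrimPrefix(line, "#!"))
	if len(fields) == 0 {
		return "", errors.New("empty shebang")
	}
	// "#!/usr/bin/env python3" names the interpreter in the second field.
	cmd := fields[0]
	if strings.HasSuffix(cmd, "/env") && len(fields) > 1 {
		cmd = fields[1]
	}
	// Strip any directory part, so "/bin/sh" becomes "sh".
	if i := strings.LastIndexByte(cmd, '/'); i != -1 {
		cmd = cmd[i+1:]
	}
	if lang, ok := interpreterToLanguage[cmd]; ok {
		return lang, nil
	}
	return "", errors.New("unknown interpreter: " + cmd)
}

func main() {
	lang, err := detectSheBang("#!/usr/bin/env python3\nprint(1)\n")
	fmt.Println(lang, err)
}
```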

func DetermineLanguage

func DetermineLanguage(filename string, fallbackLanguage string, possibleLanguages []string, content []byte) string

DetermineLanguage, given a filename, fallback language, possible languages and content, makes a guess at the language. If multiple languages are possible it guesses based on keywords, similar to how https://github.com/vmchale/polyglot does

func EstimateCost

func EstimateCost(effortApplied float64, averageWage int64) float64

EstimateCost calculates the cost in dollars of the effort applied, using generic COCOMO2 weighted values and the average yearly wage

func EstimateEffort

func EstimateEffort(sloc int64) float64

EstimateEffort calculates the effort applied using generic COCOMO2 weighted values

func EstimateScheduleMonths

func EstimateScheduleMonths(effortApplied float64) float64

EstimateScheduleMonths estimates the schedule in months based on the result from EstimateEffort
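
The three estimates chain together: lines of code give effort, effort gives schedule, and effort with a wage gives cost. The sketch below shows that chain using the textbook basic COCOMO organic-mode coefficients (2.4, 1.05, 2.5, 0.38) and a monthly wage of one twelfth of the yearly wage. These coefficients and the lower-case function names are assumptions for this example; the values the package actually uses may differ.

```go
package main

import (
	"fmt"
	"math"
)

// Textbook basic COCOMO (organic mode) coefficients, assumed for this sketch.
const (
	effortA   = 2.4
	effortB   = 1.05
	scheduleC = 2.5
	scheduleD = 0.38
)

// estimateEffort returns person-months for the given source lines of code.
func estimateEffort(sloc int64) float64 {
	return effortA * math.Pow(float64(sloc)/1000, effortB)
}

// estimateScheduleMonths returns calendar months for the given effort.
func estimateScheduleMonths(effort float64) float64 {
	return scheduleC * math.Pow(effort, scheduleD)
}

// estimateCost prices the effort at one twelfth of the yearly wage per
// person-month, with no overhead multiplier (an assumption).
func estimateCost(effort float64, averageWage int64) float64 {
	return effort * float64(averageWage) / 12
}

func main() {
	effort := estimateEffort(25000)
	fmt.Printf("effort:   %.2f person-months\n", effort)
	fmt.Printf("schedule: %.2f months\n", estimateScheduleMonths(effort))
	fmt.Printf("cost:     $%.0f\n", estimateCost(effort, 56286))
}
```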

func LoadLanguageFeature

func LoadLanguageFeature(loadName string)

LoadLanguageFeature will load a single feature as requested given the name

func Process

func Process()

Process is the main entry point of the command line. It sets everything up and starts running.

func ProcessConstants added in v1.4.0

func ProcessConstants()

ProcessConstants is responsible for setting up the language features based on the JSON file that is stored in constants. It needs to be called at least once in order for anything to actually happen.

Types

type CheckDuplicates

type CheckDuplicates struct {
	// contains filtered or unexported fields
}

CheckDuplicates is used to hold hashes if duplicate detection is enabled. It comes with a mutex that should be locked while a check is being performed and the hash is then added.

func (*CheckDuplicates) Add

func (c *CheckDuplicates) Add(key int64, hash []byte)

Add is a non-thread-safe add of a key into the duplicates check. The mutex inside the struct must be locked before calling this.

func (*CheckDuplicates) Check

func (c *CheckDuplicates) Check(key int64, hash []byte) bool

Check is a non-thread-safe check to see if the key already exists. The mutex inside the struct must be locked before calling this.
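
The check-then-add pattern under a single lock can be sketched with a stand-in type. `dupeChecker`, its fields and `seenBefore` are hypothetical names for this example, since the real struct's fields are unexported; using the file size as the key is also an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// dupeChecker is a stand-in for illustration, not the package's type.
type dupeChecker struct {
	mu     sync.Mutex
	hashes map[int64][][]byte
}

// add and check are not safe for concurrent use on their own.
func (c *dupeChecker) add(key int64, hash []byte) {
	c.hashes[key] = append(c.hashes[key], hash)
}

func (c *dupeChecker) check(key int64, hash []byte) bool {
	for _, h := range c.hashes[key] {
		if bytes.Equal(h, hash) {
			return true
		}
	}
	return false
}

// seenBefore holds the lock across both steps, so two goroutines cannot
// both decide that the same hash is new.
func (c *dupeChecker) seenBefore(key int64, hash []byte) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.check(key, hash) {
		return true
	}
	c.add(key, hash)
	return false
}

func main() {
	c := &dupeChecker{hashes: map[int64][][]byte{}}
	hash := []byte{0xde, 0xad, 0xbe, 0xef}
	fmt.Println(c.seenBefore(1024, hash))
	fmt.Println(c.seenBefore(1024, hash))
}
```

Locking around the pair rather than inside each method is the point of the design: a lock inside `check` and `add` separately would still leave a window between them.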

type DirectoryJob

type DirectoryJob struct {
	// contains filtered or unexported fields
}

type DirectoryWalker

type DirectoryWalker struct {
	// contains filtered or unexported fields
}

func NewDirectoryWalker

func NewDirectoryWalker(output chan<- *FileJob) *DirectoryWalker

func (*DirectoryWalker) Readdir

func (dw *DirectoryWalker) Readdir(path string) ([]os.FileInfo, error)

func (*DirectoryWalker) Run

func (dw *DirectoryWalker) Run()

func (*DirectoryWalker) Start

func (dw *DirectoryWalker) Start(root string) error

func (*DirectoryWalker) Walk

func (dw *DirectoryWalker) Walk(handle *cuba.Handle)

type FileJob

type FileJob struct {
	Language           string
	PossibleLanguages  []string // Used to hold potentially more than one language which populates language when determined
	Filename           string
	Extension          string
	Location           string
	Content            []byte
	Bytes              int64
	Lines              int64
	Code               int64
	Comment            int64
	Blank              int64
	Complexity         int64
	WeightedComplexity float64
	Hash               []byte
	Callback           FileJobCallback
	Binary             bool
	Minified           bool
}

FileJob is a struct used to hold all of the results of processing internally before they are sent to the formatter

type FileJobCallback added in v1.4.0

type FileJobCallback interface {
	// ProcessLine should return true to continue processing or false to stop further processing and return
	ProcessLine(job *FileJob, currentLine int64, lineType LineType) bool
}

FileJobCallback is an interface that FileJobs can implement to get a per line callback with the line type
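
A per-line callback of this shape could be implemented as in the sketch below. To keep the example self-contained it declares local stand-ins (`fileJob`, `lineType`, `fileJobCallback`) that mirror the documented signatures; `codeLineLimiter` and the driver `feed` are hypothetical and only simulate how a caller might invoke the callback.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented types, for illustration only.
type lineType int32

const (
	lineBlank lineType = iota
	lineCode
	lineComment
)

type fileJob struct{ Filename string }

type fileJobCallback interface {
	ProcessLine(job *fileJob, currentLine int64, lt lineType) bool
}

// codeLineLimiter counts code lines and asks for processing to stop once
// max code lines have been seen.
type codeLineLimiter struct {
	max       int64
	codeLines int64
}

func (c *codeLineLimiter) ProcessLine(job *fileJob, currentLine int64, lt lineType) bool {
	if lt == lineCode {
		c.codeLines++
	}
	return c.codeLines < c.max
}

// feed is a hypothetical driver: it reports each line to the callback and
// stops as soon as the callback returns false. It returns the lines visited.
func feed(cb fileJobCallback, job *fileJob, types []lineType) int64 {
	var visited int64
	for i, t := range types {
		visited++
		if !cb.ProcessLine(job, int64(i+1), t) {
			break
		}
	}
	return visited
}

func main() {
	limiter := &codeLineLimiter{max: 2}
	lines := []lineType{lineCode, lineBlank, lineComment, lineCode, lineCode}
	visited := feed(limiter, &fileJob{Filename: "main.go"}, lines)
	fmt.Println(visited, limiter.codeLines)
}
```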

type Language

type Language struct {
	LineComment      []string   `json:"line_comment"`
	ComplexityChecks []string   `json:"complexitychecks"`
	Extensions       []string   `json:"extensions"`
	ExtensionFile    bool       `json:"extensionFile"`
	MultiLine        [][]string `json:"multi_line"`
	Quotes           []Quote    `json:"quotes"`
	NestedMultiLine  bool       `json:"nestedmultiline"`
	Keywords         []string   `json:"keywords"`
	FileNames        []string   `json:"filenames"`
	SheBangs         []string   `json:"shebangs"`
}

Language is a struct which contains the values for each language stored in languages.json

type LanguageFeature

type LanguageFeature struct {
	Complexity            *Trie
	MultiLineComments     *Trie
	SingleLineComments    *Trie
	Strings               *Trie
	Tokens                *Trie
	Nested                bool
	ComplexityCheckMask   byte
	SingleLineCommentMask byte
	MultiLineCommentMask  byte
	StringCheckMask       byte
	ProcessMask           byte
	Keywords              []string
	Quotes                []Quote
}

LanguageFeature is a struct which represents the conversion from Language into what is used for matching

type LanguageReportEnd

type LanguageReportEnd struct {
	Sum summaryStruct `yaml:"SUM"`
}

type LanguageReportStart

type LanguageReportStart struct {
	Header headerStruct
}

type LanguageSummary

type LanguageSummary struct {
	Name               string
	Bytes              int64
	Lines              int64
	Code               int64
	Comment            int64
	Blank              int64
	Complexity         int64
	Count              int64
	WeightedComplexity float64
	Files              []*FileJob
}

LanguageSummary is used to hold summarised results for a single language

type LineType added in v1.4.0

type LineType int32

LineType is the type of line being processed

const (
	LINE_BLANK LineType = iota
	LINE_CODE
	LINE_COMMENT
)

These are not meant to be CAMEL_CASE, but as they are used by an external project we cannot change them

type OpenClose

type OpenClose struct {
	Open  []byte
	Close []byte
}

OpenClose is used to hold an open/close pair for matching such as multi line comments

type Quote

type Quote struct {
	Start        string `json:"start"`
	End          string `json:"end"`
	IgnoreEscape bool   `json:"ignoreEscape"` // To enable turning off the \ check for C# @"\" string examples https://github.com/boyter/scc/issues/71
	DocString    bool   `json:"docString"`    // To enable docstring check for Python where "If the triple quote string starts following a newline with only white-space characters in front and ends followed by only a newline or white-space characters it is a comment" https://github.com/boyter/scc/issues/62
}

Quote is a struct which holds rules and start/end values for string quotes

type Trie added in v1.10.0

type Trie struct {
	Type  int
	Close []byte
	Table [256]*Trie
}

Trie is a structure used to store matches efficiently

func (*Trie) Insert added in v1.10.0

func (root *Trie) Insert(tokenType int, token []byte)

Insert inserts a string into the trie for matching

func (*Trie) InsertClose added in v1.10.0

func (root *Trie) InsertClose(tokenType int, openToken, closeToken []byte)

InsertClose closes off a string in the trie

func (*Trie) Match added in v1.10.0

func (root *Trie) Match(token []byte) (int, int, []byte)

Match checks the created trie structure for a match
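
A byte-indexed trie with a 256-entry child table, as the `Table [256]*Trie` field suggests, can be sketched as follows. The type `node` and its methods are a simplified stand-in written for this example: `match` here returns only a type and a depth, whereas the documented `Match` returns three values. Using 0 to mean "no match" is an assumption that fits token type constants starting at 1.

```go
package main

import "fmt"

// node is a simplified stand-in for a byte trie, not the package's Trie.
type node struct {
	typ   int
	table [256]*node
}

// insert adds token to the trie and marks its final node with typ.
func (root *node) insert(typ int, token []byte) {
	n := root
	for _, b := range token {
		if n.table[b] == nil {
			n.table[b] = &node{}
		}
		n = n.table[b]
	}
	n.typ = typ
}

// match walks input from its first byte for as long as the trie has a
// child for the next byte. It returns the type stored at the node where
// the walk stopped (0 if none) and the number of bytes consumed.
func (root *node) match(input []byte) (int, int) {
	n := root
	depth := 0
	for _, b := range input {
		next := n.table[b]
		if next == nil {
			break
		}
		n = next
		depth++
	}
	return n.typ, depth
}

func main() {
	const (
		slComment = 1 // hypothetical type for single line comments
		mlComment = 2 // hypothetical type for multi line comments
	)
	root := &node{}
	root.insert(slComment, []byte("//"))
	root.insert(mlComment, []byte("/*"))

	fmt.Println(root.match([]byte("// note")))
	fmt.Println(root.match([]byte("/* note */")))
	fmt.Println(root.match([]byte("x := 1")))
}
```

Each lookup costs one array index per input byte, which suits a hot path, at the price of a 256-pointer table per node.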
