Documentation ¶
Index ¶
- func Analyse(contents []byte, hints []string) (language string)
- func IsBinary(contents []byte) bool
- func IsDocumentation(path string) bool
- func IsVendored(path string) bool
- func LanguageByContents(contents []byte, hints []string) string
- func LanguageByFilename(filename string) string
- func LanguageColor(language string) string
- func LanguageHints(filename string) (hints []string)
- func ShouldIgnoreContents(contents []byte) bool
- func ShouldIgnoreFilename(filename string) bool
- type Language
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Analyse ¶ added in v0.10.0
Analyse returns the name of a programming language, or the empty string if one could not be determined.
Uses Naive Bayesian Classification on the file contents provided.
It is recommended to use LanguageByContents() instead of this function directly.
Obtain hints from LanguageHints()
NOTE(tso): May yield inaccurate results
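The idea of scoring only the hinted languages with naive Bayes can be sketched as follows. This is a minimal illustration, not the package's implementation: the `classifier` type, its methods, the whitespace tokenizer, the uniform prior, and the training samples are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// classifier is a hypothetical, minimal naive Bayes model. For each language
// it stores how often each token was seen in training samples.
type classifier struct {
	tokenCounts map[string]map[string]int // language -> token -> count
	totals      map[string]int            // language -> total tokens seen
	vocab       map[string]struct{}       // all distinct tokens
}

func newClassifier() *classifier {
	return &classifier{
		tokenCounts: map[string]map[string]int{},
		totals:      map[string]int{},
		vocab:       map[string]struct{}{},
	}
}

// train records the whitespace-separated tokens of sample under language.
func (c *classifier) train(language, sample string) {
	if c.tokenCounts[language] == nil {
		c.tokenCounts[language] = map[string]int{}
	}
	for _, tok := range strings.Fields(sample) {
		c.tokenCounts[language][tok]++
		c.totals[language]++
		c.vocab[tok] = struct{}{}
	}
}

// classify scores only the hinted languages (uniform prior assumed) and
// returns the best one, or the empty string when no hinted language has
// been trained.
func (c *classifier) classify(contents string, hints []string) string {
	best, bestScore := "", math.Inf(-1)
	for _, lang := range hints {
		if c.totals[lang] == 0 {
			continue
		}
		score := 0.0
		for _, tok := range strings.Fields(contents) {
			// Laplace smoothing keeps unseen tokens from zeroing the product.
			p := float64(c.tokenCounts[lang][tok]+1) /
				float64(c.totals[lang]+len(c.vocab))
			score += math.Log(p)
		}
		if score > bestScore {
			best, bestScore = lang, score
		}
	}
	return best
}

func main() {
	c := newClassifier()
	c.train("Go", "package main func import fmt")
	c.train("Python", "def import print self")
	fmt.Println(c.classify("func main", []string{"Go", "Python"}))
}
```

With so little training data the result is fragile, which is consistent with the note above that the classifier may yield inaccurate results.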
func IsBinary ¶ added in v0.10.0
IsBinary checks contents for known character escape codes which frequently show up in binary files but rarely (if ever) in text.
Use this check before calling LanguageByContents to reduce the likelihood of passing in binary data, which can cause inaccurate results.
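A binary check of this kind might look roughly like the sketch below. It is an assumption about the general technique, not the package's actual logic: `looksBinary`, the 512-byte sniff window, and the exact set of tolerated control bytes are all hypothetical.

```go
package main

import "fmt"

// looksBinary is a hypothetical stand-in, not the package's implementation.
// It reports whether the first 512 bytes contain a NUL byte or another
// control byte that plain text rarely uses.
func looksBinary(contents []byte) bool {
	const sniffLen = 512
	if len(contents) > sniffLen {
		contents = contents[:sniffLen]
	}
	for _, b := range contents {
		switch {
		case b == '\t', b == '\n', b == '\r', b == '\f':
			// Common whitespace control characters are fine in text.
		case b < 0x20 || b == 0x7f:
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksBinary([]byte("package main\n")))
	fmt.Println(looksBinary([]byte{0x7f, 'E', 'L', 'F', 0x00}))
}
```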
func IsDocumentation ¶ added in v0.10.0
IsDocumentation checks if path contains a filename commonly belonging to documentation.
func IsVendored ¶ added in v0.10.0
IsVendored checks if path contains a filename commonly belonging to configuration files.
func LanguageByContents ¶ added in v0.10.0
LanguageByContents attempts to detect the language of a source file based on its contents and a slice of hints to the possible answer.
Obtain hints with LanguageHints().
Returns the empty string if a language could not be determined.
func LanguageByFilename ¶ added in v0.10.0
LanguageByFilename attempts to determine the language of a source file based solely on common naming conventions and file extensions from the languages.yml file provided by https://github.com/github/linguist
Returns the empty string in ambiguous or unrecognized cases.
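The "empty string in ambiguous or unrecognized cases" behaviour can be illustrated with a small lookup table. This is only a sketch: the `extensions` map is a tiny hypothetical excerpt in the spirit of languages.yml, and `byExtension` is an illustrative helper rather than the package's code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// extensions is a tiny hypothetical excerpt in the spirit of languages.yml.
var extensions = map[string][]string{
	".go": {"Go"},
	".py": {"Python"},
	".h":  {"C", "C++", "Objective-C"},
}

// byExtension returns a language only when the file extension maps to
// exactly one candidate. Ambiguous or unknown extensions yield "".
func byExtension(filename string) string {
	if langs := extensions[filepath.Ext(filename)]; len(langs) == 1 {
		return langs[0]
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", byExtension("main.go")) // single candidate
	fmt.Printf("%q\n", byExtension("util.h"))  // ambiguous
	fmt.Printf("%q\n", byExtension("README"))  // unrecognized
}
```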
func LanguageColor ¶ added in v0.10.0
LanguageColor is a convenience function that returns the color associated with the language, in HTML Hex notation (e.g. "#123ABC") from the languages.yml file provided by https://github.com/github/linguist
Returns the empty string if there is no associated color for the language.
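Callers that need numeric components rather than the hex string could convert it along these lines. `parseHexColor` is a hypothetical helper written for this example and is not part of the package; it treats the empty string (no associated color) as an error.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseHexColor converts a "#RRGGBB" string into its red, green, and blue
// components. It is a hypothetical helper, not part of the package.
func parseHexColor(s string) (r, g, b uint8, err error) {
	if len(s) != 7 || s[0] != '#' {
		return 0, 0, 0, fmt.Errorf("invalid color %q", s)
	}
	v, err := strconv.ParseUint(s[1:], 16, 24)
	if err != nil {
		return 0, 0, 0, err
	}
	return uint8(v >> 16), uint8(v >> 8), uint8(v), nil
}

func main() {
	r, g, b, err := parseHexColor("#123ABC")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r, g, b)
}
```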
func LanguageHints ¶ added in v0.10.0
LanguageHints attempts to detect all possible languages of a source file based solely on common naming conventions and file extensions from the languages.yml file provided by https://github.com/github/linguist
Intended to be used with LanguageByContents.
May return an empty slice.
func ShouldIgnoreContents ¶ added in v0.10.0
ShouldIgnoreContents checks if contents should not be passed to LanguageByContents.
(this simply calls IsBinary)
func ShouldIgnoreFilename ¶ added in v0.10.0
ShouldIgnoreFilename checks if filename should not be passed to LanguageByFilename.
(this simply calls IsVendored and IsDocumentation)
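Taken together, the functions above suggest a detection flow: skip ignored filenames, try the filename, then fall back to the contents with hints. The sketch below shows one plausible ordering. The `detector` struct, `detect`, and `stubDetector` are hypothetical; the struct fields only mirror the signatures listed in the index so the example compiles without the real package, and the stubs are deliberately simplistic.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// detector bundles functions with the same signatures as those in the index.
// In real code these would be the package's functions; here they are stubs.
type detector struct {
	ShouldIgnoreFilename func(filename string) bool
	ShouldIgnoreContents func(contents []byte) bool
	LanguageByFilename   func(filename string) string
	LanguageHints        func(filename string) []string
	LanguageByContents   func(contents []byte, hints []string) string
}

// detect tries the cheap filename lookup first and only classifies the
// contents when the filename alone is not conclusive.
func (d detector) detect(filename string, contents []byte) string {
	if d.ShouldIgnoreFilename(filename) {
		return ""
	}
	if lang := d.LanguageByFilename(filename); lang != "" {
		return lang
	}
	if d.ShouldIgnoreContents(contents) {
		return ""
	}
	return d.LanguageByContents(contents, d.LanguageHints(filename))
}

// stubDetector returns simplistic stand-ins used only for this example.
func stubDetector() detector {
	return detector{
		ShouldIgnoreFilename: func(f string) bool { return strings.HasPrefix(f, "vendor/") },
		ShouldIgnoreContents: func(c []byte) bool { return bytes.IndexByte(c, 0) >= 0 },
		LanguageByFilename: func(f string) string {
			if strings.HasSuffix(f, ".go") {
				return "Go"
			}
			return ""
		},
		LanguageHints: func(f string) []string {
			if strings.HasSuffix(f, ".h") {
				return []string{"C", "C++"}
			}
			return nil
		},
		LanguageByContents: func(c []byte, hints []string) string {
			if len(hints) > 0 {
				return hints[0] // a real classifier would score the contents
			}
			return ""
		},
	}
}

func main() {
	d := stubDetector()
	fmt.Printf("%q\n", d.detect("main.go", []byte("package main")))
	fmt.Printf("%q\n", d.detect("util.h", []byte("int x;")))
	fmt.Printf("%q\n", d.detect("vendor/lib.go", []byte("package lib")))
}
```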
Types ¶
type Language ¶
type Language struct {
	Language string `json:"language"`
	Percent float64 `json:"percent"`
	// Color represents the color associated with the language in HTML hex notation.
	Color string `json:"color"`
}
Language is the programming language and the percentage of how sure linguist is about its decision.
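Because the struct carries JSON tags, a value serializes with lowercase keys. The sketch below redeclares the struct locally so it compiles on its own; the `encode` helper and the field values are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Language is a local copy of the documented struct, redeclared here so the
// example is self-contained.
type Language struct {
	Language string  `json:"language"`
	Percent  float64 `json:"percent"`
	Color    string  `json:"color"`
}

// encode is a hypothetical helper that marshals a Language to a JSON string.
func encode(l Language) (string, error) {
	b, err := json.Marshal(l)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Illustrative values, not real detection output.
	out, err := encode(Language{Language: "Go", Percent: 87.5, Color: "#00ADD8"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // {"language":"Go","percent":87.5,"color":"#00ADD8"}
}
```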
func Alias ¶ added in v0.7.0
Alias returns the language name for a given known alias.
Occasionally linguist comes up with odd language names, or determines a Java app as a "Maven POM" app, which in essence is the same thing for Draft's intent.
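The aliasing idea can be pictured as a simple name lookup. This is a sketch under assumptions: the real signature of Alias is not shown on this page, so `canonicalName` and the `aliases` table are hypothetical, with the single entry taken from the "Maven POM" example above.

```go
package main

import "fmt"

// aliases maps an odd detected name to the name a caller would rather see.
// The single entry comes from the example in the text; the table itself is
// hypothetical.
var aliases = map[string]string{
	"Maven POM": "Java",
}

// canonicalName returns the aliased name when one is known and otherwise
// returns the input unchanged.
func canonicalName(name string) string {
	if canonical, ok := aliases[name]; ok {
		return canonical
	}
	return name
}

func main() {
	fmt.Println(canonicalName("Maven POM"))
	fmt.Println(canonicalName("Go"))
}
```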
func ProcessDir ¶
ProcessDir walks through a directory and returns a list of sorted languages within that directory.
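The page does not say how the returned list is ordered. Assuming, purely for illustration, that "sorted" means highest Percent first, the ordering step could be sketched like this. `sortByPercent` is a hypothetical helper, the struct is a trimmed local copy, and the sample values are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// Language is a trimmed local copy of the documented struct.
type Language struct {
	Language string
	Percent  float64
}

// sortByPercent orders langs in place from highest to lowest Percent.
// Descending order is an assumption, not something the page confirms.
func sortByPercent(langs []Language) {
	sort.SliceStable(langs, func(i, j int) bool {
		return langs[i].Percent > langs[j].Percent
	})
}

func main() {
	langs := []Language{
		{Language: "Shell", Percent: 10},
		{Language: "Go", Percent: 70},
		{Language: "Python", Percent: 20},
	}
	sortByPercent(langs)
	for _, l := range langs {
		fmt.Printf("%s %.0f%%\n", l.Language, l.Percent)
	}
}
```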
Directories ¶
Path | Synopsis |
---|---|
tokenizer | Package tokenizer is a Go port of https://github.com/github/linguist/blob/master/lib/linguist/tokenizer.rb in their words: "Generic programming language tokenizer." |