linguist

v0.12.0-rc2
Published: Mar 26, 2018 License: MIT Imports: 17 Imported by: 23

README

linguist

A Go port of GitHub Linguist.

Updating linguist

To update to the latest version of linguist, run:

git clone https://github.com/github/linguist data/linguist
go generate .
go generate ./data
rm -rf data/linguist

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Analyse

func Analyse(contents []byte, hints []string) (language string)

Analyse returns the name of a programming language, or the empty string if one could not be determined.

It uses naive Bayesian classification on the file contents provided.

It is recommended to use LanguageByContents() instead of calling this function directly.

Obtain hints from LanguageHints().

NOTE(tso): May yield inaccurate results

func IsBinary

func IsBinary(contents []byte) bool

IsBinary checks contents for known character escape codes which frequently show up in binary files but rarely (if ever) in text.

Use this check before calling LanguageByContents to reduce the likelihood of passing in binary data, which can cause inaccurate results.
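
The exact byte patterns IsBinary looks for are not listed here. As a rough sketch of this kind of heuristic, the hypothetical looksBinary function below flags contents containing a control byte that is unusual in text. The choice of bytes is an assumption and may differ from what the package checks.

```go
package main

import "fmt"

// looksBinary is a hypothetical stand-in for a binary check. It reports true
// when contents include a byte below 0x20 other than tab, line feed,
// carriage return, or form feed.
func looksBinary(contents []byte) bool {
	for _, b := range contents {
		switch b {
		case '\t', '\n', '\r', '\f':
			continue // ordinary whitespace found in text files
		}
		if b < 0x20 {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksBinary([]byte("package main\n")))
	fmt.Println(looksBinary([]byte{0x7f, 'E', 'L', 'F', 0x01, 0x00}))
}
```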

func IsDocumentation

func IsDocumentation(path string) bool

IsDocumentation checks if path contains a filename commonly belonging to documentation.

func IsVendored

func IsVendored(path string) bool

IsVendored checks if path contains a filename commonly belonging to configuration files.

func LanguageByContents

func LanguageByContents(contents []byte, hints []string) string

LanguageByContents attempts to detect the language of a source file based on its contents and a slice of hints to the possible answer.

Obtain hints with LanguageHints().

Returns the empty string if a language could not be determined.

func LanguageByFilename

func LanguageByFilename(filename string) string

LanguageByFilename attempts to determine the language of a source file based solely on common naming conventions and file extensions from the languages.yml file provided by https://github.com/github/linguist

Returns the empty string in ambiguous or unrecognized cases.

func LanguageColor

func LanguageColor(language string) string

LanguageColor is a convenience function that returns the color associated with the language, in HTML Hex notation (e.g. "#123ABC") from the languages.yml file provided by https://github.com/github/linguist

Returns the empty string if there is no associated color for the language.

func LanguageHints

func LanguageHints(filename string) (hints []string)

LanguageHints attempts to detect all possible languages of a source file based solely on common naming conventions and file extensions from the languages.yml file provided by https://github.com/github/linguist

Intended to be used with LanguageByContents.

May return an empty slice.

func ShouldIgnoreContents

func ShouldIgnoreContents(contents []byte) bool

ShouldIgnoreContents checks if contents should not be passed to LanguageByContents.

(This simply calls IsBinary.)

func ShouldIgnoreFilename

func ShouldIgnoreFilename(filename string) bool

ShouldIgnoreFilename checks if filename should not be passed to LanguageByFilename.

(This simply calls IsVendored and IsDocumentation.)
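
The functions above seem intended to be combined into a single detection flow. The sketch below shows one plausible ordering: skip ignorable filenames, try the filename, then fall back to the contents with hints. The ordering is an assumption rather than documented behavior. To stay self-contained, the detector struct holds function fields with the signatures documented above, and stub() fills them with made-up rules. In real use the fields would be set to the package's own functions.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// detector bundles functions with the documented signatures so the flow can
// be exercised without the real package (hypothetical type).
type detector struct {
	ShouldIgnoreFilename func(filename string) bool
	ShouldIgnoreContents func(contents []byte) bool
	LanguageByFilename   func(filename string) string
	LanguageHints        func(filename string) []string
	LanguageByContents   func(contents []byte, hints []string) string
}

// detect returns the detected language, or "" if the file was skipped or no
// language could be determined.
func (d detector) detect(filename string, contents []byte) string {
	if d.ShouldIgnoreFilename(filename) {
		return ""
	}
	// Cheap check first: the filename alone is often enough.
	if lang := d.LanguageByFilename(filename); lang != "" {
		return lang
	}
	if d.ShouldIgnoreContents(contents) {
		return ""
	}
	return d.LanguageByContents(contents, d.LanguageHints(filename))
}

// stub returns a detector backed by made-up rules, for demonstration only.
func stub() detector {
	return detector{
		ShouldIgnoreFilename: func(filename string) bool {
			return strings.HasPrefix(filename, "vendor/")
		},
		ShouldIgnoreContents: func(contents []byte) bool {
			return bytes.IndexByte(contents, 0) >= 0
		},
		LanguageByFilename: func(filename string) string {
			if strings.HasSuffix(filename, ".go") {
				return "Go"
			}
			return "" // ambiguous or unrecognized
		},
		LanguageHints: func(filename string) []string {
			if strings.HasSuffix(filename, ".h") {
				return []string{"C", "C++"}
			}
			return nil
		},
		LanguageByContents: func(contents []byte, hints []string) string {
			for _, h := range hints {
				if h == "C++" && bytes.Contains(contents, []byte("class ")) {
					return "C++"
				}
			}
			return ""
		},
	}
}

func main() {
	d := stub()
	fmt.Printf("%q\n", d.detect("main.go", []byte("package main")))
	fmt.Printf("%q\n", d.detect("shape.h", []byte("class Shape {};")))
	fmt.Printf("%q\n", d.detect("vendor/lib.go", []byte("package lib")))
}
```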

Types

type Language

type Language struct {
	Language string  `json:"language"`
	Percent  float64 `json:"percent"`
	// Color represents the color associated with the language in HTML hex notation.
	Color string `json:"color"`
}

Language is a programming language together with a percentage indicating how confident linguist is in its decision.
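
Because the struct carries JSON tags, a slice of results can be serialized directly. The sketch below copies the struct locally so it runs on its own. The encode helper and the sample values, including the color, are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Language mirrors the struct documented above; it is copied here so the
// example is self-contained.
type Language struct {
	Language string  `json:"language"`
	Percent  float64 `json:"percent"`
	Color    string  `json:"color"`
}

// encode is a hypothetical helper that renders results as a JSON array.
func encode(langs []*Language) (string, error) {
	b, err := json.Marshal(langs)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Sample values are illustrative, not real detection output.
	out, err := encode([]*Language{{Language: "Go", Percent: 87.5, Color: "#375EAB"}})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```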

func Alias

func Alias(lang *Language) *Language

Alias returns the language name for a given known alias.

Occasionally linguist comes up with odd language names, or classifies a Java app as a "Maven POM" app, which is in essence the same thing for Draft's purposes.

func ProcessDir

func ProcessDir(dirname string) ([]*Language, error)

ProcessDir walks through a directory and returns a sorted list of the languages found within that directory.
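
How ProcessDir weighs and orders its results is not spelled out here. As a sketch of the general idea only, the code below walks a directory, tallies file sizes per language using a made-up extension table, and sorts by descending share. The byExt table, the byte-based weighting, the sort order, and the trimmed-down Language type are all assumptions.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// Language is a trimmed-down copy of the documented struct.
type Language struct {
	Language string
	Percent  float64
}

// byExt is a made-up extension table, standing in for real detection.
var byExt = map[string]string{".go": "Go", ".py": "Python", ".rb": "Ruby"}

// summarize converts per-language byte totals into percentages, largest
// first, breaking ties by name.
func summarize(sizes map[string]int64) []*Language {
	var total int64
	for _, n := range sizes {
		total += n
	}
	langs := make([]*Language, 0, len(sizes))
	if total == 0 {
		return langs
	}
	for name, n := range sizes {
		pct := 100 * float64(n) / float64(total)
		langs = append(langs, &Language{Language: name, Percent: pct})
	}
	sort.Slice(langs, func(i, j int) bool {
		if langs[i].Percent != langs[j].Percent {
			return langs[i].Percent > langs[j].Percent
		}
		return langs[i].Language < langs[j].Language
	})
	return langs
}

// processDir walks dirname and tallies file sizes per recognized language.
func processDir(dirname string) ([]*Language, error) {
	sizes := map[string]int64{}
	err := filepath.WalkDir(dirname, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		lang, ok := byExt[filepath.Ext(path)]
		if !ok {
			return nil // unrecognized extension
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		sizes[lang] += info.Size()
		return nil
	})
	if err != nil {
		return nil, err
	}
	return summarize(sizes), nil
}

func main() {
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	langs, err := processDir(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, l := range langs {
		fmt.Printf("%-10s %.1f%%\n", l.Language, l.Percent)
	}
}
```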

Directories

Path Synopsis
Package tokenizer is a Go port of https://github.com/github/linguist/blob/master/lib/linguist/tokenizer.rb (in their words: "Generic programming language tokenizer").
