tokenizer

package
v1.7.3 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 11, 2019 License: Apache-2.0 Imports: 2 Imported by: 0

Documentation

Overview

Package tokenizer implements file tokenization used by the enry content classifier. This package is an implementation detail of enry and should not be imported by other packages.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Tokenize

func Tokenize(content []byte) []string

Tokenize returns language-agnostic lexical tokens from content. The tokens returned should match what the Linguist library returns. At most the first 100KB of content are tokenized.
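Because this package is internal to enry and not meant to be imported, the sketch below does not call the real `Tokenize`. It is a hypothetical, simplified stand-in that illustrates the documented contract: truncate input to the first 100KB, then extract language-agnostic lexical tokens (here, only identifier-like runs; the actual implementation matches Linguist's richer token set).

```go
package main

import (
	"fmt"
	"regexp"
)

// maxBytes mirrors the documented cap: at most the first 100KB are tokenized.
const maxBytes = 100 * 1024

// tokenRe matches identifier-like runs: a letter or underscore followed by
// letters, digits, or underscores. The real tokenizer also emits operators,
// SGML tags, and other token classes; this is a deliberate simplification.
var tokenRe = regexp.MustCompile(`[A-Za-z_][A-Za-z0-9_]*`)

// tokenize is a hypothetical stand-in for tokenizer.Tokenize, not the enry
// implementation: it truncates content and returns identifier-like tokens.
func tokenize(content []byte) []string {
	if len(content) > maxBytes {
		content = content[:maxBytes]
	}
	return tokenRe.FindAllString(string(content), -1)
}

func main() {
	src := []byte("func add(a, b int) int { return a + b }")
	fmt.Println(tokenize(src))
}
```

Note the cap is applied before tokenization, so a token straddling the 100KB boundary would be clipped; callers classifying large files only ever pay for the first 100KB of input.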

Types

This section is empty.
