token

package
v0.0.6
Published: Sep 17, 2021 License: Apache-2.0 Imports: 2 Imported by: 0

Documentation

Overview

Package token tokenizes data for dupi.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	Name string
}

Config describes tokenizer configurations.

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig returns the default tokenizer config for dupi.

type T

type T struct {
	Tag Tag
	Lit []byte
	Pos uint32
}

T represents a token.

func Tokenize

func Tokenize(dst []T, d []byte, offset uint32) []T

Tokenize is a tokenizer function. Its signature matches TokenizerFunc.
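
The documentation does not spell out how Tokenize splits its input, so the following is only a minimal sketch of a function with the same signature. It assumes, without confirmation from the source, that tokens are appended to dst and that offset is added to each token's position. The types are local copies of the declarations on this page, and sketchTokenize and isWordByte are hypothetical names, not part of the package.

```go
package main

import "fmt"

// Local copies of the documented declarations, for illustration only.
type Tag int

const (
	Word Tag = 35 + iota
	Other
	Eod
)

type T struct {
	Tag Tag
	Lit []byte
	Pos uint32
}

// isWordByte reports whether b is an ASCII letter or digit.
// This word definition is an assumption made for the sketch.
func isWordByte(b byte) bool {
	return b >= 'a' && b <= 'z' || b >= 'A' && b <= 'Z' || b >= '0' && b <= '9'
}

// sketchTokenize is a hypothetical tokenizer with the same signature as
// Tokenize. It splits d into alternating runs of word and non-word bytes,
// appends one token per run to dst, and records offset plus the run's
// start index as the token position.
func sketchTokenize(dst []T, d []byte, offset uint32) []T {
	i := 0
	for i < len(d) {
		start := i
		word := isWordByte(d[i])
		for i < len(d) && isWordByte(d[i]) == word {
			i++
		}
		tag := Other
		if word {
			tag = Word
		}
		dst = append(dst, T{Tag: tag, Lit: d[start:i], Pos: offset + uint32(start)})
	}
	return dst
}

func main() {
	toks := sketchTokenize(nil, []byte("hello, dupi"), 100)
	for _, t := range toks {
		fmt.Printf("%d %q %d\n", t.Tag, t.Lit, t.Pos)
	}
}
```

Passing a reused dst slice lets a caller avoid reallocating on every call, which is one plausible reason for the append-style signature.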

func (*T) String

func (t *T) String() string

type Tag

type Tag int

Tag is an enumerated value associated with a token.

const (
	Word Tag = 35 + iota
	Other
	Eod
)
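
Because iota starts at 0 in the first line of a const block and increases by one per line, the three tags evaluate to 35, 36, and 37. The short program below, which uses a local copy of the declaration rather than the real package, checks that arithmetic.

```go
package main

import "fmt"

// Tag is a local copy of the documented type, for illustration only.
type Tag int

const (
	Word  Tag = 35 + iota // iota is 0 on this line, so Word is 35
	Other                 // the expression repeats with iota 1, giving 36
	Eod                   // iota 2, giving 37
)

func main() {
	fmt.Println(int(Word), int(Other), int(Eod)) // prints "35 36 37"
}
```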

func (Tag) String

func (t Tag) String() string

type TokenizerFunc

type TokenizerFunc func(dst []T, dat []byte, offset uint32) []T

TokenizerFunc is the type of a function used for tokenizing document data.

func FromConfig

func FromConfig(cfg *Config) (TokenizerFunc, error)

FromConfig attempts to create a tokenizer function from a configuration.
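
The page does not list which Config.Name values FromConfig accepts, so the sketch below only illustrates the call pattern: obtain a TokenizerFunc, check the error, then call the function. Everything in it is a local stand-in. fromConfig, wholeInput, and the name "whole" are hypothetical and do not describe the package's actual behavior.

```go
package main

import "fmt"

// Local copies of the documented declarations, for illustration only.
type Tag int

const (
	Word Tag = 35 + iota
	Other
	Eod
)

type T struct {
	Tag Tag
	Lit []byte
	Pos uint32
}

type TokenizerFunc func(dst []T, dat []byte, offset uint32) []T

type Config struct {
	Name string
}

// wholeInput is a hypothetical tokenizer that emits the entire input
// as a single Word token.
func wholeInput(dst []T, dat []byte, offset uint32) []T {
	return append(dst, T{Tag: Word, Lit: dat, Pos: offset})
}

// fromConfig is a hypothetical stand-in for FromConfig. It selects a
// tokenizer by name and returns an error for names it does not know.
func fromConfig(cfg *Config) (TokenizerFunc, error) {
	if cfg == nil {
		return nil, fmt.Errorf("nil config")
	}
	switch cfg.Name {
	case "whole": // invented name, not taken from the package
		return wholeInput, nil
	}
	return nil, fmt.Errorf("unknown tokenizer %q", cfg.Name)
}

func main() {
	tok, err := fromConfig(&Config{Name: "whole"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(tok(nil, []byte("abc"), 0)))

	// An unrecognized name yields an error instead of a tokenizer.
	if _, err := fromConfig(&Config{Name: "nope"}); err != nil {
		fmt.Println("error:", err)
	}
}
```

With the real package, the configuration would more likely come from DefaultConfig than from a hand-written literal.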
