Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type Tokenizer
type Tokenizer struct {
// contains filtered or unexported fields
}
Tokenizer is a SentencePiece tokenizer.
func NewFromModelFolder
NewFromModelFolder returns a new Tokenizer.
func (*Tokenizer) Detokenize
Detokenize flattens and merges a list of tokens into a single string.
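To illustrate the merging step, here is a minimal standalone sketch based on the general SentencePiece convention, in which the meta symbol `▁` (U+2581) marks a word boundary. It is not this package's implementation: the helper name `detokenize` and its signature are assumptions made for the example, and the real method may handle special tokens differently.

```go
package main

import (
	"fmt"
	"strings"
)

// detokenize is a hypothetical helper, not part of this package.
// It concatenates the pieces, turns every SentencePiece word-boundary
// marker (U+2581) into a space, and drops the single leading space
// produced by the marker on the first piece.
func detokenize(pieces []string) string {
	joined := strings.Join(pieces, "")
	joined = strings.ReplaceAll(joined, "\u2581", " ")
	return strings.TrimPrefix(joined, " ")
}

func main() {
	fmt.Println(detokenize([]string{"▁Hello", "▁wor", "ld"})) // prints "Hello world"
}
```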
func (*Tokenizer) IDsToTokens
IDsToTokens returns a list of string tokens from a list of token IDs. It panics if a token ID is not found in the vocabulary.
func (*Tokenizer) TokensToIDs
TokensToIDs returns a list of token IDs from a list of string tokens. If a token is not found in the vocabulary, the unknown token is used in its place; it panics if no unknown token is found either.
Directories
Path | Synopsis |
---|---|
internal | |
sentencepiece | Package sentencepiece implements the SentencePiece encoder (Kudo and Richardson, 2018). |