attribute

package
v0.0.0-...-53ff736
Published: Mar 27, 2024 License: Apache-2.0 Imports: 3 Imported by: 0

Documentation

Constants

const (
	ClassBytesTerm         = "BytesTerm"
	ClassCharTerm          = "CharTerm"
	ClassOffset            = "Offset"
	ClassPositionIncrement = "PositionIncrement"
	ClassPayload           = "Payload"
	ClassPositionLength    = "PositionLength"
	ClassTermFrequency     = "TermFrequency"
	ClassTermToBytesRef    = "TermToBytesRef"
	ClassType              = "Type"
)
const (
	DEFAULT_TYPE = "word"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Attribute

type Attribute interface {
	Interfaces() []string
	Reset() error
	CopyTo(target Attribute) error
	Clone() Attribute
}

Attribute is the base interface for attributes that can be added to an AttributeSourceV2. Attributes are used to add data in a dynamic, yet type-safe way to a source of usually streamed objects.
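
As an illustration of what the interface asks of an implementation, here is a minimal sketch. The keywordAttr type and the "Keyword" class name are hypothetical and are not part of this package; the Attribute interface is repeated only so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Attribute mirrors the interface documented above so the sketch is self-contained.
type Attribute interface {
	Interfaces() []string
	Reset() error
	CopyTo(target Attribute) error
	Clone() Attribute
}

// keywordAttr is a hypothetical attribute (not part of this package) that
// marks a token as a keyword.
type keywordAttr struct {
	isKeyword bool
}

// Interfaces reports the class names this attribute answers to.
// "Keyword" is an assumed name chosen for this example.
func (k *keywordAttr) Interfaces() []string { return []string{"Keyword"} }

// Reset restores the default value.
func (k *keywordAttr) Reset() error {
	k.isKeyword = false
	return nil
}

// CopyTo copies the state into target, which must have the same concrete type.
func (k *keywordAttr) CopyTo(target Attribute) error {
	t, ok := target.(*keywordAttr)
	if !ok {
		return errors.New("target is not a *keywordAttr")
	}
	t.isKeyword = k.isKeyword
	return nil
}

// Clone returns an independent copy of the attribute.
func (k *keywordAttr) Clone() Attribute {
	return &keywordAttr{isKeyword: k.isKeyword}
}

func main() {
	a := &keywordAttr{isKeyword: true}
	b := a.Clone()
	_ = a.Reset()
	// The clone keeps its value after the original is reset.
	fmt.Println(a.isKeyword, b.(*keywordAttr).isKeyword) // false true
}
```

Rejecting a mismatched target in CopyTo is one possible design choice; the package's own implementations may behave differently.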

type BytesTermAttr

type BytesTermAttr interface {
	Attribute

	GetBytes() []byte

	// SetBytes
	// Sets the bytes of the term
	SetBytes(bytes []byte) error

	Reset() error
}

BytesTermAttr can be used if you have the raw term bytes to be indexed. It can serve as a replacement for CharTermAttr if binary terms should be indexed.

type CharTermAttr

type CharTermAttr interface {
	Attribute

	GetBytes() []byte

	// GetString
	// Returns the term text as a string. Go strings are immutable, so the result cannot be
	// altered in place; use AppendString or AppendRune to change the term text.
	GetString() string

	// AppendString
	// Appends the specified string to this character sequence.
	// The characters of the string argument are appended, in order, increasing the length of this sequence by the
	// length of the argument.
	AppendString(s string) error

	AppendRune(r rune) error

	// Reset
	// Sets the length of the termBuffer to zero. Use this method before appending contents
	// with AppendString or AppendRune.
	Reset() error
}

CharTermAttr is the term text of a Token.
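
The reset-then-append pattern described in the method comments can be sketched as follows. The charTerm type is a hypothetical, simplified stand-in and not this package's implementation; it assumes a rune slice as the buffer and assumes GetBytes returns the UTF-8 encoding of the term.

```go
package main

import "fmt"

// charTerm is a hypothetical stand-in for a CharTermAttr implementation.
// It only covers the term-text methods, not the embedded Attribute interface.
type charTerm struct {
	buf []rune
}

// AppendString appends every character of s to the term.
func (c *charTerm) AppendString(s string) error {
	c.buf = append(c.buf, []rune(s)...)
	return nil
}

// AppendRune appends a single character to the term.
func (c *charTerm) AppendRune(r rune) error {
	c.buf = append(c.buf, r)
	return nil
}

// GetString returns the current term text.
func (c *charTerm) GetString() string { return string(c.buf) }

// GetBytes returns the term as UTF-8 bytes (an assumption of this sketch).
func (c *charTerm) GetBytes() []byte { return []byte(string(c.buf)) }

// Reset truncates the term to zero length while keeping the allocated buffer.
func (c *charTerm) Reset() error {
	c.buf = c.buf[:0]
	return nil
}

func main() {
	term := &charTerm{}
	_ = term.AppendString("caf")
	_ = term.AppendRune('é')
	// Four characters, five UTF-8 bytes.
	fmt.Println(term.GetString(), len(term.GetBytes())) // café 5

	_ = term.Reset()
	fmt.Println(len(term.GetString())) // 0
}
```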

type DefaultAttributeFactory

type DefaultAttributeFactory struct {
}

func (DefaultAttributeFactory) CreateAttributeInstance

func (d DefaultAttributeFactory) CreateAttributeInstance(class string) (Attribute, error)

type Factory

type Factory interface {
	// CreateAttributeInstance returns an Attribute implementation for the supplied attribute class name.
	CreateAttributeInstance(class string) (Attribute, error)
}
var (
	DEFAULT_ATTRIBUTE_FACTORY Factory = &DefaultAttributeFactory{}
)
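
One way to picture a Factory is a lookup from class name to constructor. The sketch below is not how DefaultAttributeFactory is implemented; registryFactory and typeAttr are hypothetical names, and the Attribute interface is trimmed to one method to keep the example short.

```go
package main

import "fmt"

// Attribute is trimmed to a single method for this sketch; the interface
// documented above has more methods.
type Attribute interface {
	Reset() error
}

// typeAttr is a hypothetical attribute used only for this example.
type typeAttr struct{ name string }

// Reset restores the default type name.
func (t *typeAttr) Reset() error {
	t.name = "word"
	return nil
}

// registryFactory is a hypothetical Factory that looks up constructors by class name.
type registryFactory struct {
	constructors map[string]func() Attribute
}

// CreateAttributeInstance builds a new attribute for class, or returns an
// error when no constructor is registered under that name.
func (f *registryFactory) CreateAttributeInstance(class string) (Attribute, error) {
	newAttr, ok := f.constructors[class]
	if !ok {
		return nil, fmt.Errorf("unknown attribute class %q", class)
	}
	return newAttr(), nil
}

func main() {
	f := &registryFactory{constructors: map[string]func() Attribute{
		"Type": func() Attribute { return &typeAttr{name: "word"} },
	}}

	if _, err := f.CreateAttributeInstance("Type"); err == nil {
		fmt.Println("created Type attribute")
	}
	if _, err := f.CreateAttributeInstance("Nope"); err != nil {
		fmt.Println(err) // unknown attribute class "Nope"
	}
}
```

Returning an error for an unregistered name matches the (Attribute, error) signature; whether the default factory does the same is not stated on this page.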

type OffsetAttr

type OffsetAttr interface {
	// StartOffset
	// Returns this Token's starting offset, the position of the first character corresponding
	// to this token in the source text.
	// Note that the difference between endOffset() and startOffset() may not be equal to termText.length(),
	// as the term text may have been altered by a stemmer or some other filter.
	// See Also: setOffset(int, int)
	StartOffset() int

	// EndOffset
	// Returns this Token's ending offset, one greater than the position of the last character
	// corresponding to this token in the source text. The length of the token in the source text
	// is (endOffset() - startOffset()).
	// See Also: setOffset(int, int)
	EndOffset() int

	// SetOffset
	// Set the starting and ending offset.
	// Throws: IllegalArgumentException – If startOffset or endOffset are negative, or if startOffset is
	// greater than endOffset
	// See Also: startOffset(), endOffset()
	SetOffset(startOffset, endOffset int) error
}

OffsetAttr is the start and end character offset of a Token.
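
The offset rules in the comments above can be sketched like this. The offset type is hypothetical and is not this package's implementation; it assumes that the documented IllegalArgumentException cases surface as a returned error in Go, and that offsets index bytes of an ASCII source text.

```go
package main

import (
	"errors"
	"fmt"
)

// offset is a hypothetical stand-in for an OffsetAttr implementation.
type offset struct {
	start, end int
}

// StartOffset returns the position of the token's first character in the source text.
func (o *offset) StartOffset() int { return o.start }

// EndOffset returns one past the position of the token's last character.
func (o *offset) EndOffset() int { return o.end }

// SetOffset rejects negative offsets and start > end, as the comment on
// SetOffset describes. Checking start >= 0 and end >= start covers both.
func (o *offset) SetOffset(startOffset, endOffset int) error {
	if startOffset < 0 || endOffset < startOffset {
		return errors.New("offsets must be non-negative and startOffset must not exceed endOffset")
	}
	o.start, o.end = startOffset, endOffset
	return nil
}

func main() {
	text := "The running fox"
	stemmed := "run" // term text after a hypothetical stemmer

	o := &offset{}
	_ = o.SetOffset(4, 11)

	// The offsets still point at the original word, so their difference
	// (7) is not the length of the stemmed term (3).
	fmt.Println(text[o.StartOffset():o.EndOffset()], o.EndOffset()-o.StartOffset(), len(stemmed)) // running 7 3
}
```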

type PayloadAttr

type PayloadAttr interface {
	Attribute

	// GetPayload Returns this Token's payload.
	// See Also: setPayload(BytesRef)
	GetPayload() []byte

	// SetPayload Sets this Token's payload.
	// See Also: getPayload()
	SetPayload(payload []byte) error

	Reset() error
}

PayloadAttr is the payload of a Token. The payload is stored in the index at each position and can be used to influence scoring when using payload-based queries. NOTE: because the payload is stored at each position, it is usually best to use the minimum number of bytes necessary. Some codec implementations may optimize payload storage when all payloads have the same length. See Also: org.apache.lucene.index.PostingsEnum
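
To make the advice about compact, fixed-length payloads concrete, the sketch below packs a per-token boost into four bytes. The encodeBoost and decodeBoost helpers and the big-endian float32 layout are assumptions made for illustration; this page does not prescribe any payload format.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeBoost packs a float32 boost into a fixed four-byte payload.
// Hypothetical helper; the layout is an assumption of this sketch.
func encodeBoost(boost float32) []byte {
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, math.Float32bits(boost))
	return buf
}

// decodeBoost reverses encodeBoost and rejects payloads of the wrong length.
func decodeBoost(payload []byte) (float32, error) {
	if len(payload) != 4 {
		return 0, fmt.Errorf("expected 4 payload bytes, got %d", len(payload))
	}
	return math.Float32frombits(binary.BigEndian.Uint32(payload)), nil
}

func main() {
	payload := encodeBoost(2.5)
	boost, err := decodeBoost(payload)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(payload), boost) // 4 2.5
}
```

A byte slice produced this way could be handed to SetPayload; every token then carries a payload of the same length.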

type PositionIncrAttr

type PositionIncrAttr interface {

	// SetPositionIncrement
	// Set the position increment. The default value is one.
	// positionIncrement: the distance from the prior term
	SetPositionIncrement(positionIncrement int) error

	// GetPositionIncrement
	// Returns the position increment of this Token.
	GetPositionIncrement() int
}

PositionIncrAttr determines the position of this token relative to the previous Token in a TokenStream; it is used in phrase searching. The default value is one. Some common uses for this are:

  • Set it to zero to put multiple terms in the same position. This is useful if, e.g., a word has multiple stems. Searches for phrases including either stem will match. In this case, all but the first stem's increment should be set to zero: the increment of the first instance should be one. Repeating a token with an increment of zero can also be used to boost the scores of matches on that token.
  • Set it to values greater than one to inhibit exact phrase matches. If, for example, one does not want phrases to match across removed stop words, then one could build a stop word filter that removes stop words and also sets the increment to the number of stop words removed before each non-stop word. Then exact phrase queries will only match when the terms occur with no intervening stop words.

See Also: org.apache.lucene.index.PostingsEnum
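
The two uses listed above can be traced with a small sketch that turns increments into absolute positions. The token struct and the positions function are hypothetical, and the sketch assumes the first token with an increment of one lands at position zero.

```go
package main

import "fmt"

// token is a hypothetical pairing of a term with its position increment.
type token struct {
	term              string
	positionIncrement int
}

// positions accumulates increments into absolute positions. Starting at -1
// means a first token with increment one is placed at position 0
// (an assumption of this sketch).
func positions(tokens []token) []int {
	out := make([]int, 0, len(tokens))
	pos := -1
	for _, t := range tokens {
		pos += t.positionIncrement
		out = append(out, pos)
	}
	return out
}

func main() {
	// "fast" is injected as a synonym of "quick": increment zero stacks it
	// on the same position, so phrases with either word match.
	synonyms := []token{{"quick", 1}, {"fast", 0}, {"fox", 1}}
	fmt.Println(positions(synonyms)) // [0 0 1]

	// "fox jumps over the dog" with "over" and "the" removed: "dog" keeps
	// its distance of three positions from "jumps", leaving a gap that
	// blocks the exact phrase "jumps dog".
	stopped := []token{{"fox", 1}, {"jumps", 1}, {"dog", 3}}
	fmt.Println(positions(stopped)) // [0 1 4]
}
```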

type PositionLengthAttr

type PositionLengthAttr interface {
	// SetPositionLength
	// Set the position length of this Token.
	// The default value is one.
	// Params: positionLength – how many positions this token spans.
	// Throws: IllegalArgumentException – if positionLength is zero or negative.
	// See Also: getPositionLength()
	SetPositionLength(positionLength int) error

	// GetPositionLength
	// Returns the position length of this Token.
	// See Also: setPositionLength
	GetPositionLength() int
}

type Source

type Source struct {
	// contains filtered or unexported fields
}

func NewSource

func NewSource() *Source

func (*Source) BytesTerm

func (r *Source) BytesTerm() BytesTermAttr

func (*Source) CharTerm

func (r *Source) CharTerm() CharTermAttr

func (*Source) Offset

func (r *Source) Offset() OffsetAttr

func (*Source) PackedTokenAttribute

func (r *Source) PackedTokenAttribute() PackedTokenAttr

func (*Source) Payload

func (r *Source) Payload() PayloadAttr

func (*Source) PositionIncrement

func (r *Source) PositionIncrement() PositionIncrAttr

func (*Source) PositionLength

func (r *Source) PositionLength() PositionLengthAttr

func (*Source) Reset

func (r *Source) Reset() error

func (*Source) Term2Bytes

func (r *Source) Term2Bytes() Term2BytesAttr

func (*Source) TermFrequency

func (r *Source) TermFrequency() TermFreqAttr

func (*Source) Type

func (r *Source) Type() TypeAttr

type Term2BytesAttr

type Term2BytesAttr interface {
	Attribute

	GetBytes() []byte
}

Term2BytesAttr is requested by TermsHashPerField to index the contents. It can be used to customize the final byte encoding of terms. Consumers of this attribute call GetBytes() for each term.

type TermFreqAttr

type TermFreqAttr interface {

	// SetTermFrequency
	// Set the custom term frequency of the current term within one document.
	SetTermFrequency(termFrequency int) error

	// GetTermFrequency
	// Returns the custom term frequency.
	GetTermFrequency() int
}

TermFreqAttr sets the custom term frequency of a term within one document. If this attribute is present in your analysis chain for a given field, that field must be indexed with IndexOptions.DOCS_AND_FREQS.

type TypeAttr

type TypeAttr interface {

	// Type
	// Returns this Token's lexical type. Defaults to "word".
	// See Also: SetType(string)
	Type() string

	// SetType
	// Set the lexical type.
	// See Also: Type()
	SetType(_type string)
}

TypeAttr is a Token's lexical type. The default value is "word".
