package search

v0.0.0-...-c16d89f
Published: Jun 5, 2024 License: Apache-2.0 Imports: 23 Imported by: 4

Documentation

Index

Constants

View Source
const (
	// OccurMust
	// Use this operator for clauses that must appear in the matching documents.
	// Equivalent to AND.
	OccurMust = Occur("+")

	// OccurFilter
	// Like OccurMust except that these clauses do not participate in scoring.
	OccurFilter = Occur("#")

	// OccurShould
	// Use this operator for clauses that should appear in the matching documents.
	// For a BooleanQuery with no OccurMust clauses one or more OccurShould clauses must match
	// a document for the BooleanQuery to match.
	// See Also: BooleanQuery.BooleanQueryBuilder.setMinimumNumberShouldMatch
	// Equivalent to OR.
	OccurShould = Occur("")

	// OccurMustNot
	// Use this operator for clauses that must not appear in the matching documents.
	// Note that it is not possible to search for queries that only consist of a OccurMustNot clause.
	// These clauses do not contribute to the score of documents.
	// Equivalent to NOT.
	OccurMustNot = Occur("-")
)
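For illustration, a minimal sketch of combining these operators through BooleanQueryBuilder (documented below); the NewTermQuery and index.NewTerm constructors are assumptions about names defined elsewhere in the module:

```
// match documents that contain "lucene" and do not contain "deprecated"
builder := NewBooleanQueryBuilder()
builder.AddQuery(NewTermQuery(index.NewTerm("body", []byte("lucene"))), OccurMust)        // constructor names assumed
builder.AddQuery(NewTermQuery(index.NewTerm("body", []byte("deprecated"))), OccurMustNot) // constructor names assumed
query, err := builder.Build()
```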
View Source
const (
	EQUAL_TO                 = TotalHitsRelation(iota) // The total hit count is equal to value.
	GREATER_THAN_OR_EQUAL_TO                           // The total hit count is greater than or equal to value.
)
View Source
const (
	ADVANCE_COST = 10
)
View Source
const (
	BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16
)
View Source
const (
	// DEFAULT_INTERVAL
	// we use 2^10-1 to check the remainder with a bitwise operation
	DEFAULT_INTERVAL = 0x3ff
)
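A one-line sketch of the trick the comment describes: because the interval is a power of two, the remainder check reduces to a mask and avoids an integer division:

```
// i%1024 == 0 is equivalent to i&DEFAULT_INTERVAL == 0, since DEFAULT_INTERVAL = 2^10-1
if i&DEFAULT_INTERVAL == 0 {
	// i is a multiple of 1024
}
```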
View Source
const (
	TOTAL_HITS_THRESHOLD = 1000
)

Variables

View Source
var (
	// COMPLETE
	// Produced scorers will allow visiting all matches and get their score.
	COMPLETE = NewScoreMode(true, true)

	// COMPLETE_NO_SCORES
	// Produced scorers will allow visiting all matches but scores won't be available.
	COMPLETE_NO_SCORES = NewScoreMode(true, false)

	// TOP_SCORES
	// Produced scorers will optionally allow skipping over non-competitive hits using the Scorer.SetMinCompetitiveScore(float) API.
	TOP_SCORES = NewScoreMode(false, true)

	// TOP_DOCS
	// ScoreMode for top field collectors that can provide their own iterators, to optionally allow to skip for non-competitive docs
	TOP_DOCS = NewScoreMode(false, false)

	// TOP_DOCS_WITH_SCORES
	// ScoreMode for top field collectors that can provide their own iterators, to optionally allow to skip for non-competitive docs. This mode is used when there is a secondary sort by _score.
	TOP_DOCS_WITH_SCORES = NewScoreMode(false, true)
)
View Source
var EMPTY_TOPDOCS = &BaseTopDocs{
	totalHits: NewTotalHits(0, EQUAL_TO),
	scoreDocs: make([]ScoreDoc, 0),
}
View Source
var (
	LENGTH_TABLE [256]float64
)

Functions

func AsDocIdSetIterator

func AsDocIdSetIterator(twoPhaseIterator TwoPhaseIterator) types.DocIdSetIterator

func GetMaxClauseCount

func GetMaxClauseCount() int

GetMaxClauseCount Return the maximum number of clauses permitted, 1024 by default. Attempts to add more than the permitted number of clauses cause BooleanQuery.TooManyClauses to be thrown.

See Also: setMaxClauseCount(int)

func GetTermsEnum

func GetTermsEnum(r *automaton.CompiledAutomaton, terms index.Terms) (index.TermsEnum, error)

func IntersectIterators

func IntersectIterators(iterators []types.DocIdSetIterator) types.DocIdSetIterator

IntersectIterators Create a conjunction over the provided DocIdSetIterators. Note that the returned DocIdSetIterator might leverage two-phase iteration in which case it is possible to retrieve the TwoPhaseIterator using TwoPhaseIterator.unwrap.
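A minimal sketch using the IntArrayDocIdSetIterator documented below; the NO_MORE_DOCS sentinel name in the types package is an assumption:

```
a := NewIntArrayDocIdSetIterator([]int{1, 3, 5, 7})
b := NewIntArrayDocIdSetIterator([]int{3, 5, 8})
conj := IntersectIterators([]types.DocIdSetIterator{a, b})
for {
	doc, err := conj.NextDoc()
	if err != nil || doc == types.NO_MORE_DOCS { // sentinel name assumed
		break
	}
	fmt.Println(doc) // prints 3, then 5
}
```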

Types

type AutomatonQuery

type AutomatonQuery struct {
	// contains filtered or unexported fields
}

AutomatonQuery A Query that will match terms against a finite-state machine.

This query will match documents that contain terms accepted by a given finite-state machine. The automaton can be constructed with the org.apache.lucene.util.automaton API. Alternatively, it can be created from a regular expression with RegexpQuery or from the standard Lucene wildcard syntax with WildcardQuery.

When the query is executed, it will create an equivalent DFA of the finite-state machine, and will enumerate the term dictionary in an intelligent way to reduce the number of comparisons. For example: the regular expression of [dl]og? will make approximately four comparisons: do, dog, lo, and log. lucene.experimental

func NewAutomatonQuery

func NewAutomatonQuery(term *index.Term, auto *automaton.Automaton, determinizeWorkLimit int, isBinary bool) *AutomatonQuery

func (*AutomatonQuery) CreateWeight

func (r *AutomatonQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*AutomatonQuery) GetField

func (r *AutomatonQuery) GetField() string

func (*AutomatonQuery) GetRewriteMethod

func (r *AutomatonQuery) GetRewriteMethod() RewriteMethod

func (*AutomatonQuery) GetTermsEnum

func (r *AutomatonQuery) GetTermsEnum(terms index.Terms, atts *attribute.Source) (index.TermsEnum, error)

func (*AutomatonQuery) Rewrite

func (r *AutomatonQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*AutomatonQuery) SetRewriteMethod

func (r *AutomatonQuery) SetRewriteMethod(method RewriteMethod)

func (*AutomatonQuery) String

func (r *AutomatonQuery) String(field string) string

func (*AutomatonQuery) Visit

func (r *AutomatonQuery) Visit(visitor QueryVisitor) error

type BM25Scorer

type BM25Scorer struct {
	*index.BaseSimScorer
	// contains filtered or unexported fields
}

func NewBM25Scorer

func NewBM25Scorer(boost, k1, b float64, idf *types.Explanation, avgdl float64, cache []float64) *BM25Scorer

func (*BM25Scorer) Score

func (b *BM25Scorer) Score(freq float64, norm int64) float64

type BM25Similarity

type BM25Similarity struct {
	// contains filtered or unexported fields
}

BM25Similarity BM25 Similarity. Introduced in Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford. Okapi at TREC-3. In Proceedings of the Third Text REtrieval Conference (TREC 1994). Gaithersburg, USA, November 1994.

func NewBM25Similarity

func NewBM25Similarity() (*BM25Similarity, error)

NewBM25Similarity BM25 with these default values: k1 = 1.2, b = 0.75.

func NewBM25SimilarityV1

func NewBM25SimilarityV1(k1, b float64) (*BM25Similarity, error)

NewBM25SimilarityV1 BM25 with the supplied parameter values. k1: Controls non-linear term frequency normalization (saturation). b: Controls to what degree document length normalizes tf values. Throws: IllegalArgumentException – if k1 is infinite or negative, or if b is not within the range [0..1]
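For example, a sketch that installs BM25 with stronger length normalization than the default on an existing *IndexSearcher:

```
sim, err := NewBM25SimilarityV1(1.2, 0.9) // k1 = 1.2, b = 0.9
if err != nil {
	// k1 must be finite and non-negative; b must be within [0..1]
}
searcher.SetSimilarity(sim)
```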

func NewCastBM25Similarity

func NewCastBM25Similarity() (*BM25Similarity, error)

func (*BM25Similarity) ComputeNorm

func (b *BM25Similarity) ComputeNorm(state *index.FieldInvertState) int64

func (*BM25Similarity) GetB

func (b *BM25Similarity) GetB() float64

func (*BM25Similarity) GetDiscountOverlaps

func (b *BM25Similarity) GetDiscountOverlaps() bool

GetDiscountOverlaps Returns true if overlap tokens are discounted from the document's length. See Also: setDiscountOverlaps

func (*BM25Similarity) GetK1

func (b *BM25Similarity) GetK1() float64

func (*BM25Similarity) IdfExplain

func (b *BM25Similarity) IdfExplain(
	collectionStats *types.CollectionStatistics, termStats *types.TermStatistics) *types.Explanation

IdfExplain Computes a score factor for a simple term and returns an explanation for that score factor. The default implementation uses:

idf(docFreq, docCount);

Note that CollectionStatistics.docCount() is used instead of Reader#numDocs() because TermStatistics.docFreq() is also used, and when the latter is inaccurate, so is CollectionStatistics.docCount(), and in the same direction. In addition, CollectionStatistics.docCount() does not skew when fields are sparse.
Params:
collectionStats – collection-level statistics
termStats – term-level statistics for the term
Returns: an Explain object that includes both an idf score factor and an explanation for the term.
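For reference, the idf formula used by Lucene's BM25 expands to:

idf(docFreq, docCount) = ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))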

func (*BM25Similarity) IdfExplainV1

func (b *BM25Similarity) IdfExplainV1(
	collectionStats *types.CollectionStatistics, termStats []types.TermStatistics) *types.Explanation

IdfExplainV1 Computes a score factor for a phrase. The default implementation sums the idf factor for each term in the phrase. collectionStats: collection-level statistics termStats: term-level statistics for the terms in the phrase Returns: an Explain object that includes both an idf score factor for the phrase and an explanation for each term.

func (*BM25Similarity) Scorer

func (b *BM25Similarity) Scorer(boost float64,
	collectionStats *types.CollectionStatistics, termStats []types.TermStatistics) index.SimScorer

func (*BM25Similarity) SetDiscountOverlaps

func (b *BM25Similarity) SetDiscountOverlaps(v bool)

SetDiscountOverlaps Sets whether overlap tokens (Tokens with 0 position increment) are ignored when computing norm. By default this is true, meaning overlap tokens do not count when computing norms.

func (*BM25Similarity) String

func (b *BM25Similarity) String() string

type BaseBulkScorer

type BaseBulkScorer struct {
	FnScoreRange func(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)
	FnCost       func() int64
}

func (*BaseBulkScorer) Cost

func (b *BaseBulkScorer) Cost() int64

func (*BaseBulkScorer) Score

func (b *BaseBulkScorer) Score(collector LeafCollector, acceptDocs util.Bits) error

func (*BaseBulkScorer) ScoreRange

func (b *BaseBulkScorer) ScoreRange(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)

type BaseFieldDoc

type BaseFieldDoc struct {
	// contains filtered or unexported fields
}

func NewFieldDoc

func NewFieldDoc(doc int, score float64) *BaseFieldDoc

NewFieldDoc Expert: Creates one of these objects with empty sort information.

func NewFieldDocV1

func NewFieldDocV1(doc int, score float64, fields []any) *BaseFieldDoc

NewFieldDocV1 Expert: Creates one of these objects with the given sort information.

func NewFieldDocV2

func NewFieldDocV2(doc int, score float64, fields []any, shardIndex int) *BaseFieldDoc

NewFieldDocV2 Expert: Creates one of these objects with the given sort information.

func (BaseFieldDoc) GetDoc

func (s BaseFieldDoc) GetDoc() int

func (*BaseFieldDoc) GetFields

func (f *BaseFieldDoc) GetFields() []any

func (BaseFieldDoc) GetScore

func (s BaseFieldDoc) GetScore() float64

func (BaseFieldDoc) GetShardIndex

func (s BaseFieldDoc) GetShardIndex() int

func (BaseFieldDoc) SetDoc

func (s BaseFieldDoc) SetDoc(doc int)

func (*BaseFieldDoc) SetFields

func (f *BaseFieldDoc) SetFields(fields []any)

func (BaseFieldDoc) SetScore

func (s BaseFieldDoc) SetScore(score float64)

func (BaseFieldDoc) SetShardIndex

func (s BaseFieldDoc) SetShardIndex(shardIndex int)

type BaseScorable

type BaseScorable struct {
}

func (*BaseScorable) GetChildren

func (*BaseScorable) GetChildren() ([]ChildScorable, error)

func (*BaseScorable) SetMinCompetitiveScore

func (*BaseScorable) SetMinCompetitiveScore(minScore float64) error

func (*BaseScorable) SmoothingScore

func (*BaseScorable) SmoothingScore(docId int) (float64, error)

type BaseScorer

type BaseScorer struct {
	*BaseScorable
	// contains filtered or unexported fields
}

func NewScorer

func NewScorer(weight Weight) *BaseScorer

func (*BaseScorer) AdvanceShallow

func (s *BaseScorer) AdvanceShallow(target int) (int, error)

func (*BaseScorer) GetWeight

func (s *BaseScorer) GetWeight() Weight

func (*BaseScorer) TwoPhaseIterator

func (s *BaseScorer) TwoPhaseIterator() TwoPhaseIterator

type BaseSimpleCollector

type BaseSimpleCollector struct {
	SimpleCollectorSPI
	// contains filtered or unexported fields
}

func NewSimpleCollector

func NewSimpleCollector(spi SimpleCollectorSPI) *BaseSimpleCollector

func (BaseSimpleCollector) CompetitiveIterator

func (BaseSimpleCollector) CompetitiveIterator() (types.DocIdSetIterator, error)

func (*BaseSimpleCollector) GetLeafCollector

func (s *BaseSimpleCollector) GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error)

type BaseTopDocs

type BaseTopDocs struct {
	// contains filtered or unexported fields
}

BaseTopDocs Represents hits returned by IndexSearcher.search(Query, int).

func NewTopDocs

func NewTopDocs(totalHits *TotalHits, scoreDocs []ScoreDoc) *BaseTopDocs

NewTopDocs Constructs a TopDocs.

func (*BaseTopDocs) GetScoreDocs

func (t *BaseTopDocs) GetScoreDocs() []ScoreDoc

func (*BaseTopDocs) GetTotalHits

func (t *BaseTopDocs) GetTotalHits() *TotalHits

type BaseTopScoreDocCollector

type BaseTopScoreDocCollector struct {
	*TopDocsCollectorDefault[ScoreDoc]
	// contains filtered or unexported fields
}

type BaseWeight

type BaseWeight struct {
	// contains filtered or unexported fields
}

func NewBaseWeight

func NewBaseWeight(parentQuery Query, scorer WeightScorer) *BaseWeight

func (*BaseWeight) BulkScorer

func (r *BaseWeight) BulkScorer(ctx index.LeafReaderContext) (BulkScorer, error)

func (*BaseWeight) GetQuery

func (r *BaseWeight) GetQuery() Query

func (*BaseWeight) Matches

func (r *BaseWeight) Matches(ctx index.LeafReaderContext, doc int) (Matches, error)

func (*BaseWeight) ScorerSupplier

func (r *BaseWeight) ScorerSupplier(ctx index.LeafReaderContext) (ScorerSupplier, error)

type BitDocIdSet

type BitDocIdSet struct {
	// contains filtered or unexported fields
}

func NewBitDocIdSet

func NewBitDocIdSet(set *bitset.BitSet, cost int64) *BitDocIdSet

func (BitDocIdSet) Bits

func (b BitDocIdSet) Bits() util.Bits

func (BitDocIdSet) Iterator

func (b BitDocIdSet) Iterator() types.DocIdSetIterator

type BitSetConjunctionDISI

type BitSetConjunctionDISI struct {
	// contains filtered or unexported fields
}

func (*BitSetConjunctionDISI) Advance

func (b *BitSetConjunctionDISI) Advance(target int) (int, error)

func (*BitSetConjunctionDISI) Cost

func (b *BitSetConjunctionDISI) Cost() int64

func (*BitSetConjunctionDISI) DocID

func (b *BitSetConjunctionDISI) DocID() int

func (*BitSetConjunctionDISI) NextDoc

func (b *BitSetConjunctionDISI) NextDoc() (int, error)

func (*BitSetConjunctionDISI) SlowAdvance

func (b *BitSetConjunctionDISI) SlowAdvance(target int) (int, error)

type BlockMaxConjunctionScorer

type BlockMaxConjunctionScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

func NewBlockMaxConjunctionScorer

func NewBlockMaxConjunctionScorer(weight Weight, scorersList []Scorer) (*BlockMaxConjunctionScorer, error)

func (*BlockMaxConjunctionScorer) DocID

func (b *BlockMaxConjunctionScorer) DocID() int

func (*BlockMaxConjunctionScorer) GetChildren

func (b *BlockMaxConjunctionScorer) GetChildren() ([]ChildScorable, error)

func (*BlockMaxConjunctionScorer) GetMaxScore

func (b *BlockMaxConjunctionScorer) GetMaxScore(upTo int) (float64, error)

func (*BlockMaxConjunctionScorer) Iterator

func (*BlockMaxConjunctionScorer) Score

func (b *BlockMaxConjunctionScorer) Score() (float64, error)

func (*BlockMaxConjunctionScorer) SetMinCompetitiveScore

func (b *BlockMaxConjunctionScorer) SetMinCompetitiveScore(score float64) error

func (*BlockMaxConjunctionScorer) TwoPhaseIterator

func (b *BlockMaxConjunctionScorer) TwoPhaseIterator() TwoPhaseIterator

type BlockMaxDISI

type BlockMaxDISI struct {
}

func (*BlockMaxDISI) Advance

func (b *BlockMaxDISI) Advance(target int) (int, error)

func (*BlockMaxDISI) Cost

func (b *BlockMaxDISI) Cost() int64

func (*BlockMaxDISI) DocID

func (b *BlockMaxDISI) DocID() int

func (*BlockMaxDISI) NextDoc

func (b *BlockMaxDISI) NextDoc() (int, error)

func (*BlockMaxDISI) SlowAdvance

func (b *BlockMaxDISI) SlowAdvance(target int) (int, error)

type Boolean2ScorerSupplier

type Boolean2ScorerSupplier struct {
	// contains filtered or unexported fields
}

func NewBoolean2ScorerSupplier

func NewBoolean2ScorerSupplier(weight Weight, subs map[Occur][]ScorerSupplier,
	scoreMode ScoreMode, minShouldMatch int) (*Boolean2ScorerSupplier, error)

func (*Boolean2ScorerSupplier) Cost

func (b *Boolean2ScorerSupplier) Cost() int64

func (*Boolean2ScorerSupplier) Get

func (b *Boolean2ScorerSupplier) Get(leadCost int64) (Scorer, error)

type BooleanClause

type BooleanClause struct {
	// contains filtered or unexported fields
}

BooleanClause A clause in a BooleanQuery.

func NewBooleanClause

func NewBooleanClause(query Query, occur Occur) *BooleanClause

func (*BooleanClause) GetOccur

func (b *BooleanClause) GetOccur() Occur

func (*BooleanClause) GetQuery

func (b *BooleanClause) GetQuery() Query

func (*BooleanClause) IsProhibited

func (b *BooleanClause) IsProhibited() bool

func (*BooleanClause) IsRequired

func (b *BooleanClause) IsRequired() bool

func (*BooleanClause) IsScoring

func (b *BooleanClause) IsScoring() bool

func (*BooleanClause) String

func (b *BooleanClause) String() string

type BooleanQuery

type BooleanQuery struct {
	// contains filtered or unexported fields
}

BooleanQuery A Query that matches documents matching boolean combinations of other queries, e.g. TermQuerys, PhraseQuerys or other BooleanQuerys.

func (*BooleanQuery) Clauses

func (b *BooleanQuery) Clauses() []*BooleanClause

Clauses Return a list of the clauses of this BooleanQuery.

func (*BooleanQuery) CreateWeight

func (b *BooleanQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*BooleanQuery) GetClauses

func (b *BooleanQuery) GetClauses(occur Occur) []Query

GetClauses Return the collection of queries for the given BooleanClause.Occur.

func (*BooleanQuery) GetMinimumNumberShouldMatch

func (b *BooleanQuery) GetMinimumNumberShouldMatch() int

GetMinimumNumberShouldMatch Gets the minimum number of the optional BooleanClauses which must be satisfied.

func (*BooleanQuery) Iterator

func (b *BooleanQuery) Iterator() arraylist.Iterator[*BooleanClause]

func (*BooleanQuery) Rewrite

func (b *BooleanQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*BooleanQuery) String

func (b *BooleanQuery) String(field string) string

func (*BooleanQuery) Visit

func (b *BooleanQuery) Visit(visitor QueryVisitor) error

type BooleanQueryBuilder

type BooleanQueryBuilder struct {
	// contains filtered or unexported fields
}

BooleanQueryBuilder A builder for boolean queries.

func NewBooleanQueryBuilder

func NewBooleanQueryBuilder() *BooleanQueryBuilder

func (*BooleanQueryBuilder) Add

Add a new clause to this BooleanQueryBuilder. Note that the order in which clauses are added does not have any impact on matching documents or query performance. Throws: BooleanQuery.TooManyClauses – if the new number of clauses exceeds the maximum clause number

func (*BooleanQueryBuilder) AddQuery

func (b *BooleanQueryBuilder) AddQuery(query Query, occur Occur) *BooleanQueryBuilder

AddQuery adds a new clause to this BooleanQueryBuilder. Note that the order in which clauses are added does not have any impact on matching documents or query performance. Throws: BooleanQuery.TooManyClauses – if the new number of clauses exceeds the maximum clause number

func (*BooleanQueryBuilder) Build

func (b *BooleanQueryBuilder) Build() (*BooleanQuery, error)

func (*BooleanQueryBuilder) SetMinimumNumberShouldMatch

func (b *BooleanQueryBuilder) SetMinimumNumberShouldMatch(min int) *BooleanQueryBuilder

SetMinimumNumberShouldMatch Specifies a minimum number of the optional BooleanClauses which must be satisfied. By default no optional clauses are necessary for a match (unless there are no required clauses). If this method is used, then the specified number of clauses is required. Use of this method is totally independent of specifying that any specific clauses are required (or prohibited). This number will only be compared against the number of matching optional clauses. Params: min – the number of optional clauses that must match
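A short sketch: with three optional clauses and a minimum of two, a document matching only one OccurShould clause is rejected (q1, q2, q3 stand for any Query values):

```
builder := NewBooleanQueryBuilder()
builder.AddQuery(q1, OccurShould)
builder.AddQuery(q2, OccurShould)
builder.AddQuery(q3, OccurShould)
builder.SetMinimumNumberShouldMatch(2) // at least two of the three must match
query, err := builder.Build()
```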

type BooleanWeight

type BooleanWeight struct {
	*BaseWeight
	// contains filtered or unexported fields
}

BooleanWeight Expert: the Weight for BooleanQuery, used to normalize, score and explain these queries.

func NewBooleanWeight

func NewBooleanWeight(query *BooleanQuery, searcher *IndexSearcher,
	scoreMode ScoreMode, boost float64) (*BooleanWeight, error)

func (*BooleanWeight) BulkScorer

func (b *BooleanWeight) BulkScorer(context index.LeafReaderContext) (BulkScorer, error)

func (*BooleanWeight) Explain

func (b *BooleanWeight) Explain(ctx index.LeafReaderContext, doc int) (*types.Explanation, error)

func (*BooleanWeight) ExtractTerms

func (b *BooleanWeight) ExtractTerms(terms *treeset.Set[*index.Term]) error

func (*BooleanWeight) IsCacheable

func (b *BooleanWeight) IsCacheable(ctx index.LeafReaderContext) bool

func (*BooleanWeight) Matches

func (b *BooleanWeight) Matches(context index.LeafReaderContext, doc int) (Matches, error)

func (*BooleanWeight) Scorer

func (b *BooleanWeight) Scorer(ctx index.LeafReaderContext) (Scorer, error)

func (*BooleanWeight) ScorerSupplier

func (b *BooleanWeight) ScorerSupplier(context index.LeafReaderContext) (ScorerSupplier, error)

type BoostQuery

type BoostQuery struct {
	// contains filtered or unexported fields
}

BoostQuery A Query wrapper that allows giving a boost to the wrapped query. Boost values less than one give this query less importance relative to other clauses, while values greater than one give more importance to the scores returned by this query. More complex boosts can be applied by using FunctionScoreQuery in the lucene-queries module.

func NewBoostQuery

func NewBoostQuery(query Query, boost float64) (*BoostQuery, error)
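A one-line sketch, where inner stands for any scoring Query:

```
boosted, err := NewBoostQuery(inner, 2.0) // scores produced by inner are scaled by 2
```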

func (*BoostQuery) CreateWeight

func (b *BoostQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*BoostQuery) GetBoost

func (b *BoostQuery) GetBoost() float64

func (*BoostQuery) GetQuery

func (b *BoostQuery) GetQuery() Query

func (*BoostQuery) Rewrite

func (b *BoostQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*BoostQuery) String

func (b *BoostQuery) String(field string) string

func (*BoostQuery) Visit

func (b *BoostQuery) Visit(visitor QueryVisitor) error

type Buffer

type Buffer struct {
	// contains filtered or unexported fields
}

func NewBuffer

func NewBuffer(array []int, length int) *Buffer

func NewBufferBySize

func NewBufferBySize(size int) *Buffer

type BufferAdder

type BufferAdder struct {
	// contains filtered or unexported fields
}

func NewBufferAdder

func NewBufferAdder(buffer *Buffer) *BufferAdder

func (*BufferAdder) Add

func (b *BufferAdder) Add(doc int)

type BulkAdder

type BulkAdder interface {
	Add(doc int)
}

BulkAdder Utility class to efficiently add many docs in one go. See Also: grow

type BulkScorer

type BulkScorer interface {
	// Score Scores and collects all matching documents.
	// Params: 	collector – The collector to which all matching documents are passed.
	//			acceptDocs – Bits that represents the allowed documents to match, or null if they are all allowed to match.
	Score(collector LeafCollector, acceptDocs util.Bits) error

	// ScoreRange
	// Params:
	// 		collector – The collector to which all matching documents are passed.
	// 		acceptDocs – Bits that represents the allowed documents to match, or null if they are all allowed to match.
	// 		min – Score starting at, including, this document
	// 		max – Score up to, but not including, this doc
	// Returns: an under-estimation of the next matching doc after max
	ScoreRange(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)

	// Cost Same as DocIdSetIterator.cost() for bulk scorers.
	Cost() int64
}

BulkScorer This class is used to Score a range of documents at once, and is returned by Weight.bulkScorer. Only queries that have a more optimized means of scoring across a range of documents need to override this. Otherwise, a default implementation is wrapped around the Scorer returned by Weight.scorer.


type BulkScorerAnon

type BulkScorerAnon struct {
	FnScore      func(collector LeafCollector, acceptDocs util.Bits) error
	FnScoreRange func(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)
	FnCost       func() int64
}

func (*BulkScorerAnon) Cost

func (b *BulkScorerAnon) Cost() int64

func (*BulkScorerAnon) Score

func (b *BulkScorerAnon) Score(collector LeafCollector, acceptDocs util.Bits) error

func (*BulkScorerAnon) ScoreRange

func (b *BulkScorerAnon) ScoreRange(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)

type BulkScorerSPI

type BulkScorerSPI interface {
	ScoreRange(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)
	Cost() int64
}

type ChildScorable

type ChildScorable struct {

	// Child Scorer. (note this is typically a direct child, and may itself also have children).
	Child Scorable

	// An arbitrary string relating this scorer to the parent.
	Relationship string
}

ChildScorable A child Scorer and its relationship to its parent. The meaning of the relationship depends upon the parent query.

func NewChildScorable

func NewChildScorable(child Scorable, relationship string) *ChildScorable

type Collector

type Collector interface {

	// GetLeafCollector
	// Create a new collector to collect the given context.
	// readerContext: next atomic reader context
	// Lucene calls this method after finishing each segment to obtain the LeafCollector for the next segment.
	GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error)

	// ScoreMode
	// Indicates what features are required from the scorer.
	ScoreMode() ScoreMode
}

Collector

Expert: Collectors are primarily meant to be used to gather raw results from a search, and implement sorting or custom result filtering, collation, etc. Lucene's core collectors are derived from Collector and SimpleCollector. Likely your application can use one of these classes, or subclass TopDocsCollector, instead of implementing Collector directly:

  • TopDocsCollector is an abstract base class that assumes you will retrieve the top N docs, according to some criteria, after collection is done.
  • TopScoreDocCollector is a concrete subclass of TopDocsCollector and sorts according to score + docID. This is used internally by the IndexSearcher search methods that do not take an explicit Sort. It is likely the most frequently used collector.
  • TopFieldCollector subclasses TopDocsCollector and sorts according to a specified Sort object (sort by field). This is used internally by the IndexSearcher search methods that take an explicit Sort.

TimeLimitingCollector wraps any other Collector and aborts the search if it has taken too much time. PositiveScoresOnlyCollector wraps any other Collector and prevents collection of hits whose score is <= 0.0.

type CollectorManager

type CollectorManager interface {
	NewCollector() (Collector, error)
	Reduce(collectors []Collector) (any, error)
}

CollectorManager A manager of collectors. This class is useful to parallelize execution of search requests and has two main methods:

  • NewCollector() which must return a NEW collector which will be used to collect a certain set of leaves.
  • Reduce(Collection) which will be used to reduce the results of individual collections into a meaningful result. This method is only called after all leaves have been fully collected.

See Also: IndexSearcher.search(Query, CollectorManager) lucene.experimental
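A minimal sketch of a manager that counts hits per collector and sums them in Reduce, assuming the Collector and LeafCollectorAnon types behave as documented on this page:

```
type hitCountCollector struct{ count int }

func (c *hitCountCollector) ScoreMode() ScoreMode { return COMPLETE_NO_SCORES }

func (c *hitCountCollector) GetLeafCollector(ctx context.Context, _ index.LeafReaderContext) (LeafCollector, error) {
	return &LeafCollectorAnon{
		FnSetScorer:           func(Scorable) error { return nil }, // scores are not needed
		FnCollect:             func(context.Context, int) error { c.count++; return nil },
		FnCompetitiveIterator: func() (types.DocIdSetIterator, error) { return nil, nil },
	}, nil
}

type hitCountManager struct{}

func (hitCountManager) NewCollector() (Collector, error) { return &hitCountCollector{}, nil }

func (hitCountManager) Reduce(collectors []Collector) (any, error) {
	total := 0
	for _, c := range collectors {
		total += c.(*hitCountCollector).count
	}
	return total, nil
}
```

With this in place, searcher.SearchByCollectorManager(query, hitCountManager{}) would return the summed hit count.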

type ConjunctionDISI

type ConjunctionDISI struct {
	// contains filtered or unexported fields
}

ConjunctionDISI A conjunction of DocIdSetIterators. Requires that all of its sub-iterators must be on the same document all the time. This iterates over the doc ids that are present in each given DocIdSetIterator. Public only for use in org.apache.lucene.search.spans. lucene.internal

func (*ConjunctionDISI) Advance

func (c *ConjunctionDISI) Advance(target int) (int, error)

func (*ConjunctionDISI) Cost

func (c *ConjunctionDISI) Cost() int64

func (*ConjunctionDISI) DocID

func (c *ConjunctionDISI) DocID() int

func (*ConjunctionDISI) NextDoc

func (c *ConjunctionDISI) NextDoc() (int, error)

func (*ConjunctionDISI) SlowAdvance

func (c *ConjunctionDISI) SlowAdvance(target int) (int, error)

type ConjunctionScorer

type ConjunctionScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

ConjunctionScorer A Scorer for conjunctions; note that scorers must be a subset of required.

func NewConjunctionScorer

func NewConjunctionScorer(weight Weight, scorers []Scorer, required []Scorer) (*ConjunctionScorer, error)

func (*ConjunctionScorer) DocID

func (c *ConjunctionScorer) DocID() int

func (*ConjunctionScorer) GetMaxScore

func (c *ConjunctionScorer) GetMaxScore(upTo int) (float64, error)

func (*ConjunctionScorer) Iterator

func (*ConjunctionScorer) Score

func (c *ConjunctionScorer) Score() (float64, error)

func (*ConjunctionScorer) TwoPhaseIterator

func (c *ConjunctionScorer) TwoPhaseIterator() TwoPhaseIterator

type ConjunctionTwoPhaseIterator

type ConjunctionTwoPhaseIterator struct {
	// contains filtered or unexported fields
}

func (*ConjunctionTwoPhaseIterator) Approximation

func (*ConjunctionTwoPhaseIterator) MatchCost

func (c *ConjunctionTwoPhaseIterator) MatchCost() float64

func (*ConjunctionTwoPhaseIterator) Matches

func (c *ConjunctionTwoPhaseIterator) Matches() (bool, error)

type ConstantScoreQuery

type ConstantScoreQuery struct {
	// contains filtered or unexported fields
}

ConstantScoreQuery A query that wraps another query and simply returns a constant score equal to 1 for every document that matches the query. It therefore strips off all scores and always returns 1.

func NewConstantScoreQuery

func NewConstantScoreQuery(query Query) *ConstantScoreQuery
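A one-line sketch, where inner stands for any Query:

```
csq := NewConstantScoreQuery(inner) // every matching document scores exactly 1
```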

func (*ConstantScoreQuery) CreateWeight

func (c *ConstantScoreQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*ConstantScoreQuery) GetQuery

func (c *ConstantScoreQuery) GetQuery() Query

func (*ConstantScoreQuery) Rewrite

func (c *ConstantScoreQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*ConstantScoreQuery) String

func (c *ConstantScoreQuery) String(field string) string

func (*ConstantScoreQuery) Visit

func (c *ConstantScoreQuery) Visit(visitor QueryVisitor) (err error)

type ConstantScoreScorer

type ConstantScoreScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

func NewConstantScoreScorer

func NewConstantScoreScorer(weight Weight, score float64,
	scoreMode ScoreMode, disi types.DocIdSetIterator) (*ConstantScoreScorer, error)

NewConstantScoreScorer Constructor based on a DocIdSetIterator which will be used to drive iteration. Two phase iteration will not be supported.

weight: the parent weight
score: the score to return on each document
scoreMode: the score mode
disi: the iterator that defines matching documents

func NewConstantScoreScorerV1

func NewConstantScoreScorerV1(weight Weight, score float64,
	scoreMode ScoreMode, twoPhaseIterator TwoPhaseIterator) (*ConstantScoreScorer, error)

func (*ConstantScoreScorer) DocID

func (c *ConstantScoreScorer) DocID() int

func (*ConstantScoreScorer) GetMaxScore

func (c *ConstantScoreScorer) GetMaxScore(upTo int) (float64, error)

func (*ConstantScoreScorer) Iterator

func (*ConstantScoreScorer) Score

func (c *ConstantScoreScorer) Score() (float64, error)

type ConstantScoreWeight

type ConstantScoreWeight struct {
	*BaseWeight
	// contains filtered or unexported fields
}

func NewConstantScoreWeight

func NewConstantScoreWeight(score float64, query Query, spi WeightScorer) *ConstantScoreWeight

func (*ConstantScoreWeight) Explain

func (*ConstantScoreWeight) ExtractTerms

func (c *ConstantScoreWeight) ExtractTerms(terms *treeset.Set[*index.Term]) error

func (*ConstantScoreWeight) Score

func (c *ConstantScoreWeight) Score() float64

type DefaultBulkScorer

type DefaultBulkScorer struct {
	// contains filtered or unexported fields
}

func NewDefaultBulkScorer

func NewDefaultBulkScorer(scorer Scorer) *DefaultBulkScorer

func (*DefaultBulkScorer) Cost

func (d *DefaultBulkScorer) Cost() int64

func (*DefaultBulkScorer) Score

func (d *DefaultBulkScorer) Score(collector LeafCollector, acceptDocs util.Bits) error

func (*DefaultBulkScorer) ScoreRange

func (d *DefaultBulkScorer) ScoreRange(collector LeafCollector, acceptDocs util.Bits, min, max int) (int, error)

type DisiPriorityQueue

type DisiPriorityQueue struct {
}

DisiPriorityQueue A priority queue of DocIdSetIterators that orders by current doc ID. This specialization is needed over PriorityQueue because the pluggable comparison function makes the rebalancing quite slow. lucene.internal

type DisiWrapper

type DisiWrapper struct {
	// contains filtered or unexported fields
}

DisiWrapper Wrapper used in DisiPriorityQueue. lucene.internal

type DisjunctionMatchesIterator

type DisjunctionMatchesIterator struct {
	// contains filtered or unexported fields
}

func (*DisjunctionMatchesIterator) EndOffset

func (d *DisjunctionMatchesIterator) EndOffset() (int, error)

func (*DisjunctionMatchesIterator) EndPosition

func (d *DisjunctionMatchesIterator) EndPosition() int

func (*DisjunctionMatchesIterator) GetQuery

func (d *DisjunctionMatchesIterator) GetQuery() Query

func (*DisjunctionMatchesIterator) GetSubMatches

func (d *DisjunctionMatchesIterator) GetSubMatches() (MatchesIterator, error)

func (*DisjunctionMatchesIterator) Next

func (d *DisjunctionMatchesIterator) Next() (bool, error)

func (*DisjunctionMatchesIterator) StartOffset

func (d *DisjunctionMatchesIterator) StartOffset() (int, error)

func (*DisjunctionMatchesIterator) StartPosition

func (d *DisjunctionMatchesIterator) StartPosition() int

type DisjunctionScorer

type DisjunctionScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

DisjunctionScorer Base class for Scorers that score disjunctions.

type DisjunctionSumScorer

type DisjunctionSumScorer struct {
	*DisjunctionScorer
}

DisjunctionSumScorer A Scorer for OR like queries, counterpart of ConjunctionScorer.

func (*DisjunctionSumScorer) DocID

func (d *DisjunctionSumScorer) DocID() int

func (*DisjunctionSumScorer) GetMaxScore

func (d *DisjunctionSumScorer) GetMaxScore(upTo int) (float64, error)

func (*DisjunctionSumScorer) Iterator

func (*DisjunctionSumScorer) Score

func (d *DisjunctionSumScorer) Score() (float64, error)

type DocAndScore

type DocAndScore struct {
	// contains filtered or unexported fields
}

func NewDocAndScore

func NewDocAndScore(docBase int, score float64) *DocAndScore

type DocIdSet

type DocIdSet interface {
	// Iterator
	// Provides a DocIdSetIterator to access the set. This implementation can return null if there are no docs that match.
	Iterator() types.DocIdSetIterator

	// Bits
	// TODO: somehow this class should express the cost of
	// iteration vs the cost of random access Bits; for
	// expensive Filters (e.g. distance < 1 km) we should use
	// bits() after all other Query/Filters have matched, but
	// this is the opposite of what bits() is for now
	// (down-low filtering using e.g. FixedBitSet)
	//
	// Optionally provides a Bits interface for random access to matching documents.
	// Returns: null, if this DocIdSet does not support random access. In contrast to iterator(),
	//		a return value of null does not imply that no documents match the filter! The default
	//		implementation does not provide random access, so you only need to implement this method
	//		if your DocIdSet can guarantee random access to every docid in O(1) time without external
	//		disk access (as Bits interface cannot throw IOException). This is generally true for bit
	//		sets like org.apache.lucene.util.FixedBitSet, which return itself if they are used as DocIdSet.
	Bits() util.Bits
}

A DocIdSet contains a set of doc ids. Implementing classes must only implement iterator to provide access to the set.

func GetEmptyDocIdSet

func GetEmptyDocIdSet() DocIdSet

type DocIdSetBuilder

type DocIdSetBuilder struct {
	// contains filtered or unexported fields
}

DocIdSetBuilder A builder of DocIdSets. At first it uses a sparse structure to gather documents, and then upgrades to a non-sparse bit set once enough hits match. To add documents, you first need to call grow in order to reserve space, and then call DocIdSetBuilder.BulkAdder.add(int) on the returned DocIdSetBuilder.BulkAdder. lucene.internal

func NewDocIdSetBuilder

func NewDocIdSetBuilder(maxDoc int) *DocIdSetBuilder

NewDocIdSetBuilder Create a builder that can contain doc IDs between 0 and maxDoc.

func NewDocIdSetBuilderV1

func NewDocIdSetBuilderV1(maxDoc int, terms index.Terms) (*DocIdSetBuilder, error)

NewDocIdSetBuilderV1 Create a DocIdSetBuilder instance that is optimized for accumulating docs that match the given Terms.

func NewDocIdSetBuilderV2

func NewDocIdSetBuilderV2(maxDoc int, values types.PointValues, field string) *DocIdSetBuilder

NewDocIdSetBuilderV2 Create a DocIdSetBuilder instance that is optimized for accumulating docs that match the given PointValues.

func (*DocIdSetBuilder) Add

Add the content of the provided DocIdSetIterator to this builder. NOTE: if you need to build a DocIdSet out of a single DocIdSetIterator, you should rather use RoaringDocIdSet.Builder.

func (*DocIdSetBuilder) Build

func (d *DocIdSetBuilder) Build() DocIdSet

Build a DocIdSet from the accumulated doc IDs.

func (*DocIdSetBuilder) Grow

func (d *DocIdSetBuilder) Grow(numDocs int) BulkAdder

Grow Reserve space and return a DocIdSetBuilder.BulkAdder object that can be used to add up to numDocs documents.
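A minimal sketch of the grow-then-add workflow using only the functions documented here:

```
builder := NewDocIdSetBuilder(maxDoc) // maxDoc: upper bound on doc IDs
adder := builder.Grow(3)              // reserve room for up to 3 docs
adder.Add(4)
adder.Add(17)
adder.Add(42)
set := builder.Build()
it := set.Iterator() // iterates the accumulated doc IDs
```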

type Entry

type Entry struct {
	// contains filtered or unexported fields
}

func NewEntry

func NewEntry(slot, doc int) *Entry

func (Entry) GetDoc

func (s Entry) GetDoc() int

func (Entry) GetScore

func (s Entry) GetScore() float64

func (Entry) GetShardIndex

func (s Entry) GetShardIndex() int

func (Entry) SetDoc

func (s Entry) SetDoc(doc int)

func (Entry) SetScore

func (s Entry) SetScore(score float64)

func (Entry) SetShardIndex

func (s Entry) SetShardIndex(shardIndex int)

type Executor

type Executor interface {
}

type FieldDoc

type FieldDoc interface {
	ScoreDoc

	GetFields() []any
	SetFields(fields []any)
}

FieldDoc Expert: A ScoreDoc which also contains information about how to sort the referenced document. In addition to the document number and score, this object contains an array of values for the document from the field(s) used to sort. For example, if the sort criteria was to sort by fields "a", "b" then "c", the fields object array will have three elements, corresponding respectively to the term values for the document in fields "a", "b" and "c". The class of each element in the array will be either Integer, Float or String depending on the type of values in the terms of each field.

Created: Feb 11, 2004 1:23:38 PM Since: lucene 1.4 See Also: ScoreDoc, TopFieldDocs

type FieldValueHitQueue

type FieldValueHitQueue[T ScoreDoc] interface {
	Add(element T) T
	Top() T
	Pop() (T, error)
	UpdateTop() T
	UpdateTopByNewTop(newTop T) T
	Size() int
	Clear()
	Remove(element T) bool
	Iterator() structure.Iterator[T]
	GetReverseMul() []int
	GetComparators(ctx index.LeafReaderContext) ([]index.LeafFieldComparator, error)
	GetComparatorsList() []index.FieldComparator
}

FieldValueHitQueue Expert: A hit queue for sorting hits by terms in more than one field. Since: 2.9 See Also: IndexSearcher.search(Query, int, Sort) lucene.experimental

func CreateFieldValueHitQueue

func CreateFieldValueHitQueue(fields []index.SortField, size int) FieldValueHitQueue[*Entry]

CreateFieldValueHitQueue Creates a hit queue sorted by the given list of fields. NOTE: The instances returned by this method pre-allocate a full array of length numHits.
Params:
fields – SortField array we are sorting by in priority order (highest priority first); cannot be null or empty
size – The number of hits to retain. Must be greater than zero.

type FieldValueHitQueueDefault

type FieldValueHitQueueDefault[T any] struct {
	*structure.PriorityQueue[T]
	// contains filtered or unexported fields
}

func (*FieldValueHitQueueDefault[T]) GetComparators

func (*FieldValueHitQueueDefault[T]) GetComparatorsList

func (f *FieldValueHitQueueDefault[T]) GetComparatorsList() []index.FieldComparator

func (*FieldValueHitQueueDefault[T]) GetReverseMul

func (f *FieldValueHitQueueDefault[T]) GetReverseMul() []int

type FilterLeafCollector

type FilterLeafCollector struct {
	// contains filtered or unexported fields
}

type FilterScorer

type FilterScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

A FilterScorer contains another Scorer, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality. The class FilterScorer itself simply implements all abstract methods of Scorer with versions that pass all requests to the contained scorer. Subclasses of FilterScorer may further override some of these methods and may also provide additional methods and fields.

func (*FilterScorer) DocID

func (f *FilterScorer) DocID() int

func (*FilterScorer) Iterator

func (f *FilterScorer) Iterator() types.DocIdSetIterator

func (*FilterScorer) Score

func (f *FilterScorer) Score() (float64, error)

func (*FilterScorer) TwoPhaseIterator

func (f *FilterScorer) TwoPhaseIterator() TwoPhaseIterator

type FixedBitSetAdder

type FixedBitSetAdder struct {
	// contains filtered or unexported fields
}

func NewFixedBitSetAdder

func NewFixedBitSetAdder(bitSet *bitset.BitSet) *FixedBitSetAdder

func (*FixedBitSetAdder) Add

func (f *FixedBitSetAdder) Add(doc int)

type GlobalHitsThresholdChecker

type GlobalHitsThresholdChecker struct {
	// contains filtered or unexported fields
}

GlobalHitsThresholdChecker Implementation of HitsThresholdChecker which allows global hit counting

func NewGlobalHitsThresholdChecker

func NewGlobalHitsThresholdChecker(totalHitsThreshold int) (*GlobalHitsThresholdChecker, error)

func (*GlobalHitsThresholdChecker) GetHitsThreshold

func (g *GlobalHitsThresholdChecker) GetHitsThreshold() int

func (*GlobalHitsThresholdChecker) IncrementHitCount

func (g *GlobalHitsThresholdChecker) IncrementHitCount()

func (*GlobalHitsThresholdChecker) IsThresholdReached

func (g *GlobalHitsThresholdChecker) IsThresholdReached() bool

func (*GlobalHitsThresholdChecker) ScoreMode

func (g *GlobalHitsThresholdChecker) ScoreMode() ScoreMode

type HitsThresholdChecker

type HitsThresholdChecker interface {
	IncrementHitCount()
	ScoreMode() ScoreMode
	GetHitsThreshold() int
	IsThresholdReached() bool
}

HitsThresholdChecker Used for defining custom algorithms that allow searches to terminate early

func HitsThresholdCheckerCreate

func HitsThresholdCheckerCreate(totalHitsThreshold int) (HitsThresholdChecker, error)

func HitsThresholdCheckerCreateShared

func HitsThresholdCheckerCreateShared(totalHitsThreshold int) (HitsThresholdChecker, error)

HitsThresholdCheckerCreateShared Returns a threshold checker that is based on a shared counter

type IOSupplier

type IOSupplier[T any] interface {
	Get() (T, error)
}

type ImpactsDISI

type ImpactsDISI struct {
	// contains filtered or unexported fields
}

ImpactsDISI DocIdSetIterator that skips non-competitive docs thanks to the indexed impacts. Call SetMinCompetitiveScore(float) in order to give this iterator the ability to skip low-scoring documents.

func NewImpactsDISI

func NewImpactsDISI(in types.DocIdSetIterator, impactsSource index.ImpactsSource, scorer index.SimScorer) *ImpactsDISI

func (*ImpactsDISI) Advance

func (d *ImpactsDISI) Advance(target int) (int, error)

func (*ImpactsDISI) Cost

func (d *ImpactsDISI) Cost() int64

func (*ImpactsDISI) DocID

func (d *ImpactsDISI) DocID() int

func (*ImpactsDISI) GetMaxScore

func (d *ImpactsDISI) GetMaxScore(upTo int) (float64, error)

GetMaxScore Implement the contract of Scorer.GetMaxScore(int) based on the wrapped ImpactsEnum and Scorer. See Also: Scorer.GetMaxScore(int)

func (*ImpactsDISI) NextDoc

func (d *ImpactsDISI) NextDoc() (int, error)

func (*ImpactsDISI) SlowAdvance

func (d *ImpactsDISI) SlowAdvance(target int) (int, error)

type InPlaceMergeSorter

type InPlaceMergeSorter struct {
	// contains filtered or unexported fields
}

func (InPlaceMergeSorter) Len

func (r InPlaceMergeSorter) Len() int

func (InPlaceMergeSorter) Less

func (r InPlaceMergeSorter) Less(i, j int) bool

func (InPlaceMergeSorter) Swap

func (r InPlaceMergeSorter) Swap(i, j int)

type IndexSearcher

type IndexSearcher struct {
	// contains filtered or unexported fields
}

IndexSearcher Implements search over a single Reader. Applications usually need only call the inherited search(Query, int) method. For performance reasons, if your index is unchanging, you should share a single IndexSearcher instance across multiple searches instead of creating a new one per-search. If your index has changed and you wish to see the changes reflected in searching, you should use DirectoryReader.openIfChanged(DirectoryReader) to obtain a new reader and then create a new IndexSearcher from that. Also, for low-latency turnaround it's best to use a near-real-time reader (DirectoryReader.open(IndexWriter)). Once you have a new Reader, it's relatively cheap to create a new IndexSearcher from it.

NOTE: The search and searchAfter methods are configured to only count top hits accurately up to 1,000 and may return a lower bound of the hit count if the hit count is greater than or equal to 1,000. On queries that match lots of documents, counting the number of hits may take much longer than computing the top hits, so this trade-off allows getting some minimal information about the hit count without slowing down search too much. However, the TopDocs.scoreDocs array is always accurate. If this behavior doesn't suit your needs, you should create collectors manually with either TopScoreDocCollector.create or TopFieldCollector.create and call search(Query, Collector).

NOTE: IndexSearcher instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexSearcher instance; use your own (non-Lucene) objects instead.
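A minimal usage sketch, assuming an open index.IndexReader and that the returned TopDocs and ScoreDoc values expose the getters shown by their implementations on this page:

```
searcher, err := NewIndexSearcher(reader)
if err != nil {
	// handle error
}
topDocs, err := searcher.SearchTopN(query, 10) // top 10 hits by score
if err != nil {
	// handle error
}
for _, sd := range topDocs.GetScoreDocs() {
	fmt.Println(sd.GetDoc(), sd.GetScore())
}
```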

func NewIndexSearcher

func NewIndexSearcher(r index.IndexReader) (*IndexSearcher, error)

func (*IndexSearcher) CollectionStatistics

func (r *IndexSearcher) CollectionStatistics(field string) (*types.CollectionStatistics, error)

CollectionStatistics Returns CollectionStatistics for a field, or null if the field does not exist (has no indexed terms). This can be overridden, for example, to return a field's statistics across a distributed collection.

func (*IndexSearcher) GetIndexReader

func (r *IndexSearcher) GetIndexReader() index.IndexReader

func (*IndexSearcher) GetSimilarity

func (r *IndexSearcher) GetSimilarity() index.Similarity

GetSimilarity Expert: Get the Similarity to use to compute scores. This returns the Similarity that has been set through setSimilarity(Similarity) or the default Similarity if none has been set explicitly.

func (*IndexSearcher) GetTopReaderContext

func (r *IndexSearcher) GetTopReaderContext() index.IndexReaderContext

func (*IndexSearcher) Rewrite

func (r *IndexSearcher) Rewrite(query Query) (Query, error)

func (*IndexSearcher) Search

func (r *IndexSearcher) Search(query Query, results Collector) error

func (*IndexSearcher) Search3

func (r *IndexSearcher) Search3(leaves []index.LeafReaderContext, weight Weight, collector Collector) error

func (*IndexSearcher) SearchAfter

func (r *IndexSearcher) SearchAfter(after ScoreDoc, query Query, numHits int) (TopDocs, error)

SearchAfter Finds the top n hits for query where all results are after a previous result (after). By passing the bottom result from a previous page as after, this method can be used for efficient 'deep-paging' across potentially large result sets. Throws: BooleanQuery.TooManyClauses – If a query would exceed BooleanQuery.getMaxClauseCount() clauses.
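A deep-paging sketch built on the methods documented here:

```
page1, err := searcher.SearchTopN(query, 10)
// handle err, render page1 ...
docs := page1.GetScoreDocs()
last := docs[len(docs)-1]                           // bottom hit of the previous page
page2, err := searcher.SearchAfter(last, query, 10) // hits strictly after `last`
```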

func (*IndexSearcher) SearchByCollectorManager

func (r *IndexSearcher) SearchByCollectorManager(query Query, collectorManager CollectorManager) (any, error)

SearchByCollectorManager Lower-level search API. Search all leaves using the given CollectorManager. In contrast to search(Query, Collector), this method will use the searcher's Executor in order to parallelize execution of the collection on the configured leafSlices. See Also: CollectorManager lucene.experimental

func (*IndexSearcher) SearchCollector

func (r *IndexSearcher) SearchCollector(query Query, results Collector) error

func (*IndexSearcher) SearchTopN

func (r *IndexSearcher) SearchTopN(query Query, n int) (TopDocs, error)

func (*IndexSearcher) SetQueryCache

func (r *IndexSearcher) SetQueryCache(queryCache QueryCache)

func (*IndexSearcher) SetSimilarity

func (r *IndexSearcher) SetSimilarity(similarity index.Similarity)

func (*IndexSearcher) TermStatistics

func (r *IndexSearcher) TermStatistics(term *index.Term, docFreq, totalTermFreq int) (*types.TermStatistics, error)

TermStatistics Returns TermStatistics for a term. This can be overridden, for example, to return a term's statistics across a distributed collection.
Params:
docFreq – The document frequency of the term. It must be greater than or equal to 1.
totalTermFreq – The total term frequency.
Returns: A TermStatistics (never null).

type IntArrayDocIdSet

type IntArrayDocIdSet struct {
	// contains filtered or unexported fields
}

func NewIntArrayDocIdSet

func NewIntArrayDocIdSet(docs []int) *IntArrayDocIdSet

func (*IntArrayDocIdSet) Bits

func (r *IntArrayDocIdSet) Bits() util.Bits

func (*IntArrayDocIdSet) Iterator

func (r *IntArrayDocIdSet) Iterator() types.DocIdSetIterator

type IntArrayDocIdSetIterator

type IntArrayDocIdSetIterator struct {
	// contains filtered or unexported fields
}

func NewIntArrayDocIdSetIterator

func NewIntArrayDocIdSetIterator(docs []int) *IntArrayDocIdSetIterator

func (*IntArrayDocIdSetIterator) Advance

func (r *IntArrayDocIdSetIterator) Advance(target int) (int, error)

func (*IntArrayDocIdSetIterator) Cost

func (r *IntArrayDocIdSetIterator) Cost() int64

func (*IntArrayDocIdSetIterator) DocID

func (r *IntArrayDocIdSetIterator) DocID() int

func (*IntArrayDocIdSetIterator) NextDoc

func (r *IntArrayDocIdSetIterator) NextDoc() (int, error)

func (*IntArrayDocIdSetIterator) SlowAdvance

func (r *IntArrayDocIdSetIterator) SlowAdvance(target int) (int, error)

type LRUQueryCache

type LRUQueryCache struct {
}

LRUQueryCache A QueryCache that evicts queries using an LRU (least-recently-used) eviction policy in order to remain under a given maximum size and number of bytes used. This class is thread-safe. Note that query eviction runs in linear time with the total number of segments that have cache entries, so this cache works best with caching policies that only cache on "large" segments, and it is advised not to share this cache across too many indices. A default query cache and policy instance is used in IndexSearcher. If you want to replace those defaults it is typically done like this:

```
// these cache and policy instances can be shared across several queries
// and readers; it is fine to e.g. store them in package-level variables.
// NewLRUQueryCache, NewUsageTrackingQueryCachingPolicy and
// SetQueryCachingPolicy are assumed to mirror the original Java API.
maxNumberOfCachedQueries := 256
maxRamBytesUsed := int64(50 * 1024 * 1024) // 50MB
queryCache := NewLRUQueryCache(maxNumberOfCachedQueries, maxRamBytesUsed)
defaultCachingPolicy := NewUsageTrackingQueryCachingPolicy()
indexSearcher.SetQueryCache(queryCache)
indexSearcher.SetQueryCachingPolicy(defaultCachingPolicy)
```

This cache exposes some global statistics (hit count, miss count, number of cache entries, total number of DocIdSets that have ever been cached, number of evicted entries). In case you would like to have more fine-grained statistics, such as per-index or per-query-class statistics, it is possible to override various callbacks: onHit, onMiss, onQueryCache, onQueryEviction, onDocIdSetCache, onDocIdSetEviction and onClear. It is better to not perform heavy computations in these methods though since they are called synchronously and under a lock. See Also: QueryCachingPolicy lucene.experimental

func (*LRUQueryCache) DoCache

func (c *LRUQueryCache) DoCache(weight Weight, policy QueryCachingPolicy) Weight

type LeafCollector

type LeafCollector interface {
	// SetScorer Called before successive calls to collect(int). Implementations that need the score of
	// the current document (passed-in to collect(int)), should save the passed-in Scorer and call
	// scorer.score() when needed.
	//
	// Call this method to obtain a document's score from the Scorer; when sorting the collected documents, the score can serve as one of the sort criteria.
	SetScorer(scorer Scorable) error

	// Collect Called once for every document matching a query, with the unbased document number.
	// Note: The collection of the current segment can be terminated by throwing a CollectionTerminatedException.
	// In this case, the last docs of the current org.apache.lucene.index.LeafReaderContext will be skipped
	// and IndexSearcher will swallow the exception and continue collection with the next leaf.
	// Note: This is called in an inner search loop. For good search performance, implementations of this
	// method should not call IndexSearcher.doc(int) or org.apache.lucene.index.Reader.document(int) on
	// every hit. Doing so can slow searches by an order of magnitude or more.
	//
	// This method implements the concrete logic for sorting, filtering, or any
	// user-defined operation over all documents that match the query.
	Collect(ctx context.Context, doc int) error

	// CompetitiveIterator Optionally returns an iterator over competitive documents. Collectors should
	// delegate this method to their comparators if their comparators provide the skipping functionality
	// over non-competitive docs. The default is to return null, which is interpreted as the collector
	// not providing any competitive iterator.
	CompetitiveIterator() (types.DocIdSetIterator, error)
}

LeafCollector Collector decouples the score from the collected doc: the score computation is skipped entirely if it's not needed. Collectors that do need the score should implement the setScorer method, to hold onto the passed Scorer instance, and call Scorer.score() within the collect method to compute the current hit's score. If your collector may request the score for a single hit multiple times, you should use ScoreCachingWrappingScorer.

NOTE: The doc that is passed to the collect method is relative to the current reader. If your collector needs to resolve this to the docID space of the Multi*Reader, you must re-base it by recording the docBase from the most recent setNextReader call. Here's a simple example showing how to collect docIDs into a BitSet:

```
IndexSearcher searcher = new IndexSearcher(indexReader);
final BitSet bits = new BitSet(indexReader.maxDoc());
searcher.search(query, new Collector() {

  public LeafCollector getLeafCollector(LeafReaderContext context)
      throws IOException {
    final int docBase = context.docBase;
    return new LeafCollector() {

      // ignore scorer
      public void setScorer(Scorer scorer) throws IOException {
      }

      public void collect(int doc) throws IOException {
        bits.set(docBase + doc);
      }

    };
  }

});
```
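A rough Go equivalent for this port can be written with the LeafCollectorAnon helper documented below; the DocBase accessor on index.LeafReaderContext and the bitset package are assumptions:

```
type bitSetCollector struct{ bits *bitset.BitSet }

func (c *bitSetCollector) ScoreMode() ScoreMode { return COMPLETE_NO_SCORES }

func (c *bitSetCollector) GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error) {
	docBase := readerContext.DocBase() // accessor name assumed
	return &LeafCollectorAnon{
		FnSetScorer: func(Scorable) error { return nil }, // scores are ignored
		FnCollect: func(_ context.Context, doc int) error {
			c.bits.Set(uint(docBase + doc)) // re-base to the global docID space
			return nil
		},
		FnCompetitiveIterator: func() (types.DocIdSetIterator, error) { return nil, nil },
	}, nil
}
```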

Not all collectors will need to rebase the docID. For example, a collector that simply counts the total number of hits would skip it.

type LeafCollectorAnon

type LeafCollectorAnon struct {
	FnSetScorer           func(scorer Scorable) error
	FnCollect             func(ctx context.Context, doc int) error
	FnCompetitiveIterator func() (types.DocIdSetIterator, error)
}

func (*LeafCollectorAnon) Collect

func (l *LeafCollectorAnon) Collect(ctx context.Context, doc int) error

func (*LeafCollectorAnon) CompetitiveIterator

func (l *LeafCollectorAnon) CompetitiveIterator() (types.DocIdSetIterator, error)

func (*LeafCollectorAnon) SetScorer

func (l *LeafCollectorAnon) SetScorer(scorer Scorable) error

type LeafSimScorer

type LeafSimScorer struct {
	// contains filtered or unexported fields
}

func NewLeafSimScorer

func NewLeafSimScorer(scorer index.SimScorer, reader index.LeafReader,
	field string, needsScores bool) (*LeafSimScorer, error)

NewLeafSimScorer wraps an org.apache.lucene.search.similarities.Similarity.SimScorer for use on a specific LeafReader.

func (*LeafSimScorer) Explain

func (r *LeafSimScorer) Explain(doc int, freqExp *types.Explanation) (*types.Explanation, error)

Explain the score for the provided document assuming the given term document frequency. This method must be called on non-decreasing sequences of doc ids. See Also: org.apache.lucene.search.similarities.Similarity.SimScorer.explain(Explanation, long)

func (*LeafSimScorer) GetSimScorer

func (r *LeafSimScorer) GetSimScorer() index.SimScorer

func (*LeafSimScorer) Score

func (r *LeafSimScorer) Score(doc int, freq float64) (float64, error)

type LeafSlice

type LeafSlice struct {
	Leaves []index.LeafReaderContext
}

type LocalHitsThresholdChecker

type LocalHitsThresholdChecker struct {
	// contains filtered or unexported fields
}

LocalHitsThresholdChecker Default implementation of HitsThresholdChecker to be used for single-threaded execution.

func NewLocalHitsThresholdChecker

func NewLocalHitsThresholdChecker(totalHitsThreshold int) (*LocalHitsThresholdChecker, error)

func (*LocalHitsThresholdChecker) GetHitsThreshold

func (l *LocalHitsThresholdChecker) GetHitsThreshold() int

func (*LocalHitsThresholdChecker) IncrementHitCount

func (l *LocalHitsThresholdChecker) IncrementHitCount()

func (*LocalHitsThresholdChecker) IsThresholdReached

func (l *LocalHitsThresholdChecker) IsThresholdReached() bool

func (*LocalHitsThresholdChecker) ScoreMode

func (l *LocalHitsThresholdChecker) ScoreMode() ScoreMode

type MatchAllDocsQuery

type MatchAllDocsQuery struct {
}

func NewMatchAllDocsQuery

func NewMatchAllDocsQuery() *MatchAllDocsQuery

func (*MatchAllDocsQuery) CreateWeight

func (m *MatchAllDocsQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*MatchAllDocsQuery) Rewrite

func (m *MatchAllDocsQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*MatchAllDocsQuery) String

func (m *MatchAllDocsQuery) String(field string) string

func (*MatchAllDocsQuery) Visit

func (m *MatchAllDocsQuery) Visit(visitor QueryVisitor) error

type MatchNoDocsQuery

type MatchNoDocsQuery struct {
	// contains filtered or unexported fields
}

MatchNoDocsQuery A query that matches no documents.

func NewMatchNoDocsQuery

func NewMatchNoDocsQuery(reason string) *MatchNoDocsQuery

func (*MatchNoDocsQuery) CreateWeight

func (m *MatchNoDocsQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*MatchNoDocsQuery) Rewrite

func (m *MatchNoDocsQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*MatchNoDocsQuery) String

func (m *MatchNoDocsQuery) String(field string) string

func (*MatchNoDocsQuery) Visit

func (m *MatchNoDocsQuery) Visit(visitor QueryVisitor) (err error)

type Matches

type Matches interface {
	Strings() []string

	// GetMatches
	// Returns a MatchesIterator over the matches for a single field, or null if there are no matches
	// in that field.
	GetMatches(field string) (MatchesIterator, error)

	// GetSubMatches
	// Returns a collection of Matches that make up this instance; if it is not a composite,
	// then this returns an empty list
	GetSubMatches() []Matches
}

Matches Reports the positions and optionally offsets of all matching terms in a query for a single document. To obtain a MatchesIterator for a particular field, call GetMatches(String). Note that you can call GetMatches(String) multiple times to retrieve new iterators, but it is not thread-safe.

var MATCH_WITH_NO_TERMS Matches

func MatchesForField

func MatchesForField(field string, mis IOSupplier[MatchesIterator]) Matches

MatchesForField Create a Matches for a single field

func MatchesFromSubMatches

func MatchesFromSubMatches(subMatches []Matches) (Matches, error)

MatchesFromSubMatches Amalgamate a collection of Matches into a single object

type MatchesAnon

type MatchesAnon struct {
	FnStrings       func() []string
	FnGetMatches    func(field string) (MatchesIterator, error)
	FnGetSubMatches func() []Matches
}

func (*MatchesAnon) GetMatches

func (m *MatchesAnon) GetMatches(field string) (MatchesIterator, error)

func (*MatchesAnon) GetSubMatches

func (m *MatchesAnon) GetSubMatches() []Matches

func (*MatchesAnon) Strings

func (m *MatchesAnon) Strings() []string

type MatchesIterator

type MatchesIterator interface {

	// Next
	// Advance the iterator to the next match position
	// Returns: true if matches have not been exhausted
	Next() (bool, error)

	// StartPosition
	// The start position of the current match. Should only be called after next() has returned true.
	StartPosition() int

	// EndPosition
	// The end position of the current match. Should only be called after next() has returned true.
	EndPosition() int

	// StartOffset
	// The starting offset of the current match, or -1 if offsets are not available. Should only be
	// called after next() has returned true.
	StartOffset() (int, error)

	// EndOffset
	// The ending offset of the current match, or -1 if offsets are not available. Should only be
	// called after next() has returned true.
	EndOffset() (int, error)

	// GetSubMatches
	// Returns a MatchesIterator that iterates over the positions and offsets of individual
	// terms within the current match. Returns null if there are no submatches (i.e. the current
	// iterator is at the leaf level). Should only be called after next() has returned true.
	GetSubMatches() (MatchesIterator, error)

	// GetQuery
	// Returns the Query causing the current match. If this MatchesIterator has been returned from
	// a getSubMatches() call, then returns a TermQuery equivalent to the current match. Should only
	// be called after next() has returned true.
	GetQuery() Query
}

MatchesIterator An iterator over match positions (and optionally offsets) for a single document and field. To iterate over the matches, call next() until it returns false, retrieving positions and/or offsets after each call. You should not call the position or offset methods before next() has been called, or after next() has returned false. Matches from some queries may span multiple positions. You can retrieve the positions of individual matching terms on the current match by calling getSubMatches(). Matches are ordered by start position, and then by end position. Match intervals may overlap. See Also: Weight.matches(LeafReaderContext, int)
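
A short Go sketch of this iteration contract, printing every match position for one field of one document (a nil iterator from GetMatches means no matches in that field):

func printMatches(m Matches, field string) error {
	it, err := m.GetMatches(field)
	if err != nil || it == nil {
		// nil iterator: no matches in this field
		return err
	}
	for {
		ok, err := it.Next()
		if err != nil || !ok {
			// matches exhausted (or an error occurred): stop iterating
			return err
		}
		fmt.Printf("match at positions [%d, %d]\n", it.StartPosition(), it.EndPosition())
	}
}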

func FromTermsEnumMatchesIterator

func FromTermsEnumMatchesIterator(context index.LeafReaderContext, doc int, query Query,
	field string, terms bytesref.BytesIterator) (MatchesIterator, error)

FromTermsEnumMatchesIterator Create a DisjunctionMatchesIterator over a list of terms extracted from a BytesRefIterator. Only terms that have at least one match in the given document will be included.

type MaxScoreAccumulator

type MaxScoreAccumulator struct {
	// contains filtered or unexported fields
}

MaxScoreAccumulator Maintains the maximum score and its corresponding document id concurrently

func NewMaxScoreAccumulator

func NewMaxScoreAccumulator() *MaxScoreAccumulator

func (*MaxScoreAccumulator) Accumulate

func (m *MaxScoreAccumulator) Accumulate(docBase int, score float32) error

func (*MaxScoreAccumulator) Get

func (m *MaxScoreAccumulator) Get() *DocAndScore

type MaxScoreCache

type MaxScoreCache struct {
	// contains filtered or unexported fields
}

MaxScoreCache Compute maximum scores based on Impacts and keep them in a cache in order not to run expensive similarity score computations multiple times on the same data.

func NewMaxScoreCache

func NewMaxScoreCache(impactsSource index.ImpactsSource, scorer index.SimScorer) *MaxScoreCache

func (*MaxScoreCache) GetLevel

func (c *MaxScoreCache) GetLevel(upTo int) (int, error)

GetLevel Return the first level that includes all doc IDs up to upTo, or -1 if there is no such level.

func (*MaxScoreCache) GetMaxScoreForLevel

func (c *MaxScoreCache) GetMaxScoreForLevel(level int) (float64, error)

func (*MaxScoreCache) GetSkipUpTo

func (c *MaxScoreCache) GetSkipUpTo(minScore float64) (int, error)

GetSkipUpTo Return an inclusive upper bound of documents that all have a score that is less than minScore, or -1 if the current document may be competitive.

type MaxScoreSumPropagator

type MaxScoreSumPropagator struct {
	// contains filtered or unexported fields
}

MaxScoreSumPropagator Utility class to propagate scoring information in BooleanQuery, which computes the score as the sum of the scores of its matching clauses. This helps propagate information about the maximum produced score.

GPT3.5: A boolean query may contain several sub-queries (clauses), such as "must" (required) clauses and "should" (optional) clauses. Each clause produces a relevance score describing how well a document matches it, and the document's overall score is the sum of the scores of its matching clauses.

MaxScoreSumPropagator propagates score-bound information across those clauses. Because the total score is a sum, an upper bound on each clause's score yields an upper bound on the whole query's score; conversely, given a minimum competitive score for the whole query, per-clause bounds can be derived so that sub-scorers may skip documents that cannot possibly be competitive.

In short, MaxScoreSumPropagator is a utility for propagating scoring information in boolean queries whose score is the sum of their matching clauses' scores, which enables dynamic pruning of non-competitive documents.

func NewMaxScoreSumPropagator

func NewMaxScoreSumPropagator(scorerList []Scorer) (*MaxScoreSumPropagator, error)

func (*MaxScoreSumPropagator) SetMinCompetitiveScore

func (m *MaxScoreSumPropagator) SetMinCompetitiveScore(minScore float64) error

type MultiComparatorLeafCollector

type MultiComparatorLeafCollector struct {
	// contains filtered or unexported fields
}

func NewMultiComparatorLeafCollector

func NewMultiComparatorLeafCollector(comparators []index.LeafFieldComparator, reverseMul []int) *MultiComparatorLeafCollector

func (*MultiComparatorLeafCollector) SetScorer

func (c *MultiComparatorLeafCollector) SetScorer(scorer Scorable) error

type MultiComparatorsFieldValueHitQueue

type MultiComparatorsFieldValueHitQueue struct {
	*FieldValueHitQueueDefault[*Entry]
}

func NewMultiComparatorsFieldValueHitQueue

func NewMultiComparatorsFieldValueHitQueue(fields []index.SortField, size int) *MultiComparatorsFieldValueHitQueue

func (*MultiComparatorsFieldValueHitQueue) Less

func (m *MultiComparatorsFieldValueHitQueue) Less(hitA, hitB *Entry) bool

type MultiLeafFieldComparator

type MultiLeafFieldComparator struct {
	// contains filtered or unexported fields
}

func NewMultiLeafFieldComparator

func NewMultiLeafFieldComparator(comparators []index.LeafFieldComparator, reverseMul []int) *MultiLeafFieldComparator

func (*MultiLeafFieldComparator) CompareBottom

func (m *MultiLeafFieldComparator) CompareBottom(doc int) (int, error)

func (*MultiLeafFieldComparator) CompareTop

func (m *MultiLeafFieldComparator) CompareTop(doc int) (int, error)

func (*MultiLeafFieldComparator) CompetitiveIterator

func (m *MultiLeafFieldComparator) CompetitiveIterator() (types.DocIdSetIterator, error)

func (*MultiLeafFieldComparator) Copy

func (m *MultiLeafFieldComparator) Copy(slot, doc int) error

func (*MultiLeafFieldComparator) SetBottom

func (m *MultiLeafFieldComparator) SetBottom(slot int) error

func (*MultiLeafFieldComparator) SetHitsThresholdReached

func (m *MultiLeafFieldComparator) SetHitsThresholdReached() error

func (*MultiLeafFieldComparator) SetScorer

func (m *MultiLeafFieldComparator) SetScorer(scorer index.Scorable) error

type MultiTermQuery

type MultiTermQuery interface {
	Query

	// GetField
	// Returns the field name for this query
	GetField() string

	// GetTermsEnum
	// Construct the enumeration to be used, expanding the pattern term.
	// This method should only be called if the field exists
	// (ie, implementations can assume the field does exist).
	// This method should not return null (should instead return TermsEnum.EMPTY if no terms match).
	// The TermsEnum must already be positioned to the first matching term.
	// The given AttributeSource is passed by the MultiTermQuery.RewriteMethod to
	// share information between segments, for example TopTermsRewrite uses it to
	// share maximum competitive boosts
	GetTermsEnum(terms index.Terms, atts *attribute.Source) (index.TermsEnum, error)

	// GetRewriteMethod
	// See Also: setRewriteMethod
	GetRewriteMethod() RewriteMethod

	// SetRewriteMethod
	// Sets the rewrite method to be used when executing the query. You can use one of the four core methods,
	// or implement your own subclass of MultiTermQuery.RewriteMethod.
	SetRewriteMethod(method RewriteMethod)
}

MultiTermQuery An abstract Query that matches documents containing a subset of terms provided by a FilteredTermsEnum enumeration. This query cannot be used directly; you must subclass it and define getTermsEnum(Terms, AttributeSource) to provide a FilteredTermsEnum that iterates through the terms to be matched.

NOTE: if setRewriteMethod is either CONSTANT_SCORE_BOOLEAN_REWRITE or SCORING_BOOLEAN_REWRITE, you may encounter a BooleanQuery.TooManyClauses exception during searching, which happens when the number of terms to be searched exceeds BooleanQuery.getMaxClauseCount(). Setting setRewriteMethod to ConstantScoreRewrite prevents this.

The recommended rewrite method is ConstantScoreRewrite: it doesn't spend CPU computing unhelpful scores, and is the most performant rewrite method given the query. If you need scoring (like FuzzyQuery), use MultiTermQuery.TopTermsScoringBooleanQueryRewrite, which uses a priority queue to only collect competitive terms and not hit this limitation. Note that org.apache.lucene.queryparser.classic.QueryParser produces MultiTermQueries using ConstantScoreRewrite by default.
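
As a sketch, choosing a rewrite method before executing such a query might look like the following; the CONSTANT_SCORE_REWRITE value is assumed to be exported by this package under its Lucene name, so substitute whatever RewriteMethod the package actually provides:

func rewriteConstantScore(reader index.IndexReader, mtq MultiTermQuery) (Query, error) {
	// Constant-score rewriting avoids BooleanQuery.TooManyClauses by not
	// expanding the pattern into a scoring BooleanQuery.
	mtq.SetRewriteMethod(CONSTANT_SCORE_REWRITE) // assumed name, see above
	return mtq.Rewrite(reader)
}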

type MultiTermQueryConstantScoreWrapper

type MultiTermQueryConstantScoreWrapper struct {
	// contains filtered or unexported fields
}

func (*MultiTermQueryConstantScoreWrapper) CreateWeight

func (m *MultiTermQueryConstantScoreWrapper) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*MultiTermQueryConstantScoreWrapper) GetField

GetField Returns the field name for this query

func (*MultiTermQueryConstantScoreWrapper) GetQuery

func (*MultiTermQueryConstantScoreWrapper) Rewrite

func (*MultiTermQueryConstantScoreWrapper) String

func (*MultiTermQueryConstantScoreWrapper) Visit

func (m *MultiTermQueryConstantScoreWrapper) Visit(visitor QueryVisitor) (err error)

type MultiTermQueryPlus

type MultiTermQueryPlus interface {
}

type NamedMatches

type NamedMatches struct {
	// contains filtered or unexported fields
}

NamedMatches Utility class to help extract the set of sub queries that have matched from a larger query. Individual subqueries may be wrapped using wrapQuery(String, Query), and the matching queries for a particular document can then be pulled from the parent Query's Matches object by calling findNamedMatches(Matches)

func NewNamedMatches

func NewNamedMatches(in Matches, name string) *NamedMatches

func (*NamedMatches) GetMatches

func (n *NamedMatches) GetMatches(field string) (MatchesIterator, error)

func (*NamedMatches) GetName

func (n *NamedMatches) GetName() string

func (*NamedMatches) GetSubMatches

func (n *NamedMatches) GetSubMatches() []Matches

func (*NamedMatches) Strings

func (n *NamedMatches) Strings() []string

type Occur

type Occur string

Occur Specifies how clauses are to occur in matching documents.

func OccurValues

func OccurValues() []Occur

func (Occur) String

func (o Occur) String() string

type OneComparatorFieldValueHitQueue

type OneComparatorFieldValueHitQueue struct {
	*FieldValueHitQueueDefault[*Entry]
	// contains filtered or unexported fields
}

func NewOneComparatorFieldValueHitQueue

func NewOneComparatorFieldValueHitQueue(fields []index.SortField, size int) *OneComparatorFieldValueHitQueue

func (*OneComparatorFieldValueHitQueue) Less

func (o *OneComparatorFieldValueHitQueue) Less(hitA, hitB *Entry) bool

type PagingFieldCollector

type PagingFieldCollector struct {
	*TopFieldCollector
	*TopDocsCollectorDefault[*Entry]
	// contains filtered or unexported fields
}

func NewPagingFieldCollector

func NewPagingFieldCollector(sort *index.Sort, queue FieldValueHitQueue[*Entry], after FieldDoc, numHits int,
	hitsThresholdChecker HitsThresholdChecker, minScoreAcc *MaxScoreAccumulator) (*PagingFieldCollector, error)

func (*PagingFieldCollector) GetLeafCollector

func (p *PagingFieldCollector) GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error)

type PagingTopScoreDocCollector

type PagingTopScoreDocCollector struct {
	*BaseTopScoreDocCollector
	// contains filtered or unexported fields
}

func (*PagingTopScoreDocCollector) GetLeafCollector

func (p *PagingTopScoreDocCollector) GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error)

func (*PagingTopScoreDocCollector) NewTopDocs

func (p *PagingTopScoreDocCollector) NewTopDocs(results []ScoreDoc, howMany int) (TopDocs, error)

func (*PagingTopScoreDocCollector) ScoreMode

func (p *PagingTopScoreDocCollector) ScoreMode() ScoreMode

func (*PagingTopScoreDocCollector) TopDocsSize

func (p *PagingTopScoreDocCollector) TopDocsSize() int

type PointInSetQuery

type PointInSetQuery struct {
	// contains filtered or unexported fields
}

type PointRangeQuery

type PointRangeQuery struct {
	// contains filtered or unexported fields
}

func NewPointRangeQuery

func NewPointRangeQuery(field string, lowerPoint []byte, upperPoint []byte, numDims int) (*PointRangeQuery, error)

func (*PointRangeQuery) CreateWeight

func (p *PointRangeQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*PointRangeQuery) Rewrite

func (p *PointRangeQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*PointRangeQuery) String

func (p *PointRangeQuery) String(field string) string

func (*PointRangeQuery) Visit

func (p *PointRangeQuery) Visit(visitor QueryVisitor) (err error)

type PrefixQuery

type PrefixQuery struct {
}

type Query

type Query interface {

	// CreateWeight
	// Expert: Constructs an appropriate Weight implementation for this query.
	// Only implemented by primitive queries, which re-write to themselves.
	// scoreMode: How the produced scorers will be consumed.
	// boost: The boost that is propagated by the parent queries.
	CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

	// Rewrite
	// Expert: called to re-write queries into primitive queries. For example, a PrefixQuery will be
	// rewritten into a BooleanQuery that consists of TermQuerys.
	Rewrite(reader index.IndexReader) (Query, error)

	// Visit
	// Recurse through the query tree, visiting any child queries
	// visitor: a QueryVisitor to be called by each query in the tree
	Visit(visitor QueryVisitor) error

	// String
	// Convert a query to a string, with field assumed to be the default field and omitted.
	String(field string) string
}

Query The abstract base class for queries. Instantiable subclasses are:

TermQuery
BooleanQuery
WildcardQuery
PhraseQuery
PrefixQuery
MultiPhraseQuery
FuzzyQuery
RegexpQuery
TermRangeQuery
PointRangeQuery
ConstantScoreQuery
DisjunctionMaxQuery
MatchAllDocsQuery

See also the family of Span Queries and additional queries available in the Queries module.
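
A minimal sketch of the expert flow from a Query through its Weight to a Scorer on a single leaf; the NextDoc iteration and the types.NO_MORE_DOCS sentinel name are assumptions about the types package:

func scoreLeaf(searcher *IndexSearcher, leaf index.LeafReaderContext) error {
	q := NewMatchAllDocsQuery()
	w, err := q.CreateWeight(searcher, COMPLETE, 1.0) // boost of 1.0
	if err != nil {
		return err
	}
	s, err := w.Scorer(leaf)
	if err != nil || s == nil {
		// a nil Scorer means no documents match in this leaf
		return err
	}
	it := s.Iterator()
	for {
		doc, err := it.NextDoc()
		if err != nil || doc == types.NO_MORE_DOCS { // assumed sentinel name
			return err
		}
		if _, err := s.Score(); err != nil { // score of the current doc
			return err
		}
	}
}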

type QueryCache

type QueryCache interface {

	// DoCache
	// Return a wrapper around the provided weight that will cache matching docs per-segment accordingly to
	// the given policy. NOTE: The returned weight will only be equivalent if scores are not needed.
	// See Also: Collector.scoreMode()
	DoCache(weight Weight, policy QueryCachingPolicy) Weight
}

QueryCache A cache for queries. See Also: LRUQueryCache

type QueryCachingPolicy

type QueryCachingPolicy interface {
	// OnUse
	// Callback that is called every time that a cached filter is used. This is typically useful if the
	// policy wants to track usage statistics in order to make decisions.
	OnUse(query Query)

	// ShouldCache
	// Whether the given Query is worth caching. This method will be called by the QueryCache to
	// know whether to cache. It will first attempt to load a DocIdSet from the cache. If it is not cached yet
	// and this method returns true then a cache entry will be generated. Otherwise an uncached scorer will be returned.
	ShouldCache(query Query) (bool, error)
}

QueryCachingPolicy A policy defining which filters should be cached. Implementations of this class must be thread-safe. See Also: UsageTrackingQueryCachingPolicy, LRUQueryCache
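
For illustration, a trivial policy that caches every filter could be sketched as:

type cacheEverythingPolicy struct{}

// OnUse ignores usage statistics entirely.
func (cacheEverythingPolicy) OnUse(query Query) {}

// ShouldCache unconditionally tells the QueryCache to cache the filter.
func (cacheEverythingPolicy) ShouldCache(query Query) (bool, error) {
	return true, nil
}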

type QueryVisitor

type QueryVisitor interface {

	// ConsumeTerms
	// Called by leaf queries that match on specific terms
	// query: the leaf query
	// terms: the terms the query will match on
	ConsumeTerms(query Query, terms ...*index.Term)

	// ConsumeTermsMatching
	// Called by leaf queries that match on a class of terms
	// query: the leaf query
	// field: the field queried against
	// automaton: a supplier for an automaton defining which terms match
	ConsumeTermsMatching(query Query, field string, automaton func() *automaton.ByteRunAutomaton)

	// VisitLeaf
	// Called by leaf queries that do not match on terms
	// query: the query
	VisitLeaf(query Query) (err error)

	// AcceptField
	// Whether or not terms from this field are of interest to the visitor Implement this to
	// avoid collecting terms from heavy queries such as TermInSetQuery that are not running
	// on fields of interest
	AcceptField(field string) bool

	// GetSubVisitor
	// Pulls a visitor instance for visiting child clauses of a query The default implementation
	// returns this, unless occur is equal to BooleanClause.Occur.OccurMustNot in which case it
	// returns EMPTY_VISITOR
	// occur: the relationship between the parent and its children
	// parent: the query visited
	GetSubVisitor(occur Occur, parent Query) QueryVisitor
}

QueryVisitor Allows recursion through a query tree See Also: Query.visit(QueryVisitor)
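
As a sketch, a visitor that gathers every term a query tree matches on; a fuller version would return an empty visitor for OccurMustNot branches, mirroring the default behavior described above:

type termCollector struct {
	terms []*index.Term
}

func (t *termCollector) ConsumeTerms(query Query, terms ...*index.Term) {
	t.terms = append(t.terms, terms...)
}

func (t *termCollector) ConsumeTermsMatching(query Query, field string,
	auto func() *automaton.ByteRunAutomaton) {
	// Automaton-backed queries are ignored in this sketch.
}

func (t *termCollector) VisitLeaf(query Query) error { return nil }

func (t *termCollector) AcceptField(field string) bool { return true }

func (t *termCollector) GetSubVisitor(occur Occur, parent Query) QueryVisitor {
	return t // recurse into every clause, including OccurMustNot
}

Visiting a query then reads: tc := &termCollector{}; err := q.Visit(tc).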

type ReqExclBulkScorer

type ReqExclBulkScorer struct {
	// contains filtered or unexported fields
}

func (*ReqExclBulkScorer) Cost

func (r *ReqExclBulkScorer) Cost() int64

func (*ReqExclBulkScorer) Score

func (r *ReqExclBulkScorer) Score(collector LeafCollector, acceptDocs util.Bits) error

func (*ReqExclBulkScorer) ScoreRange

func (r *ReqExclBulkScorer) ScoreRange(collector LeafCollector, acceptDocs util.Bits, minDoc, maxDoc int) (int, error)

type ReqExclScorer

type ReqExclScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

ReqExclScorer A Scorer for queries with a required subscorer and an excluding (prohibited) sub Scorer.

func NewReqExclScorer

func NewReqExclScorer(reqScorer, exclScorer Scorer) *ReqExclScorer

func (*ReqExclScorer) DocID

func (r *ReqExclScorer) DocID() int

func (*ReqExclScorer) GetMaxScore

func (r *ReqExclScorer) GetMaxScore(upTo int) (float64, error)

func (*ReqExclScorer) Iterator

func (r *ReqExclScorer) Iterator() types.DocIdSetIterator

func (*ReqExclScorer) Score

func (r *ReqExclScorer) Score() (float64, error)

func (*ReqExclScorer) TwoPhaseIterator

func (r *ReqExclScorer) TwoPhaseIterator() TwoPhaseIterator

type ReqOptSumScorer

type ReqOptSumScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

ReqOptSumScorer A Scorer for queries with a required part and an optional part. Delays skipTo() on the optional part until a score() is needed.

GPT3.5:

In Lucene, ReqOptSumScorer is a scorer for boolean queries. It combines a required scorer and an optional scorer to compute a boolean query's relevance score.

The required scorer matches documents that satisfy all required (must-occur) sub-queries and accumulates their scores; its score is the sum of the scores of the required sub-queries.

The optional scorer matches documents against the optional (may-occur) sub-queries and accumulates their scores; its score is the sum of the scores of the matching optional sub-queries.

ReqOptSumScorer adds the required and optional scores together to produce the final document score. This means a document must match all required sub-queries and may additionally match any optional sub-queries.

ReqOptSumScorer thus implements the combination logic of boolean queries, such as "must" plus "should": it computes a document's relevance score according to the query's requirements, so documents can be sorted and ranked by that score.

Note that this is a general description of ReqOptSumScorer; the actual implementation and usage may differ depending on the Lucene version and context.

func NewReqOptSumScorer

func NewReqOptSumScorer(reqScorer, optScorer Scorer, scoreMode ScoreMode) (*ReqOptSumScorer, error)

NewReqOptSumScorer Construct a ReqOptSumScorer. Params:

reqScorer – The required scorer. This must match.
optScorer – The optional scorer. This is used for scoring only.
scoreMode – How the produced scorers will be consumed.

func (*ReqOptSumScorer) AdvanceShallow

func (r *ReqOptSumScorer) AdvanceShallow(target int) (int, error)

func (*ReqOptSumScorer) DocID

func (r *ReqOptSumScorer) DocID() int

func (*ReqOptSumScorer) GetMaxScore

func (r *ReqOptSumScorer) GetMaxScore(upTo int) (float64, error)

func (*ReqOptSumScorer) Iterator

func (r *ReqOptSumScorer) Iterator() types.DocIdSetIterator

func (*ReqOptSumScorer) Score

func (r *ReqOptSumScorer) Score() (float64, error)

func (*ReqOptSumScorer) SetMinCompetitiveScore

func (r *ReqOptSumScorer) SetMinCompetitiveScore(minScore float64) error

func (*ReqOptSumScorer) TwoPhaseIterator

func (r *ReqOptSumScorer) TwoPhaseIterator() TwoPhaseIterator

type RewriteMethod

type RewriteMethod interface {
	Rewrite(reader index.IndexReader, query MultiTermQuery) (Query, error)

	// GetTermsEnum
	// Returns the MultiTermQuerys TermsEnum
	// See Also: getTermsEnum(Terms, AttributeSource)
	GetTermsEnum(query MultiTermQuery, terms index.Terms, atts *attribute.Source) (index.TermsEnum, error)
}

RewriteMethod Abstract class that defines how the query is rewritten.

type Scorable

type Scorable interface {
	// Score
	// Returns the Score of the current document matching the query.
	Score() (float64, error)

	// SmoothingScore
	// Returns the smoothing Score of the current document matching the query. This Score
	// is used when the query/term does not appear in the document, and behaves like an idf. The smoothing
	// Score is particularly important when the Scorer returns a product of probabilities so that the
	// document Score does not go to zero when one probability is zero. This can return 0 or a smoothing Score.
	//
	// Smoothing scores are described in many papers, including: Metzler, D. and Croft, W. B. , "Combining
	// the Language Model and Inference Network Approaches to Retrieval," Information Processing and Management
	// Special Issue on Bayesian Networks and Information Retrieval, 40(5), pp.735-750.
	SmoothingScore(docId int) (float64, error)

	// DocID
	// Returns the doc ID that is currently being scored.
	DocID() int

	// SetMinCompetitiveScore
	// Optional method: Tell the scorer that its iterator may safely ignore all
	// documents whose Score is less than the given minScore. This is a no-op by default. This method
	// may only be called from collectors that use ScoreMode.TOP_SCORES, and successive calls may
	// only set increasing values of minScore.
	SetMinCompetitiveScore(minScore float64) error

	// GetChildren
	// Returns child sub-scorers positioned on the current document
	GetChildren() ([]ChildScorable, error)
}

Scorable Allows access to the Score of a Query.

type ScoreAndDoc

type ScoreAndDoc struct {
	*BaseScorable
	// contains filtered or unexported fields
}

func NewScoreAndDoc

func NewScoreAndDoc() *ScoreAndDoc

func (*ScoreAndDoc) DocID

func (s *ScoreAndDoc) DocID() int

func (*ScoreAndDoc) Score

func (s *ScoreAndDoc) Score() (float64, error)

type ScoreDoc

type ScoreDoc interface {
	GetScore() float64
	SetScore(score float64)
	GetDoc() int
	SetDoc(doc int)
	GetShardIndex() int
	SetShardIndex(shardIndex int)
}

type ScoreMergeSortQueue

type ScoreMergeSortQueue struct {
	*structure.PriorityQueue[*ShardRef]
	// contains filtered or unexported fields
}

func NewScoreMergeSortQueue

func NewScoreMergeSortQueue(shardHits []TopDocs) *ScoreMergeSortQueue

type ScoreMode

type ScoreMode uint8

ScoreMode Different modes of search.

func NewScoreMode

func NewScoreMode(isExhaustive, needsScores bool) ScoreMode

func (ScoreMode) IsExhaustive

func (r ScoreMode) IsExhaustive() bool

IsExhaustive Returns true if for this ScoreMode it is necessary to process all documents, or false if it is enough to go through top documents only.

func (ScoreMode) NeedsScores

func (r ScoreMode) NeedsScores() bool

NeedsScores Whether this ScoreMode needs to compute scores.

type Scorer

type Scorer interface {
	Scorable

	// GetWeight
	// returns parent Weight
	GetWeight() Weight

	// Iterator
	// Return a DocIdSetIterator over matching documents. The returned iterator will either
	// be positioned on -1 if no documents have been scored yet, DocIdSetIterator.NO_MORE_DOCS if all
	// documents have been scored already, or the last document id that has been scored otherwise.
	// The returned iterator is a view: calling this method several times will return iterators
	// that have the same state.
	Iterator() types.DocIdSetIterator

	// TwoPhaseIterator
	// Optional method: Return a TwoPhaseIterator view of this Scorer. A return value
	// of null indicates that two-phase iteration is not supported. Note that the returned
	// TwoPhaseIterator's approximation must advance synchronously with the iterator(): advancing
	// the approximation must advance the iterator and vice-versa. Implementing this method is
	// typically useful on Scorers that have a high per-document overhead in order to confirm
	// matches. The default implementation returns null.
	TwoPhaseIterator() TwoPhaseIterator

	// AdvanceShallow
	// Advance to the block of documents that contains target in order to get scoring information
	// about this block. This method is implicitly called by DocIdSetIterator.advance(int) and
	// DocIdSetIterator.nextDoc() on the returned doc ID. Calling this method doesn't modify the
	// current DocIdSetIterator.docID(). It returns a number that is greater than or equal to all
	// documents contained in the current block, but less than any doc IDS of the next block.
	// target must be >= docID() as well as all targets that have been passed to advanceShallow(int) so far.
	AdvanceShallow(target int) (int, error)

	// GetMaxScore
	// Return the maximum score that documents between the last target that this iterator
	// was shallow-advanced to included and upTo included.
	GetMaxScore(upTo int) (float64, error)
}

Scorer Expert: Common scoring functionality for different types of queries.

A Scorer exposes an Iterator() over documents matching a query, in increasing order of doc ID.

Document scores are computed using a given Similarity implementation. NOTE: The values Float.NaN, Float.NEGATIVE_INFINITY and Float.POSITIVE_INFINITY are not valid scores. Certain collectors (e.g. TopScoreDocCollector) will not properly collect hits with these scores.

type ScorerLeafCollector

type ScorerLeafCollector struct {
	// contains filtered or unexported fields
}

func (*ScorerLeafCollector) SetScorer

func (s *ScorerLeafCollector) SetScorer(scorer Scorable) error

type ScorerSupplier

type ScorerSupplier interface {
	// Get
	// Get the Scorer. This may not return null and must be called at most once.
	// leadCost: Cost of the scorer that will be used in order to lead iteration. This can be
	//			interpreted as an upper bound of the number of times that DocIdSetIterator.nextDoc,
	//			DocIdSetIterator.advance and TwoPhaseIterator.matches will be called. When in doubt,
	//			pass Long.MAX_VALUE, which will produce a Scorer that has good iteration capabilities.
	Get(leadCost int64) (Scorer, error)

	// Cost
	// Get an estimate of the cost of the Scorer that would be returned by Get. This may be a costly operation,
	// so it should only be called if necessary.
	// See Also: DocIdSetIterator.cost
	Cost() int64
}

type ScorerSupplierDefault

type ScorerSupplierDefault struct {
}

type SegmentCacheable

type SegmentCacheable interface {

	// IsCacheable
	// Returns: true if the object can be cached against a given leaf
	IsCacheable(ctx index.LeafReaderContext) bool
}

SegmentCacheable Interface defining whether or not an object can be cached against a LeafReader.

Objects that depend only on segment-immutable structures such as Points or postings lists can just return true from isCacheable(LeafReaderContext). Objects that depend on doc values should return DocValues.isCacheable(LeafReaderContext, String...), which will check whether the doc values fields have been updated; updated doc values fields are not suitable for caching. Objects that are not segment-immutable, such as those that rely on global statistics or scores, should return false.

type ShardRef

type ShardRef struct {
	// contains filtered or unexported fields
}

ShardRef Refers to one hit.

func NewShardRef

func NewShardRef(shardIndex int, useScoreDocIndex bool) *ShardRef

func (*ShardRef) GetShardIndex

func (s *ShardRef) GetShardIndex(scoreDoc ScoreDoc) int

type SimpleCollector

type SimpleCollector interface {
	Collector
	LeafCollector

	// DoSetNextReader
	// This method is called before collecting context.
	DoSetNextReader(context index.LeafReaderContext) error
}

SimpleCollector Base Collector implementation that is used to collect all contexts.

type SimpleCollectorSPI

type SimpleCollectorSPI interface {
	DoSetNextReader(context index.LeafReaderContext) error
	SetScorer(scorer Scorable) error
	Collect(ctx context.Context, doc int) error
}

type SimpleFieldCollector

type SimpleFieldCollector struct {
	*TopFieldCollector
	*TopDocsCollectorDefault[*Entry]
	// contains filtered or unexported fields
}

func NewSimpleFieldCollector

func NewSimpleFieldCollector(sort *index.Sort, queue FieldValueHitQueue[*Entry], numHits int,
	hitsThresholdChecker HitsThresholdChecker, minScoreAcc *MaxScoreAccumulator) (*SimpleFieldCollector, error)

func (*SimpleFieldCollector) GetLeafCollector

func (s *SimpleFieldCollector) GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error)

type SimpleTopScoreDocCollector

type SimpleTopScoreDocCollector struct {
	*BaseTopScoreDocCollector
}

func (*SimpleTopScoreDocCollector) GetLeafCollector

func (s *SimpleTopScoreDocCollector) GetLeafCollector(ctx context.Context, readerContext index.LeafReaderContext) (LeafCollector, error)

func (*SimpleTopScoreDocCollector) ScoreMode

func (s *SimpleTopScoreDocCollector) ScoreMode() ScoreMode

type StartDISIWrapper

type StartDISIWrapper struct {
	// contains filtered or unexported fields
}

func NewStartDISIWrapper

func NewStartDISIWrapper(in types.DocIdSetIterator) *StartDISIWrapper

func (*StartDISIWrapper) Advance

func (s *StartDISIWrapper) Advance(target int) (int, error)

func (*StartDISIWrapper) Cost

func (s *StartDISIWrapper) Cost() int64

func (*StartDISIWrapper) DocID

func (s *StartDISIWrapper) DocID() int

func (*StartDISIWrapper) NextDoc

func (s *StartDISIWrapper) NextDoc() (int, error)

func (*StartDISIWrapper) SlowAdvance

func (s *StartDISIWrapper) SlowAdvance(target int) (int, error)

type Supplier

type Supplier[T any] interface {
	Get() T
}

type TermInSetQuery

type TermInSetQuery struct {
	// contains filtered or unexported fields
}

TermInSetQuery Specialization for a disjunction over many terms that behaves like a ConstantScoreQuery over a BooleanQuery containing only BooleanClause.Occur.OccurShould clauses.

For instance in the following example, both q1 and q2 would yield the same scores:

Query q1 = new TermInSetQuery("field", new BytesRef("foo"), new BytesRef("bar"));

BooleanQuery bq = new BooleanQuery();
bq.add(new TermQuery(new Term("field", "foo")), Occur.SHOULD);
bq.add(new TermQuery(new Term("field", "bar")), Occur.SHOULD);
Query q2 = new ConstantScoreQuery(bq);

When there are few terms, this query executes like a regular disjunction. However, when there are many terms, instead of merging iterators on the fly, it will populate a bit set with matching docs and return a Scorer over this bit set.

NOTE: This query produces scores that are equal to its boost

type TermMatchesIterator

type TermMatchesIterator struct {
	// contains filtered or unexported fields
}

TermMatchesIterator A MatchesIterator over a single term's postings list

func NewTermMatchesIterator

func NewTermMatchesIterator(query Query, pe index.PostingsEnum) (*TermMatchesIterator, error)

func (*TermMatchesIterator) EndOffset

func (t *TermMatchesIterator) EndOffset() (int, error)

func (*TermMatchesIterator) EndPosition

func (t *TermMatchesIterator) EndPosition() int

func (*TermMatchesIterator) GetQuery

func (t *TermMatchesIterator) GetQuery() Query

func (*TermMatchesIterator) GetSubMatches

func (t *TermMatchesIterator) GetSubMatches() (MatchesIterator, error)

func (*TermMatchesIterator) Next

func (t *TermMatchesIterator) Next() (bool, error)

func (*TermMatchesIterator) StartOffset

func (t *TermMatchesIterator) StartOffset() (int, error)

func (*TermMatchesIterator) StartPosition

func (t *TermMatchesIterator) StartPosition() int

type TermQuery

type TermQuery struct {
	// contains filtered or unexported fields
}

TermQuery A Query that matches documents containing a term. This may be combined with other terms with a BooleanQuery.

func NewTermQuery

func NewTermQuery(term *index.Term) *TermQuery

func NewTermQueryV1

func NewTermQueryV1(term *index.Term, states *index.TermStates) *TermQuery

NewTermQueryV1 Expert: constructs a TermQuery that will use the provided TermStates instead of looking up the term states against the searcher.

func (*TermQuery) CreateWeight

func (t *TermQuery) CreateWeight(searcher *IndexSearcher, scoreMode ScoreMode, boost float64) (Weight, error)

func (*TermQuery) GetTerm

func (t *TermQuery) GetTerm() *index.Term

func (*TermQuery) NewTermWeight

func (t *TermQuery) NewTermWeight(searcher *IndexSearcher, scoreMode ScoreMode,
	boost float64, termStates *index.TermStates) (*TermWeight, error)

func (*TermQuery) Rewrite

func (t *TermQuery) Rewrite(reader index.IndexReader) (Query, error)

func (*TermQuery) String

func (t *TermQuery) String(field string) string

func (*TermQuery) Visit

func (t *TermQuery) Visit(visitor QueryVisitor) error

type TermRangeQuery

type TermRangeQuery struct {
}

type TermScorer

type TermScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

TermScorer Expert: A Scorer for documents matching a Term.

func NewTermScorerWithImpacts

func NewTermScorerWithImpacts(weight Weight, impactsEnum index.ImpactsEnum, docScorer *LeafSimScorer) *TermScorer

func NewTermScorerWithPostings

func NewTermScorerWithPostings(weight Weight, postingsEnum index.PostingsEnum, docScorer *LeafSimScorer) *TermScorer

func (*TermScorer) DocID

func (t *TermScorer) DocID() int

func (*TermScorer) Freq

func (t *TermScorer) Freq() (int, error)

func (*TermScorer) GetChildren

func (t *TermScorer) GetChildren() ([]ChildScorable, error)

func (*TermScorer) GetMaxScore

func (t *TermScorer) GetMaxScore(upTo int) (float64, error)

func (*TermScorer) GetWeight

func (t *TermScorer) GetWeight() Weight

func (*TermScorer) Iterator

func (t *TermScorer) Iterator() types.DocIdSetIterator

func (*TermScorer) Score

func (t *TermScorer) Score() (float64, error)

func (*TermScorer) SetMinCompetitiveScore

func (t *TermScorer) SetMinCompetitiveScore(minScore float64) error

func (*TermScorer) SmoothingScore

func (t *TermScorer) SmoothingScore(docId int) (float64, error)

func (*TermScorer) TwoPhaseIterator

func (t *TermScorer) TwoPhaseIterator() TwoPhaseIterator

type TermWeight

type TermWeight struct {
	*BaseWeight
	*TermQuery
	// contains filtered or unexported fields
}

func (*TermWeight) Explain

func (t *TermWeight) Explain(context index.LeafReaderContext, doc int) (*types.Explanation, error)

func (*TermWeight) ExtractTerms

func (t *TermWeight) ExtractTerms(terms *treeset.Set[*index.Term]) error

func (*TermWeight) GetQuery

func (t *TermWeight) GetQuery() Query

func (*TermWeight) IsCacheable

func (t *TermWeight) IsCacheable(ctx index.LeafReaderContext) bool

func (*TermWeight) Scorer

func (t *TermWeight) Scorer(ctx index.LeafReaderContext) (Scorer, error)

type TimSort

type TimSort []types.DocIdSetIterator

func (TimSort) Len

func (t TimSort) Len() int

func (TimSort) Less

func (t TimSort) Less(i, j int) bool

func (TimSort) Swap

func (t TimSort) Swap(i, j int)

type TimSortBitSet

type TimSortBitSet []*index.BitSetIterator

func (TimSortBitSet) Len

func (t TimSortBitSet) Len() int

func (TimSortBitSet) Less

func (t TimSortBitSet) Less(i, j int) bool

func (TimSortBitSet) Swap

func (t TimSortBitSet) Swap(i, j int)

type TimSortTwoPhase

type TimSortTwoPhase []TwoPhaseIterator

func (TimSortTwoPhase) Len

func (t TimSortTwoPhase) Len() int

func (TimSortTwoPhase) Less

func (t TimSortTwoPhase) Less(i, j int) bool

func (TimSortTwoPhase) Swap

func (t TimSortTwoPhase) Swap(i, j int)

type TopDocs

type TopDocs interface {
	GetTotalHits() *TotalHits
	GetScoreDocs() []ScoreDoc
}

func MergeTopDocs

func MergeTopDocs(start, topN int, shardHits []TopDocs, setShardIndex bool) (TopDocs, error)

type TopDocsCollector

type TopDocsCollector interface {
	Collector

	// PopulateResults
	// Populates the results array with the ScoreDoc instances.
	// This can be overridden in case a different ScoreDoc type should be returned.
	PopulateResults(results []ScoreDoc, howMany int) error

	// NewTopDocs
	// Returns a TopDocs instance containing the given results.
	// If results is null it means there are no results to return, either because
	// there were 0 calls to collect() or because the arguments to topDocs were invalid.
	NewTopDocs(results []ScoreDoc, howMany int) (TopDocs, error)

	// GetTotalHits
	// The total number of documents that matched this query.
	GetTotalHits() int

	// TopDocsSize
	// The number of valid PQ entries
	TopDocsSize() int

	// TopDocs
	// Returns the top docs that were collected by this collector.
	TopDocs() (TopDocs, error)

	// TopDocsFrom
	// Returns the documents in the range [start .. pq.size()) that were collected by this collector.
	// Note that if start >= pq.size(), an empty TopDocs is returned. This method is convenient to
	// call if the application always asks for the last results, starting from the last 'page'.
	// NOTE: you cannot call this method more than once for each search execution.
	// If you need to call it more than once, passing each time a different start,
	// you should call topDocs() and work with the returned TopDocs object,
	// which will contain all the results this search execution collected.
	TopDocsFrom(start int) (TopDocs, error)

	// TopDocsRange
	// Returns the documents in the range [start .. start+howMany) that were collected by this collector.
	// Note that if start >= pq.size(), an empty TopDocs is returned, and if pq.size() - start < howMany,
	// then only the available documents in [start .. pq.size()) are returned.
	// This method is useful to call in case pagination of search results is allowed by the search application,
	// as well as it attempts to optimize the memory used by allocating only as much as requested by howMany.
	// NOTE: you cannot call this method more than once for each search execution.
	// If you need to call it more than once, passing each time a different range,
	// you should call topDocs() and work with the returned TopDocs object,
	// which will contain all the results this search execution collected.
	TopDocsRange(start, howMany int) (TopDocs, error)
}

TopDocsCollector A base class for all collectors that return a TopDocs output. This collector allows easy extension by providing a single constructor which accepts a PriorityQueue as well as protected members for that priority queue and a counter of the number of total hits. Extending classes can override any of the methods to provide their own implementation, as well as avoid the use of the priority queue entirely by passing null to TopDocsCollector(PriorityQueue). In that case however, you might want to consider overriding all methods, in order to avoid a NullPointerException.
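
For example, paging through results with TopDocsRange could be sketched as follows (zero-based page numbers; remember the one-call-per-search restriction described above):

func page(c TopDocsCollector, pageNum, pageSize int) ([]ScoreDoc, error) {
	// TopDocsRange returns the documents in [start .. start+howMany).
	td, err := c.TopDocsRange(pageNum*pageSize, pageSize)
	if err != nil {
		return nil, err
	}
	return td.GetScoreDocs(), nil
}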

func TopTopFieldCollectorCreate

func TopTopFieldCollectorCreate(sort *index.Sort, numHits int, after FieldDoc,
	hitsThresholdChecker HitsThresholdChecker, minScoreAcc *MaxScoreAccumulator) (TopDocsCollector, error)

type TopDocsCollectorDefault

type TopDocsCollectorDefault[T ScoreDoc] struct {
	// contains filtered or unexported fields
}

func (*TopDocsCollectorDefault[T]) GetTotalHits

func (t *TopDocsCollectorDefault[T]) GetTotalHits() int

func (*TopDocsCollectorDefault[T]) NewTopDocs

func (t *TopDocsCollectorDefault[T]) NewTopDocs(results []ScoreDoc, howMany int) (TopDocs, error)

func (*TopDocsCollectorDefault[T]) PopulateResults

func (t *TopDocsCollectorDefault[T]) PopulateResults(results []ScoreDoc, howMany int) error

func (*TopDocsCollectorDefault[T]) TopDocs

func (t *TopDocsCollectorDefault[T]) TopDocs() (TopDocs, error)

func (*TopDocsCollectorDefault[T]) TopDocsFrom

func (t *TopDocsCollectorDefault[T]) TopDocsFrom(start int) (TopDocs, error)

func (*TopDocsCollectorDefault[T]) TopDocsRange

func (t *TopDocsCollectorDefault[T]) TopDocsRange(start, howMany int) (TopDocs, error)

func (*TopDocsCollectorDefault[T]) TopDocsSize

func (t *TopDocsCollectorDefault[T]) TopDocsSize() int

type TopFieldCollector

type TopFieldCollector struct {
	*TopDocsCollectorDefault[*Entry]
	// contains filtered or unexported fields
}

func (*TopFieldCollector) ScoreMode

func (t *TopFieldCollector) ScoreMode() ScoreMode

type TopFieldDocs

type TopFieldDocs struct {
	*BaseTopDocs
	// contains filtered or unexported fields
}

func NewTopFieldDocs

func NewTopFieldDocs(totalHits *TotalHits, scoreDocs []ScoreDoc, fields []index.SortField) *TopFieldDocs

NewTopFieldDocs Creates one of these objects. Params:

totalHits – Total number of hits for the query.
scoreDocs – The top hits for the query.
fields – The sort criteria used to find the top hits.

func (*TopFieldDocs) GetFields

func (t *TopFieldDocs) GetFields() []index.SortField

type TopScoreDocCollector

type TopScoreDocCollector interface {
	TopDocsCollector
}

TopScoreDocCollector A Collector implementation that collects the top-scoring hits, returning them as a TopDocs. This is used by IndexSearcher to implement TopDocs-based search. Hits are sorted by score descending and then (when the scores are tied) docID ascending. When you create an instance of this collector you should know in advance whether documents are going to be collected in doc Id order or not.

NOTE: The values Float.NaN and Float.NEGATIVE_INFINITY are not valid scores. This collector will not properly collect hits with such scores.

func NewPagingTopScoreDocCollector

func NewPagingTopScoreDocCollector(hits int, after ScoreDoc, checker HitsThresholdChecker, acc *MaxScoreAccumulator) (TopScoreDocCollector, error)

func NewSimpleTopScoreDocCollector

func NewSimpleTopScoreDocCollector(numHits int, hitsThresholdChecker HitsThresholdChecker,
	minScoreAcc *MaxScoreAccumulator) (TopScoreDocCollector, error)

func TopScoreDocCollectorCreate

func TopScoreDocCollectorCreate(numHits int, after ScoreDoc,
	hitsThresholdChecker HitsThresholdChecker, minScoreAcc *MaxScoreAccumulator) (TopScoreDocCollector, error)

type TotalHits

type TotalHits struct {
	Value    int64
	Relation TotalHitsRelation
}

TotalHits Description of the total number of hits of a query. The total hit count can't generally be computed accurately without visiting all matches, which is costly for queries that match lots of documents. Given that it is often enough to have a lower bound of the number of hits, such as "there are more than 1000 hits", Lucene has options to stop counting as soon as a threshold has been reached in order to improve query times.
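
A small sketch of how a caller might render the two relations:

func formatTotalHits(th *TotalHits) string {
	if th.Relation == GREATER_THAN_OR_EQUAL_TO {
		// Counting stopped at the threshold: the value is only a lower bound.
		return fmt.Sprintf("%d+ hits", th.Value)
	}
	return fmt.Sprintf("%d hits", th.Value) // exact count
}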

func NewTotalHits

func NewTotalHits(value int64, relation TotalHitsRelation) *TotalHits

type TotalHitsRelation

type TotalHitsRelation int

TotalHitsRelation How the value should be interpreted.

type TwoPhase

type TwoPhase struct {
}

func (*TwoPhase) Approximation

func (t *TwoPhase) Approximation() types.DocIdSetIterator

func (*TwoPhase) MatchCost

func (t *TwoPhase) MatchCost() float64

func (*TwoPhase) Matches

func (t *TwoPhase) Matches() (bool, error)

type TwoPhaseIterator

type TwoPhaseIterator interface {
	Approximation() types.DocIdSetIterator

	// Matches
	// Return whether the current doc ID that approximation() is on matches.
	// This method should only be called when the iterator is positioned -- ie. not when DocIdSetIterator.docID() is -1 or DocIdSetIterator.NO_MORE_DOCS -- and at most once.
	Matches() (bool, error)

	// MatchCost
	// An estimate of the expected cost to determine that a single document matches().
	// This can be called before iterating the documents of approximation().
	// Returns an expected cost in number of simple operations like addition, multiplication, comparing two numbers and indexing an array. The returned value must be positive.
	MatchCost() float64
}

TwoPhaseIterator Returned by Scorer.TwoPhaseIterator() to expose an approximation of a DocIdSetIterator. When the approximation()'s DocIdSetIterator.nextDoc() or DocIdSetIterator.advance(int) return, matches() needs to be checked in order to know whether the returned doc ID actually matches.

GPT3.5:

In Lucene, TwoPhaseIterator is a utility for two-phase iteration. It can be used together with a Scorer to filter and score matching documents more efficiently.

During a search, a Scorer is normally used to match documents and score the matches. Sometimes, however, cheaper checks can rule documents out before full matching is attempted, reducing the cost of the scoring work.

TwoPhaseIterator provides exactly that capability, separating filtering from confirmation across two phases of iteration.

In the first phase, TwoPhaseIterator performs a fast filtering pass, using precomputed structures (such as an approximating iterator) to decide whether a document could possibly match the query. This filtering pass is usually much cheaper than fully confirming a match.

In the second phase, documents that pass the first-phase filter are handed to the detailed matching logic for confirmation and scoring.

The benefit of TwoPhaseIterator is that the expensive confirmation work is only performed for documents that survive the cheap first phase, which improves search performance.

TwoPhaseIterator mainly exposes the following methods:

1. approximation(): returns the approximate iterator used for the fast first-phase filter.

2. matches(): checks, in the second phase, whether the current document of the approximation really matches the query.

3. matchCost(): returns an estimate of the cost of the second-phase matches() check, which callers can use to decide how to order and combine iterators.

Using TwoPhaseIterator, filtering and confirmation can be optimized separately as needed, improving search performance and reducing overhead.
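
A sketch of driving the two phases by hand; the types.NO_MORE_DOCS sentinel name is an assumption:

func collectConfirmed(tpi TwoPhaseIterator, visit func(doc int)) error {
	approx := tpi.Approximation()
	for {
		doc, err := approx.NextDoc() // phase one: cheap approximation
		if err != nil {
			return err
		}
		if doc == types.NO_MORE_DOCS { // assumed sentinel name
			return nil
		}
		ok, err := tpi.Matches() // phase two: confirm the candidate
		if err != nil {
			return err
		}
		if ok {
			visit(doc)
		}
	}
}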

func UnwrapIterator

func UnwrapIterator(iterator types.DocIdSetIterator) TwoPhaseIterator

type UsageTrackingQueryCachingPolicy

type UsageTrackingQueryCachingPolicy struct {
}

UsageTrackingQueryCachingPolicy A QueryCachingPolicy that tracks usage statistics of recently-used filters in order to decide on which filters are worth caching.

func (*UsageTrackingQueryCachingPolicy) OnUse

func (u *UsageTrackingQueryCachingPolicy) OnUse(query Query)

func (*UsageTrackingQueryCachingPolicy) ShouldCache

func (u *UsageTrackingQueryCachingPolicy) ShouldCache(query Query) (bool, error)

type WANDScorer

type WANDScorer struct {
	*BaseScorer
	// contains filtered or unexported fields
}

WANDScorer This implements the WAND (Weak AND) algorithm for dynamic pruning described in "Efficient Query Evaluation using a Two-Level Retrieval Process" by Broder, Carmel, Herscovici, Soffer and Zien. Enhanced with techniques described in "Faster Top-k Document Retrieval Using Block-Max Indexes" by Ding and Suel. For scoreMode == ScoreMode.TOP_SCORES, this scorer maintains a feedback loop with the collector in order to know at any time the minimum score that is required in order for a hit to be competitive.

The implementation supports both minCompetitiveScore, by enforcing ∑ max_score >= minCompetitiveScore, and minShouldMatch, by enforcing freq >= minShouldMatch. It keeps sub scorers in 3 different places:

- tail: a heap that contains scorers that are behind the desired doc ID. These scorers are ordered by cost so that we can advance the least costly ones first.
- lead: a linked list of scorers that are positioned on the desired doc ID.
- head: a heap that contains scorers which are beyond the desired doc ID, ordered by doc ID in order to move quickly to the next candidate.

When scoreMode == ScoreMode.TOP_SCORES, it leverages the max score from each scorer in order to know when it may call DocIdSetIterator.advance rather than DocIdSetIterator.nextDoc to move to the next competitive hit. When scoreMode != ScoreMode.TOP_SCORES, block-max scoring related logic is skipped. Finding the next match consists of first setting the desired doc ID to the least entry in 'head', and then advancing 'tail' until there is a match, meeting the configured freq >= minShouldMatch and / or ∑ max_score >= minCompetitiveScore requirements.

func (*WANDScorer) DocID

func (w *WANDScorer) DocID() int

func (*WANDScorer) GetMaxScore

func (w *WANDScorer) GetMaxScore(upTo int) (float64, error)

func (*WANDScorer) Iterator

func (w *WANDScorer) Iterator() types.DocIdSetIterator

func (*WANDScorer) Score

func (w *WANDScorer) Score() (float64, error)

type Weight

type Weight interface {
	SegmentCacheable

	ExtractTerms(terms *treeset.Set[*index.Term]) error

	// Matches
	// Returns Matches for a specific document, or null if the document does not match the parent query
	// A query match that contains no position information (for example, a Point or DocValues query) will
	// return MatchesUtils.MATCH_WITH_NO_TERMS
	// context: the reader's context to create the Matches for
	// doc: the document's id relative to the given context's reader
	Matches(readerContext index.LeafReaderContext, doc int) (Matches, error)

	// Explain
	// An explanation of the score computation for the named document.
	// context: the readers context to create the Explanation for.
	// doc: the document's id relative to the given context's reader
	// Returns: an Explanation for the score
	// Throws: 	IOException – if an IOException occurs
	Explain(readerContext index.LeafReaderContext, doc int) (*types.Explanation, error)

	// GetQuery The query that this concerns.
	GetQuery() Query

	// Scorer
	// Returns a Scorer which can iterate in order over all matching documents and assign them a score.
	// NOTE: null can be returned if no documents will be scored by this query.
	// NOTE: The returned Scorer does not have LeafReader.getLiveDocs() applied, they need to be checked on top.
	// ctx: the LeafReaderContext for which to return the Scorer.
	// a Scorer which scores documents in/out-of order.
	Scorer(ctx index.LeafReaderContext) (Scorer, error)

	// ScorerSupplier
	// Optional method. Get a ScorerSupplier, which allows to know the cost of the Scorer before building it.
	// The default implementation calls scorer and builds a ScorerSupplier wrapper around it.
	ScorerSupplier(ctx index.LeafReaderContext) (ScorerSupplier, error)

	// BulkScorer
	// Optional method, to return a BulkScorer to score the query and send hits to a Collector.
	// Only queries that have a different top-level approach need to override this;
	// the default implementation pulls a normal Scorer and iterates and collects
	// the resulting hits which are not marked as deleted.
	//
	// context: the LeafReaderContext for which to return the Scorer.
	//
	// Returns: a BulkScorer which scores documents and passes them to a collector.
	// Throws: 	IOException – if there is a low-level I/O error
	BulkScorer(ctx index.LeafReaderContext) (BulkScorer, error)
}

Weight Expert: Calculate query weights and build query scorers.

The purpose of Weight is to ensure searching does not modify a Query, so that a Query instance can be reused. IndexSearcher dependent state of the query should reside in the Weight. LeafReader dependent state should reside in the Scorer.

Since Weight creates Scorer instances for a given LeafReaderContext (scorer(LeafReaderContext)), callers must maintain the relationship between the searcher's top-level ReaderContext and the context used to create a Scorer.

A Weight is used in the following way:

1. A Weight is constructed by a top-level query, given an IndexSearcher (Query.createWeight(IndexSearcher, ScoreMode, float)).
2. A Scorer is constructed by scorer(LeafReaderContext).

Since: 2.9
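
A sketch of the bulk path described above; the Score(collector, acceptDocs) signature follows ReqExclBulkScorer, and passing nil acceptDocs is assumed to mean that no live-docs filter is applied:

func bulkScoreLeaf(w Weight, leaf index.LeafReaderContext, lc LeafCollector) error {
	bs, err := w.BulkScorer(leaf)
	if err != nil || bs == nil {
		// nil BulkScorer: nothing to score in this leaf
		return err
	}
	return bs.Score(lc, nil) // score the whole segment into the collector
}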

type WeightScorer

type WeightScorer interface {
	Scorer(ctx index.LeafReaderContext) (Scorer, error)
}
