lucene40

package
v0.0.0-...-309f818 Latest
Warning

This package is not in the latest version of its module.

Published: May 2, 2021 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Constants

const (
	CODEC = "BitVector"

	/* Change DGaps to encode gaps between cleared bits, not set: */
	BV_VERSION_DGAPS_CLEARED = 1

	BV_VERSION_CHECKSUM = 2

	/* Increment version to change it: */
	BV_VERSION_CURRENT = BV_VERSION_CHECKSUM
)
const (
	LUCENE40_SI_EXTENSION    = "si"
	LUCENE40_CODEC_NAME      = "Lucene40SegmentInfo"
	LUCENE40_VERSION_START   = 0
	LUCENE40_VERSION_CURRENT = LUCENE40_VERSION_START
)
const (
	SEGMENT_INFO_NO  = -1
	SEGMENT_INFO_YES = 1
)
const (
	FIELDS_EXTENSION       = "fdt"
	FIELDS_INDEX_EXTENSION = "fdx"
)

Lucene40StoredFieldsWriter.java

const DELETES_EXTENSION = "del"

Extension of deletes

Variables

This section is empty.

Functions

This section is empty.

Types

type BitVector

type BitVector struct {
	// contains filtered or unexported fields
}

func NewBitVector

func NewBitVector(n int) *BitVector

func (*BitVector) At

func (bv *BitVector) At(bit int) bool

func (*BitVector) Clear

func (bv *BitVector) Clear(bit int)

func (*BitVector) Count

func (bv *BitVector) Count() int

Returns the total number of set (one) bits in this vector. The result is computed efficiently and cached, so repeated calls do no recomputation as long as the vector is not changed.
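The caching behaviour described for Count can be sketched as follows. This is a minimal illustration, not the package's implementation: the type cachedBits, its fields, and the Set method are hypothetical names introduced here, and a cached value of -1 is assumed to mean "needs recomputation".

```go
package main

import (
	"fmt"
	"math/bits"
)

// cachedBits is a hypothetical stand-in for BitVector (not the real type).
type cachedBits struct {
	data  []byte
	size  int
	count int // cached number of set bits; -1 means the cache is stale
}

// newCachedBits allocates a vector of n bits, all cleared.
func newCachedBits(n int) *cachedBits {
	return &cachedBits{data: make([]byte, (n+7)/8), size: n, count: 0}
}

// Set turns a bit on and invalidates the cached count.
func (b *cachedBits) Set(bit int) {
	b.data[bit>>3] |= 1 << uint(bit&7)
	b.count = -1
}

// Clear turns a bit off and invalidates the cached count.
func (b *cachedBits) Clear(bit int) {
	b.data[bit>>3] &^= 1 << uint(bit&7)
	b.count = -1
}

// Count returns the number of set bits, recomputing only when stale.
func (b *cachedBits) Count() int {
	if b.count == -1 {
		c := 0
		for _, x := range b.data {
			c += bits.OnesCount8(x)
		}
		b.count = c
	}
	return b.count
}

func main() {
	b := newCachedBits(16)
	b.Set(1)
	b.Set(9)
	fmt.Println(b.Count()) // 2
	b.Clear(9)
	fmt.Println(b.Count()) // 1
}
```

Invalidating on every mutation keeps Set and Clear cheap and defers the scan to the next Count call.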

func (*BitVector) InvertAll

func (bv *BitVector) InvertAll()

Invert all bits

func (*BitVector) Length

func (bv *BitVector) Length() int

func (*BitVector) Write

func (bv *BitVector) Write(d store.Directory, name string, ctx store.IOContext) (err error)

Writes this vector to the file called name in Directory d, in a format that can be read back by the matching reader (in the Java original, the constructor BitVector(Directory, String, IOContext)).

type Lucene40LiveDocsFormat

type Lucene40LiveDocsFormat struct {
}

Lucene 4.0 Live Documents Format.

The .del file is optional, and only exists when a segment contains deletions.

Although per-segment, this file is maintained exterior to compound segment files.

Deletions (.del) --> Format,Header,ByteCount,BitCount, Bits | DGaps

  (depending on Format)
	Format,ByteCount,BitCount --> uint32
	Bits --> <byte>^ByteCount
	DGaps --> <DGap,NonOnesByte>^NonOnesBytesCount
	DGap --> vint
	NonOnesByte --> byte
	Header --> CodecHeader

Format is 1: indicates cleared DGaps.

ByteCount indicates the number of bytes in Bits. It is typically (SegSize/8)+1.

BitCount indicates the number of bits that are currently set in Bits.

Bits contains one bit for each document indexed. When the bit corresponding to a document number is cleared, that document is marked as deleted. Bit ordering is from least to most significant. Thus, if Bits contains two bytes, 0x00 and 0x02, then document 9 is marked as alive (not deleted).
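The bit ordering above can be checked with a short sketch. The function name isLive is hypothetical and is not part of this package; it only assumes the layout just described (byte index is doc/8, bit index within the byte is doc%8, least significant bit first).

```go
package main

import "fmt"

// isLive reports whether document doc is alive (its bit is set)
// in a Bits array laid out least-significant-bit first.
func isLive(bitsArr []byte, doc int) bool {
	return bitsArr[doc>>3]&(1<<uint(doc&7)) != 0
}

func main() {
	b := []byte{0x00, 0x02}
	fmt.Println(isLive(b, 9)) // true: bit 1 of byte 1 is set
	fmt.Println(isLive(b, 8)) // false: bit 0 of byte 1 is cleared
}
```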

DGaps represents sparse bit-vectors more efficiently than Bits. It is made of gaps (DGaps) between the indexes of consecutive nonOnes bytes in Bits (bytes other than 0xFF), and the nonOnes bytes themselves. The number of nonOnes bytes in Bits (NonOnesBytesCount) is not stored.

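For example, if there are 8000 bits and only bits 10, 12 and 32 are cleared, DGaps would be used:

(vint) 1, (byte) 0xEB, (vint) 3, (byte) 0xFE

Each stored byte is the nonOnes byte itself: 0xEB has bits 2 and 4 cleared (documents 10 and 12) and 0xFE has bit 0 cleared (document 32). Under the older set-bit convention the same positions would be written as bytes 20 and 1.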
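The DGaps pairs for this example can be derived with a small sketch. The names dgapPair, clearedDGaps and encodePairs are hypothetical and not part of this package. It assumes that the stored byte is the nonOnes byte exactly as it appears in Bits, and it uses encoding/binary's uvarint as a stand-in for vint, which produces the same bytes for the small non-negative gaps used here. With bits 10, 12 and 32 cleared, byte 1 is 0xEB and byte 4 is 0xFE; under the inverted set-bit convention the same positions would give 0x14 (20) and 0x01 (1).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// dgapPair is a hypothetical helper: the gap to the previous nonOnes
// byte index, followed by the nonOnes byte itself.
type dgapPair struct {
	Gap  int
	Byte byte
}

// clearedDGaps lists every byte that is not 0xFF together with the
// distance from the previously listed byte (or from index 0).
func clearedDGaps(bitsArr []byte) []dgapPair {
	var out []dgapPair
	last := 0
	for i, b := range bitsArr {
		if b != 0xFF {
			out = append(out, dgapPair{Gap: i - last, Byte: b})
			last = i
		}
	}
	return out
}

// encodePairs serializes each pair as a uvarint gap plus one raw byte.
func encodePairs(pairs []dgapPair) []byte {
	var buf []byte
	tmp := make([]byte, binary.MaxVarintLen64)
	for _, p := range pairs {
		n := binary.PutUvarint(tmp, uint64(p.Gap))
		buf = append(buf, tmp[:n]...)
		buf = append(buf, p.Byte)
	}
	return buf
}

func main() {
	// 8000 bits, all set (all documents alive).
	bitsArr := make([]byte, 1000)
	for i := range bitsArr {
		bitsArr[i] = 0xFF
	}
	// Clear bits 10, 12 and 32.
	for _, bit := range []int{10, 12, 32} {
		bitsArr[bit>>3] &^= 1 << uint(bit&7)
	}
	fmt.Printf("% x\n", encodePairs(clearedDGaps(bitsArr))) // 01 eb 03 fe
}
```

Four bytes replace the 1000-byte Bits array, which is why the sparse form pays off when few documents are deleted.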

func (*Lucene40LiveDocsFormat) Files

func (format *Lucene40LiveDocsFormat) Files(info *SegmentCommitInfo) []string

func (*Lucene40LiveDocsFormat) NewLiveDocs

func (format *Lucene40LiveDocsFormat) NewLiveDocs(size int) util.MutableBits

func (*Lucene40LiveDocsFormat) WriteLiveDocs

func (format *Lucene40LiveDocsFormat) WriteLiveDocs(bits util.MutableBits,
	dir store.Directory, info *SegmentCommitInfo, newDelCount int,
	ctx store.IOContext) error

type Lucene40SegmentInfoFormat

type Lucene40SegmentInfoFormat struct {
	// contains filtered or unexported fields
}

Lucene 4.0 Segment info format.

Files:

  • .si: Header, SegVersion, SegSize, IsCompoundFile, Diagnostics, Attributes, Files

Data types:

  • Header --> CodecHeader
  • SegSize --> int32
  • SegVersion --> string
  • Files --> set[string]
  • Diagnostics, Attributes --> map[string]string
  • IsCompoundFile --> byte

Field Descriptions:

  • SegVersion is the code version that created the segment.
  • SegSize is the number of documents contained in the segment index.
  • IsCompoundFile records whether the segment is written as a compound file or not. If this is -1, the segment is not a compound file. If it is 1, the segment is a compound file.
  • Checksum contains the CRC32 checksum of all bytes in the segments_N file up until the checksum. This is used to verify integrity of the file on opening the index.
  • The Diagnostics Map is privately written by IndexWriter, as a debugging aid for each segment it creates. It includes metadata like the current Lucene version, OS, Java version, why the segment was created (merge, flush, addIndexes), etc.
  • Attributes: a key-value map of codec-private attributes.
  • Files is a list of files referred to by this segment.
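The IsCompoundFile convention above can be sketched as a small decoder. The function isCompound and the lower-case constants are hypothetical; the constants only mirror the values of SEGMENT_INFO_NO and SEGMENT_INFO_YES. The sketch assumes the flag is stored as a single signed byte, so -1 appears on disk as 0xFF.

```go
package main

import "fmt"

// Local mirrors of SEGMENT_INFO_NO and SEGMENT_INFO_YES.
const (
	segmentInfoNo  = -1
	segmentInfoYes = 1
)

// isCompound interprets the IsCompoundFile byte of a .si file.
// Any value other than -1 or 1 is reported as an error.
func isCompound(b byte) (bool, error) {
	switch int8(b) {
	case segmentInfoYes:
		return true, nil
	case segmentInfoNo:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected IsCompoundFile byte 0x%02X", b)
	}
}

func main() {
	fmt.Println(isCompound(0x01)) // true <nil>
	fmt.Println(isCompound(0xFF)) // false <nil>
}
```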

func NewLucene40SegmentInfoFormat

func NewLucene40SegmentInfoFormat() *Lucene40SegmentInfoFormat

func (*Lucene40SegmentInfoFormat) SegmentInfoReader

func (f *Lucene40SegmentInfoFormat) SegmentInfoReader() SegmentInfoReader

func (*Lucene40SegmentInfoFormat) SegmentInfoWriter

func (f *Lucene40SegmentInfoFormat) SegmentInfoWriter() SegmentInfoWriter

type Lucene40SegmentInfoReader

type Lucene40SegmentInfoReader struct{}

func (*Lucene40SegmentInfoReader) Read

func (r *Lucene40SegmentInfoReader) Read(dir store.Directory,
	segment string, context store.IOContext) (si *SegmentInfo, err error)

type Lucene40SegmentInfoWriter

type Lucene40SegmentInfoWriter struct{}

func (*Lucene40SegmentInfoWriter) Write

func (w *Lucene40SegmentInfoWriter) Write(dir store.Directory,
	si *SegmentInfo, fis FieldInfos, ctx store.IOContext) (err error)
