Documentation ¶
Index ¶
Constants ¶
const (
	CODEC = "BitVector"

	/* Change DGaps to encode gaps between cleared bits, not set: */
	BV_VERSION_DGAPS_CLEARED = 1

	BV_VERSION_CHECKSUM = 2

	/* Increment version to change it: */
	BV_VERSION_CURRENT = BV_VERSION_CHECKSUM
)
const (
	LUCENE40_SI_EXTENSION    = "si"
	LUCENE40_CODEC_NAME      = "Lucene40SegmentInfo"
	LUCENE40_VERSION_START   = 0
	LUCENE40_VERSION_CURRENT = LUCENE40_VERSION_START
)
const (
	SEGMENT_INFO_NO  = -1
	SEGMENT_INFO_YES = 1
)
const (
	FIELDS_EXTENSION       = "fdt"
	FIELDS_INDEX_EXTENSION = "fdx"
)
Lucene40StoredFieldsWriter.java
const DELETES_EXTENSION = "del"
Extension of deletes
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BitVector ¶
type BitVector struct {
// contains filtered or unexported fields
}
func NewBitVector ¶
type Lucene40LiveDocsFormat ¶
type Lucene40LiveDocsFormat struct { }
Lucene 4.0 Live Documents Format.
The .del file is optional, and only exists when a segment contains deletions.
Although per-segment, this file is maintained exterior to compound segment files.
Deletions (.del) --> Format,Header,ByteCount,BitCount, Bits | DGaps (depending on Format)
Format,ByteCount,BitCount --> uint32
Bits --> <byte>^ByteCount
DGaps --> <DGap,NonOnesByte>^NonOnesBytesCount
DGap --> vint
NonOnesByte --> byte
Header --> CodecHeader
Format is 1: indicates cleared DGaps.
ByteCount indicates the number of bytes in Bits. It is typically (SegSize/8)+1.
BitCount indicates the number of bits that are currently set in Bits.
Bits contains one bit for each document indexed. When the bit corresponding to a document number is cleared, that document is marked as deleted. Bit ordering is from least to most significant. Thus, if Bits contains two bytes, 0x00 and 0x02, then document 9 is marked as alive (not deleted).
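The bit layout described above can be sketched as follows. This is only an illustration: isLive and byteCount are hypothetical helpers, not part of this package's API, and they assume Bits is held in memory as a plain []byte.

```go
package main

import "fmt"

// isLive reports whether document doc is live, i.e. whether its bit is set.
// Within each byte, bits are ordered from least to most significant.
// Hypothetical helper for illustration only.
func isLive(bits []byte, doc int) bool {
	return bits[doc>>3]&(1<<uint(doc&7)) != 0
}

// byteCount returns the typical ByteCount for a segment holding segSize
// documents, following the (SegSize/8)+1 rule.
// Hypothetical helper for illustration only.
func byteCount(segSize int) int {
	return segSize/8 + 1
}

func main() {
	bits := []byte{0x00, 0x02}

	// Document 9 maps to byte 1, bit 1, which is set in 0x02.
	fmt.Println(isLive(bits, 9))

	// Document 8 maps to byte 1, bit 0, which is cleared, so it is deleted.
	fmt.Println(isLive(bits, 8))

	fmt.Println(byteCount(8000))
}
```

Shifting the document number right by three selects the byte, and the low three bits select the position inside that byte.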
DGaps represents sparse bit-vectors more efficiently than Bits. It is made of DGaps on indexes of nonOnes bytes in Bits, and the nonOnes bytes themselves. The number of nonOnes bytes in Bits (NonOnesBytesCount) is not stored.
For example, if there are 8000 bits and only bits 10,12,32 are cleared, DGaps would be used:
(vint) 1, (byte) 20, (vint) 3, (byte) 1
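A minimal sketch of how the <DGap,NonOnesByte> pairs could be derived for this example. The names dgapEntry, allLive and encodeClearedDGaps are hypothetical and not part of this package. The text above does not say whether the stored byte is the raw byte from Bits or its bitwise complement. This sketch assumes the raw byte and prints the complement alongside it: the gaps come out as 1 and 3, and the complements as 20 and 1, which are the values listed in the example.

```go
package main

import "fmt"

// dgapEntry is one <DGap,NonOnesByte> pair. Hypothetical type for illustration.
type dgapEntry struct {
	gap uint32 // distance from the previous nonOnes byte index (a vint on disk)
	b   byte   // the nonOnes byte (assumption: the raw byte taken from Bits)
}

// allLive returns n bytes with every bit set, then clears the given documents.
func allLive(n int, cleared ...int) []byte {
	bits := make([]byte, n)
	for i := range bits {
		bits[i] = 0xFF
	}
	for _, doc := range cleared {
		bits[doc>>3] &^= 1 << uint(doc&7)
	}
	return bits
}

// encodeClearedDGaps emits one entry for every byte that is not 0xFF,
// recording the index distance from the previously emitted byte.
func encodeClearedDGaps(bits []byte) []dgapEntry {
	var out []dgapEntry
	last := 0
	for i, b := range bits {
		if b != 0xFF {
			out = append(out, dgapEntry{gap: uint32(i - last), b: b})
			last = i
		}
	}
	return out
}

func main() {
	bits := allLive(1000, 10, 12, 32) // 8000 bits, three of them cleared
	for _, e := range encodeClearedDGaps(bits) {
		fmt.Printf("gap=%d byte=0x%02X complement=%d\n", e.gap, e.b, ^e.b)
	}
}
```

Only two of the 1000 bytes differ from 0xFF, so the sparse form needs two pairs instead of 1000 raw bytes.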
func (*Lucene40LiveDocsFormat) Files ¶
func (format *Lucene40LiveDocsFormat) Files(info *SegmentCommitInfo) []string
func (*Lucene40LiveDocsFormat) NewLiveDocs ¶
func (format *Lucene40LiveDocsFormat) NewLiveDocs(size int) util.MutableBits
func (*Lucene40LiveDocsFormat) WriteLiveDocs ¶
func (format *Lucene40LiveDocsFormat) WriteLiveDocs(bits util.MutableBits, dir store.Directory, info *SegmentCommitInfo, newDelCount int, ctx store.IOContext) error
type Lucene40SegmentInfoFormat ¶
type Lucene40SegmentInfoFormat struct {
// contains filtered or unexported fields
}
Lucene 4.0 Segment info format.
Files:
- .si: Header, SegVersion, SegSize, IsCompoundFile, Diagnostics, Attributes, Files
Data types:
- Header --> CodecHeader
- SegSize --> int32
- SegVersion --> string
- Files --> set[string]
- Diagnostics, Attributes --> map[string]string
- IsCompoundFile --> byte
Field Descriptions:
- SegVersion is the code version that created the segment.
- SegSize is the number of documents contained in the segment index.
- IsCompoundFile records whether the segment is written as a compound file or not. If this is -1, the segment is not a compound file. If it is 1, the segment is a compound file.
- Checksum contains the CRC32 checksum of all bytes in the segments_N file up until the checksum. This is used to verify integrity of the file on opening the index.
- The Diagnostics Map is privately written by IndexWriter, as a debugging aid, for each segment it creates. It includes metadata like the current Lucene version, OS, Java version, why the segment was created (merge, flush, addIndexes), etc.
- Attributes: a key-value map of codec-private attributes.
- Files is a list of files referred to by this segment.
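Because IsCompoundFile is stored as a single byte yet may hold -1, a reader has to interpret that byte as signed. The sketch below assumes a two's-complement encoding, so -1 appears on disk as 0xFF. decodeIsCompoundFile and the lower-case constants are hypothetical names that mirror SEGMENT_INFO_NO and SEGMENT_INFO_YES. They are not the package's own reader code.

```go
package main

import "fmt"

// Local mirrors of SEGMENT_INFO_NO and SEGMENT_INFO_YES, for illustration.
const (
	segmentInfoNo  = -1
	segmentInfoYes = 1
)

// decodeIsCompoundFile interprets the IsCompoundFile byte as a signed value.
// Any value other than -1 or 1 is reported as an error.
// Hypothetical helper for illustration only.
func decodeIsCompoundFile(b byte) (bool, error) {
	switch int8(b) {
	case segmentInfoYes:
		return true, nil
	case segmentInfoNo:
		return false, nil
	}
	return false, fmt.Errorf("unexpected IsCompoundFile byte 0x%02X", b)
}

func main() {
	for _, b := range []byte{0x01, 0xFF, 0x00} {
		compound, err := decodeIsCompoundFile(b)
		fmt.Println(compound, err)
	}
}
```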
func NewLucene40SegmentInfoFormat ¶
func NewLucene40SegmentInfoFormat() *Lucene40SegmentInfoFormat
func (*Lucene40SegmentInfoFormat) SegmentInfoReader ¶
func (f *Lucene40SegmentInfoFormat) SegmentInfoReader() SegmentInfoReader
func (*Lucene40SegmentInfoFormat) SegmentInfoWriter ¶
func (f *Lucene40SegmentInfoFormat) SegmentInfoWriter() SegmentInfoWriter
type Lucene40SegmentInfoReader ¶
type Lucene40SegmentInfoReader struct{}
type Lucene40SegmentInfoWriter ¶
type Lucene40SegmentInfoWriter struct{}