index

package
v0.2.4
Published: Aug 27, 2022 License: Apache-2.0, MIT Imports: 14 Imported by: 0

Documentation

Constants

const BucketPrefixSize int = 4

BucketPrefixSize is how many bytes of bucket prefixes are stored.

const FileOffsetBytes int = 8

FileOffsetBytes is the byte size of the file offset

const FileSizeBytes int = 4

FileSizeBytes is the byte size of the file size

const (
	// IndexVersion is stored in the header data to indicate how to interpret
	// index data.
	IndexVersion = 3
)

const KeySizeBytes int = 1

KeySizeBytes is the size of the key length slot, a one-byte prefix.

Variables

This section is empty.

Functions

func AddKeyPosition

func AddKeyPosition(data []byte, keyPos KeyPositionPair) []byte

AddKeyPosition extends record data with an encoded key and a file offset.

The format is:

```text

|         8 bytes        |      1 byte     | Variable size < 256 bytes |
| Pointer to actual data | Size of the key |            Key            |

```
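The layout above can be illustrated with a small standalone sketch. It does not call this package: `appendRecord` is a hypothetical helper, and little-endian byte order is an assumption. It follows the table as printed; since the package also defines `FileSizeBytes`, the real encoding may carry an additional size field.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	fileOffsetBytes = 8 // mirrors FileOffsetBytes
	maxKeyLen       = 255 // the key length must fit in the one-byte slot
)

// appendRecord appends one record to data using the documented layout:
// an 8-byte offset, a 1-byte key length, then the key itself.
// Little-endian byte order is an assumption of this sketch.
func appendRecord(data []byte, offset uint64, key []byte) ([]byte, error) {
	if len(key) > maxKeyLen {
		return nil, fmt.Errorf("key too long: %d bytes", len(key))
	}
	var off [fileOffsetBytes]byte
	binary.LittleEndian.PutUint64(off[:], offset)
	data = append(data, off[:]...)
	data = append(data, byte(len(key)))
	return append(data, key...), nil
}

func main() {
	rec, err := appendRecord(nil, 1024, []byte{0xAB, 0xCD})
	if err != nil {
		panic(err)
	}
	// 8 offset bytes, 1 length byte, 2 key bytes.
	fmt.Printf("% x\n", rec) // 00 04 00 00 00 00 00 00 02 ab cd
}
```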

func EncodeKeyPosition

func EncodeKeyPosition(keyPos KeyPositionPair) []byte

EncodeKeyPosition encodes a key and an offset into a single record.

func RemoveSavedBuckets added in v0.2.0

func RemoveSavedBuckets(basePath string) error

Types

type BucketIndex

type BucketIndex uint32

BucketIndex is an index to a bucket

func ReadBucketPrefix

func ReadBucketPrefix(reader io.Reader) (BucketIndex, error)

ReadBucketPrefix reads the bucket prefix and returns it.

type Buckets

type Buckets []types.Position

Buckets contains pointers to file offsets

The indexSizeBits value passed to NewBuckets specifies how many bits are used to create the buckets. The number of buckets is 2^bits.

func NewBuckets

func NewBuckets(indexSizeBits uint8) (Buckets, error)

NewBuckets returns a list of buckets for the given index size in bits

func (Buckets) Get

func (b Buckets) Get(index BucketIndex) (types.Position, error)

Get returns the value at the given index.

func (Buckets) Put

func (b Buckets) Put(index BucketIndex, offset types.Position) error

Put updates a bucket value
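
As a rough illustration of how a bucket table of 2^bits entries with bounds-checked access might behave, here is a standalone sketch. The names `buckets`, `position`, `newBuckets`, `get` and `put` are hypothetical stand-ins, and the 32-bit cap is an assumption rather than something stated on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// position is a stand-in for types.Position.
type position uint64

// buckets maps a bucket index to a file offset.
type buckets []position

var errOutOfBounds = errors.New("bucket index out of bounds")

// newBuckets allocates 2^bits buckets. The 32-bit cap is an assumption.
func newBuckets(bits uint8) (buckets, error) {
	if bits > 32 {
		return nil, errors.New("index size cannot exceed 32 bits")
	}
	return make(buckets, uint64(1)<<bits), nil
}

// get returns the offset stored for bucket i.
func (b buckets) get(i uint32) (position, error) {
	if uint64(i) >= uint64(len(b)) {
		return 0, errOutOfBounds
	}
	return b[i], nil
}

// put stores an offset for bucket i.
func (b buckets) put(i uint32, p position) error {
	if uint64(i) >= uint64(len(b)) {
		return errOutOfBounds
	}
	b[i] = p
	return nil
}

func main() {
	b, err := newBuckets(8)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(b)) // 256 buckets for 8 bits
	if err := b.put(3, 4096); err != nil {
		panic(err)
	}
	pos, _ := b.get(3)
	fmt.Println(pos)
	_, err = b.get(256) // one past the last bucket
	fmt.Println(err != nil)
}
```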

type Header

type Header struct {
	// A version number in case we change the header
	Version int
	// The number of bits used to determine the in-memory buckets
	BucketsBits byte
	// MaxFileSize is the size limit of each index file. This cannot be greater
	// than 4GiB.
	MaxFileSize uint32
	// First index file number
	FirstFile uint32
}

Header contains information about the index. This is actually stored in a separate ".info" file, but is the first file read when the index is opened.

type Index

type Index struct {
	Primary primary.PrimaryStorage
	// contains filtered or unexported fields
}

func Open added in v0.2.0

func Open(ctx context.Context, path string, primary primary.PrimaryStorage, indexSizeBits uint8, maxFileSize uint32, gcInterval, gcTimeLimit time.Duration, gcScanUnused bool) (*Index, error)

Open opens the index for the given primary. The index is created if there is no existing index at the specified path. If there is an older version index, then it is automatically upgraded.

Specifying 0 for indexSizeBits and maxFileSize results in using their default values. A gcInterval of 0 disables garbage collection.

func (*Index) Close

func (idx *Index) Close() error

Close calls Flush to write work and data to the current index file, and then closes the file.

func (*Index) Flush

func (idx *Index) Flush() (types.Work, error)

Flush writes outstanding work and buffered data to the current index file and updates buckets.

func (*Index) Get

func (idx *Index) Get(key []byte) (types.Block, bool, error)

Get returns the file offset in the primary storage for a key.

func (*Index) OutstandingWork

func (i *Index) OutstandingWork() types.Work

func (*Index) Put

func (idx *Index) Put(key []byte, location types.Block) error

Put puts a key together with a file offset into the index.

The key needs to be a cryptographically secure hash that is at least 4 bytes long.
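
The 4-byte minimum lines up with `BucketPrefixSize`: a uniformly distributed hash spreads keys evenly across buckets. The sketch below shows one way a bucket index could be derived from the leading bytes of a key. `bucketIndex` is a hypothetical helper, and both the little-endian read and the choice to keep the lowest bits are assumptions, not confirmed behavior of this package.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
)

const bucketPrefixSize = 4 // mirrors BucketPrefixSize

// bucketIndex reads the first 4 bytes of key as a little-endian uint32 and
// keeps the lowest `bits` bits. Both choices are assumptions of this sketch.
func bucketIndex(key []byte, bits uint8) (uint32, error) {
	if len(key) < bucketPrefixSize {
		return 0, fmt.Errorf("key must be at least %d bytes, got %d", bucketPrefixSize, len(key))
	}
	if bits > 32 {
		return 0, errors.New("bits cannot exceed 32")
	}
	prefix := binary.LittleEndian.Uint32(key[:bucketPrefixSize])
	mask := uint32((uint64(1) << bits) - 1)
	return prefix & mask, nil
}

func main() {
	// A SHA-256 digest is one example of a suitable key.
	key := sha256.Sum256([]byte("hello"))
	idx, err := bucketIndex(key[:], 24)
	if err != nil {
		panic(err)
	}
	fmt.Println("bucket:", idx)
}
```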

func (*Index) Remove

func (idx *Index) Remove(key []byte) (bool, error)

Remove removes a key from the index.

func (*Index) StorageSize added in v0.1.0

func (idx *Index) StorageSize() (int64, error)

StorageSize returns bytes of storage used by the index files.

func (*Index) Sync

func (idx *Index) Sync() error

Sync commits the contents of the current index file to disk. Flush should be called before calling Sync.

func (*Index) Update

func (idx *Index) Update(key []byte, location types.Block) error

Update updates a key together with a file offset into the index.

type IndexIter

type IndexIter struct {
	// contains filtered or unexported fields
}

An iterator over index entries.

On each iteration it returns the position of the record within the index together with the raw record list data.

func NewIndexIter

func NewIndexIter(basePath string, fileNum uint32) *IndexIter

func (*IndexIter) Close added in v0.1.0

func (iter *IndexIter) Close() error

func (*IndexIter) Next

func (iter *IndexIter) Next() ([]byte, types.Position, bool, error)

type KeyPositionPair

type KeyPositionPair struct {
	Key []byte
	// The file offset, into the primary file, where the full key and its value
	// is actually stored.
	Block types.Block
}

KeyPositionPair contains a key, which is the unique prefix of the actual key, and the value which is a file offset.

type Record

type Record struct {
	// The current position (in bytes) of the record within the [`RecordList`]
	Pos int
	KeyPositionPair
}

Record is a KeyPositionPair plus the actual position of the record in the record list

func (*Record) NextPos

func (r *Record) NextPos() int

NextPos returns the position of the next record.

type RecordList

type RecordList []byte

RecordList is the main object that contains several Records. Records can be stored and retrieved.

The underlying data is a continuous range of bytes. The format is:

```text

|                  Once                  |      Repeated     |
|                                        |                   |
|                 4 bytes                | Variable size | … |
| Bit value used to determine the bucket |     Record    | … |

```
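
To make the layout concrete, the following standalone sketch walks a byte slice shaped like the table above: a 4-byte bucket prefix followed by repeated records, each assumed to use the 8-byte offset, 1-byte key length, key layout described under AddKeyPosition. The `record` type and `parseRecordList` function are hypothetical, and little-endian byte order is an assumption.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	bucketPrefixSize = 4 // mirrors BucketPrefixSize
	fileOffsetBytes  = 8 // mirrors FileOffsetBytes
	keySizeBytes     = 1 // mirrors KeySizeBytes
)

// record is a hypothetical decoded entry.
type record struct {
	pos    int    // byte position of the record after the bucket prefix
	offset uint64 // offset into the primary storage
	key    []byte // stored key prefix
}

// parseRecordList reads the bucket prefix, then decodes records until the
// data is exhausted. Byte order is an assumption of this sketch.
func parseRecordList(data []byte) (uint32, []record, error) {
	if len(data) < bucketPrefixSize {
		return 0, nil, errors.New("data shorter than bucket prefix")
	}
	bucket := binary.LittleEndian.Uint32(data[:bucketPrefixSize])
	body := data[bucketPrefixSize:]
	var recs []record
	for pos := 0; pos < len(body); {
		header := pos + fileOffsetBytes + keySizeBytes
		if header > len(body) {
			return 0, nil, errors.New("truncated record header")
		}
		offset := binary.LittleEndian.Uint64(body[pos:])
		end := header + int(body[header-1])
		if end > len(body) {
			return 0, nil, errors.New("truncated key")
		}
		recs = append(recs, record{pos: pos, offset: offset, key: body[header:end]})
		pos = end // the next record starts right after this key
	}
	return bucket, recs, nil
}

func main() {
	data := []byte{
		7, 0, 0, 0, // bucket prefix
		16, 0, 0, 0, 0, 0, 0, 0, 2, 0xAA, 0xBB, // offset 16, 2-byte key
		32, 0, 0, 0, 0, 0, 0, 0, 1, 0xCC, // offset 32, 1-byte key
	}
	bucket, recs, err := parseRecordList(data)
	if err != nil {
		panic(err)
	}
	fmt.Println("bucket:", bucket)
	for _, r := range recs {
		fmt.Printf("pos=%d offset=%d key=%x\n", r.pos, r.offset, r.key)
	}
}
```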

func NewRecordList

func NewRecordList(data []byte) RecordList

NewRecordList returns an iterable RecordList from the given byte array

func NewRecordListRaw

func NewRecordListRaw(data []byte) RecordList

NewRecordListRaw returns an iterable RecordList from the given byte array

func (RecordList) Empty

func (rl RecordList) Empty() bool

Empty returns true if the record list is empty.

func (RecordList) FindKeyPosition

func (rl RecordList) FindKeyPosition(key []byte) (pos int, prev Record, hasPrev bool)

FindKeyPosition returns the position where a key would be added.

Returns the position together with the previous record.

func (RecordList) Get

func (rl RecordList) Get(key []byte) (types.Block, bool)

Get returns the primary storage file offset for the given key.

Because the index stores only key prefixes and not the actual keys, the returned offset is a candidate match, not a guaranteed one. Once the key is retrieved from the primary storage, it must be checked against the requested key.
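
The verification step can be sketched without the package itself. In the example below, `entry`, `lookup` and `verifiedGet` are hypothetical names, and a plain map stands in for the primary storage. The point is only that a prefix hit is a candidate until the full key has been compared.

```go
package main

import (
	"bytes"
	"fmt"
)

// entry is a hypothetical index entry: a stored key prefix and an offset.
type entry struct {
	prefix []byte
	offset uint64
}

// lookup returns the offset of the first entry whose stored prefix is a
// prefix of key. A hit here is only a candidate match.
func lookup(entries []entry, key []byte) (uint64, bool) {
	for _, e := range entries {
		if bytes.HasPrefix(key, e.prefix) {
			return e.offset, true
		}
	}
	return 0, false
}

// verifiedGet confirms the candidate by comparing the full key held in the
// primary storage (modeled here as a map from offset to key).
func verifiedGet(entries []entry, primary map[uint64][]byte, key []byte) (uint64, bool) {
	off, ok := lookup(entries, key)
	if !ok || !bytes.Equal(primary[off], key) {
		return 0, false
	}
	return off, true
}

func main() {
	entries := []entry{{prefix: []byte{0xAA, 0x01}, offset: 100}}
	primary := map[uint64][]byte{100: {0xAA, 0x01, 0xFF}}

	stored := []byte{0xAA, 0x01, 0xFF}
	other := []byte{0xAA, 0x01, 0x00} // shares the prefix but was never stored

	_, hit := lookup(entries, other)
	fmt.Println("prefix hit for other key:", hit)

	_, ok := verifiedGet(entries, primary, other)
	fmt.Println("verified for other key:", ok)

	off, ok := verifiedGet(entries, primary, stored)
	fmt.Println("verified for stored key:", off, ok)
}
```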

func (RecordList) GetRecord

func (rl RecordList) GetRecord(key []byte) *Record

GetRecord returns the full record for a key in the recordList

func (RecordList) Iter

func (rl RecordList) Iter() *RecordListIter

Iter returns an iterator for a record list

func (RecordList) Len

func (rl RecordList) Len() int

Len returns the byte length of the record list.

func (RecordList) PutKeys

func (rl RecordList) PutKeys(keys []KeyPositionPair, start int, end int) []byte

PutKeys puts keys at a certain position and returns the new data

This method puts a continuous range of keys inside the data structure. The given range is where it is put. *This means that you can also overwrite existing keys.*

This is needed if you insert a new key that fully contains an existing key. The existing key needs to be replaced by one with a larger prefix, so that it is distinguishable from the new key.
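
The prefix-extension idea can be illustrated with a standalone sketch: given the full existing key and the new key, find the shortest prefixes that tell them apart. `distinguishingPrefixes` is a hypothetical helper. How the package actually chooses prefix lengths is not stated here, so treat this as one plausible approach.

```go
package main

import (
	"errors"
	"fmt"
)

// distinguishingPrefixes returns the shortest prefixes of a and b that
// differ, that is, both keys cut just after their first differing byte.
func distinguishingPrefixes(a, b []byte) ([]byte, []byte, error) {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return a[:i+1], b[:i+1], nil
		}
	}
	return nil, nil, errors.New("one key is a prefix of the other")
}

func main() {
	// Suppose the index holds the 1-byte prefix 0xAA for the existing key.
	existing := []byte{0xAA, 0x01, 0x02, 0x03}
	// The new key also starts with 0xAA, so the stored prefix is ambiguous.
	incoming := []byte{0xAA, 0x01, 0x09, 0x04}

	oldPrefix, newPrefix, err := distinguishingPrefixes(existing, incoming)
	if err != nil {
		panic(err)
	}
	// Both entries would be written with these longer prefixes.
	fmt.Printf("existing: %x\n", oldPrefix)
	fmt.Printf("incoming: %x\n", newPrefix)
}
```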

func (RecordList) ReadRecord

func (rl RecordList) ReadRecord(pos int) Record

ReadRecord reads a record from a slice at the given position.

The given position must point to the first byte where the record starts.

type RecordListIter

type RecordListIter struct {
	// contains filtered or unexported fields
}

RecordListIter provides an easy mechanism to iterate a record list

func (*RecordListIter) Done

func (rli *RecordListIter) Done() bool

Done indicates whether there are more records to read

func (*RecordListIter) Next

func (rli *RecordListIter) Next() Record

Next returns the next record in the list
