table

package
v2.0.1
Published: Nov 22, 2019 License: Apache-2.0 Imports: 23 Imported by: 0

README

Size of table is 123,217,667 bytes for all benchmarks.

BenchmarkRead

$ go test -bench ^BenchmarkRead$ -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkRead-16    	      10	 154074944 ns/op
BenchmarkRead-16    	      10	 154340411 ns/op
BenchmarkRead-16    	      10	 151914489 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	22.467s

Size of table is 123,217,667 bytes, which is ~118MB.

The rate is ~762MB/s using LoadToRAM (when table is in RAM).

To read a 64MB table, this would take ~0.084s, which is negligible.
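
The arithmetic behind these figures can be reproduced with a short sketch. It assumes "MB" above means 2^20 bytes and uses the mean of the three runs; the helper names are illustrative and not part of this package. Depending on rounding, the result lands close to, but not exactly on, the quoted rate.

```go
package main

import "fmt"

// mibPerSec converts a table size in bytes and a benchmark ns/op figure
// into a throughput in MiB per second.
func mibPerSec(sizeBytes int64, nsPerOp float64) float64 {
	const mib = 1 << 20
	return (float64(sizeBytes) / mib) / (nsPerOp / 1e9)
}

// secondsFor estimates how long a table of sizeMiB takes at the given rate.
func secondsFor(sizeMiB, rate float64) float64 {
	return sizeMiB / rate
}

func main() {
	// ns/op values copied from the BenchmarkRead output above.
	runs := []float64{154074944, 154340411, 151914489}
	var sum float64
	for _, r := range runs {
		sum += r
	}
	rate := mibPerSec(123217667, sum/float64(len(runs)))
	fmt.Printf("read rate: %.0f MiB/s\n", rate)
	fmt.Printf("64 MiB table: %.3f s\n", secondsFor(64, rate))
}
```

The same two helpers apply to the BenchmarkReadAndBuild and BenchmarkReadMerged numbers below by swapping in their ns/op values.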

BenchmarkReadAndBuild

$ go test -bench BenchmarkReadAndBuild -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkReadAndBuild-16    	       1	1026755231 ns/op
BenchmarkReadAndBuild-16    	       1	1009543316 ns/op
BenchmarkReadAndBuild-16    	       1	1039920546 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	12.081s

The rate is ~123MB/s. To build a 64MB table, this would take ~0.56s. Note that this does NOT include flushing the table to disk. All we are doing above is reading one table (which is in RAM) and writing one table in memory.

Subtracting the read time, table building alone takes about 0.56s - 0.084s ≈ 0.48s.

BenchmarkReadMerged

Below, we merge 5 tables. The total size remains unchanged at ~122M.

$ go test -bench ReadMerged -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkReadMerged-16    	       2	 977588975 ns/op
BenchmarkReadMerged-16    	       2	 982140738 ns/op
BenchmarkReadMerged-16    	       2	 962046017 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	27.433s

The rate is ~120MB/s. To read a 64MB table using merge iterator, this would take ~0.53s.

BenchmarkRandomRead

$ go test -bench BenchmarkRandomRead$ -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkRandomRead-16    	  500000	      2645 ns/op
BenchmarkRandomRead-16    	  500000	      2648 ns/op
BenchmarkRandomRead-16    	  500000	      2614 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	50.850s

For random read benchmarking, we are randomly reading a key and verifying its value.

DB Open benchmark

  1. Create badger DB with 2 billion key-value pairs (about 380GB of data)
$ badger fill -m 2000 --dir="/tmp/data" --sorted
  2. Clear buffers and swap memory
$ free -mh && sync && echo 3 | sudo tee /proc/sys/vm/drop_caches && sudo swapoff -a && sudo swapon -a && free -mh

Also flush disk buffers

$ blockdev --flushbufs /dev/nvme0n1p4
  3. Run the benchmark
$ go test -run=^$ github.com/dgraph-io/badger -bench ^BenchmarkDBOpen$ -benchdir="/tmp/data" -v

badger 2019/06/04 17:15:56 INFO: 126 tables out of 1028 opened in 3.017s
badger 2019/06/04 17:15:59 INFO: 257 tables out of 1028 opened in 6.014s
badger 2019/06/04 17:16:02 INFO: 387 tables out of 1028 opened in 9.017s
badger 2019/06/04 17:16:05 INFO: 516 tables out of 1028 opened in 12.025s
badger 2019/06/04 17:16:08 INFO: 645 tables out of 1028 opened in 15.013s
badger 2019/06/04 17:16:11 INFO: 775 tables out of 1028 opened in 18.008s
badger 2019/06/04 17:16:14 INFO: 906 tables out of 1028 opened in 21.003s
badger 2019/06/04 17:16:17 INFO: All 1028 tables opened in 23.851s
badger 2019/06/04 17:16:17 INFO: Replaying file id: 1998 at offset: 332000
badger 2019/06/04 17:16:17 INFO: Replay took: 9.81µs
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger
BenchmarkDBOpen-16    	       1	23930082140 ns/op
PASS
ok  	github.com/dgraph-io/badger	24.076s

It takes about 23.851s to open a DB with 2 billion sorted key-value entries.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IDToFilename

func IDToFilename(id uint64) string

IDToFilename does the inverse of ParseFileID

func NewFilename

func NewFilename(id uint64, dir string) string

NewFilename should be named TableFilepath -- it combines the dir with the ID to make a table filepath.

func NewMergeIterator

func NewMergeIterator(iters []y.Iterator, reverse bool) y.Iterator

NewMergeIterator creates a merge iterator.

func ParseFileID

func ParseFileID(name string) (uint64, bool)

ParseFileID reads the file id out of a filename.
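
The relationship between IDToFilename, NewFilename and ParseFileID can be sketched with the standard library alone. The lowercase functions below are illustrative stand-ins, not the package's code, and the zero-padded six-digit ID and the ".sst" suffix are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

const fileSuffix = ".sst" // assumed table file suffix

// idToFilename is an illustrative stand-in for IDToFilename.
func idToFilename(id uint64) string {
	return fmt.Sprintf("%06d", id) + fileSuffix
}

// newFilename is an illustrative stand-in for NewFilename: dir joined with the file name.
func newFilename(id uint64, dir string) string {
	return filepath.Join(dir, idToFilename(id))
}

// parseFileID is an illustrative stand-in for ParseFileID. The bool reports
// whether name looked like a table file at all.
func parseFileID(name string) (uint64, bool) {
	name = filepath.Base(name)
	if !strings.HasSuffix(name, fileSuffix) {
		return 0, false
	}
	id, err := strconv.ParseUint(strings.TrimSuffix(name, fileSuffix), 10, 64)
	if err != nil {
		return 0, false
	}
	return id, true
}

func main() {
	path := newFilename(42, "/tmp/data")
	id, ok := parseFileID(path)
	fmt.Println(path, id, ok)
}
```

Returning a bool rather than an error keeps directory scans simple: files that are not tables are skipped without special handling.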

Types

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder is used in building a table.

func NewTableBuilder

func NewTableBuilder(opts Options) *Builder

NewTableBuilder makes a new TableBuilder.

func (*Builder) Add

func (b *Builder) Add(key []byte, value y.ValueStruct)

Add adds a key-value pair to the block.

func (*Builder) Close

func (b *Builder) Close()

Close closes the TableBuilder.

func (*Builder) DataKey

func (b *Builder) DataKey() *pb.DataKey

DataKey returns datakey of the builder.

func (*Builder) Empty

func (b *Builder) Empty() bool

Empty returns whether it's empty.

func (*Builder) Finish

func (b *Builder) Finish() []byte

Finish finishes the table by appending the index.

The table structure looks like

+---------+------------+-----------+---------------+
| Block 1 | Block 2    | Block 3   | Block 4       |
+---------+------------+-----------+---------------+
| Block 5 | Block 6    | Block ... | Block N       |
+---------+------------+-----------+---------------+
| Index   | Index Size | Checksum  | Checksum Size |
+---------+------------+-----------+---------------+

In case the data is encrypted, the "IV" is added to the end of the index.
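
Because the size fields follow the data they describe, a reader can recover the index and checksum by walking backwards from the end of the file. The sketch below shows that idea only. It assumes each size field is a 4-byte big-endian integer, ignores the encrypted IV case, and uses hypothetical helper names that are not part of this package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// u32 encodes n as a 4-byte big-endian length field (an assumed encoding).
func u32(n int) []byte {
	var b [4]byte
	binary.BigEndian.PutUint32(b[:], uint32(n))
	return b[:]
}

// appendFooter lays out: blocks | index | index size | checksum | checksum size.
func appendFooter(blocks, index, checksum []byte) []byte {
	out := append([]byte{}, blocks...)
	out = append(out, index...)
	out = append(out, u32(len(index))...)
	out = append(out, checksum...)
	out = append(out, u32(len(checksum))...)
	return out
}

// splitFooter reads the footer from the tail of the file towards the front.
func splitFooter(file []byte) (index, checksum []byte, err error) {
	pos := len(file)
	// take removes n bytes from the unread tail and returns them.
	take := func(n int) ([]byte, error) {
		if n < 0 || n > pos {
			return nil, errors.New("table too short")
		}
		pos -= n
		return file[pos : pos+n], nil
	}
	takeLen := func() (int, error) {
		b, err := take(4)
		if err != nil {
			return 0, err
		}
		return int(binary.BigEndian.Uint32(b)), nil
	}
	n, err := takeLen() // checksum size
	if err != nil {
		return nil, nil, err
	}
	if checksum, err = take(n); err != nil {
		return nil, nil, err
	}
	if n, err = takeLen(); err != nil { // index size
		return nil, nil, err
	}
	if index, err = take(n); err != nil {
		return nil, nil, err
	}
	return index, checksum, nil
}

func main() {
	file := appendFooter([]byte("block-data"), []byte("index"), []byte("crc"))
	index, checksum, err := splitFooter(file)
	fmt.Println(string(index), string(checksum), err)
}
```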

func (*Builder) ReachedCapacity

func (b *Builder) ReachedCapacity(cap int64) bool

ReachedCapacity returns true if the estimated size of the table built so far has roughly reached cap. The check is approximate.

type ConcatIterator

type ConcatIterator struct {
	// contains filtered or unexported fields
}

ConcatIterator concatenates the sequences defined by several iterators. (It only works with TableIterators, probably just because it's faster to not be so generic.)

func NewConcatIterator

func NewConcatIterator(tbls []*Table, reversed bool) *ConcatIterator

NewConcatIterator creates a new concatenated iterator

func (*ConcatIterator) Close

func (s *ConcatIterator) Close() error

Close implements y.Iterator.

func (*ConcatIterator) Key

func (s *ConcatIterator) Key() []byte

Key implements y.Iterator.

func (*ConcatIterator) Next

func (s *ConcatIterator) Next()

Next advances our concat iterator.

func (*ConcatIterator) Rewind

func (s *ConcatIterator) Rewind()

Rewind implements y.Iterator.

func (*ConcatIterator) Seek

func (s *ConcatIterator) Seek(key []byte)

Seek brings us to element >= key if reversed is false. Otherwise, <= key.

func (*ConcatIterator) Valid

func (s *ConcatIterator) Valid() bool

Valid implements y.Iterator.

func (*ConcatIterator) Value

func (s *ConcatIterator) Value() y.ValueStruct

Value implements y.Iterator.
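
The Seek contract of a concatenating iterator (first key >= target when moving forward, last key <= target when reversed) can be illustrated with binary search over tables whose key ranges are sorted and do not overlap. This is a sketch under those assumptions, using a hypothetical table type with plain string keys; it is not the package's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// table is a hypothetical stand-in holding a non-empty, sorted key list.
type table struct{ keys []string }

func (t table) smallest() string { return t.keys[0] }
func (t table) biggest() string  { return t.keys[len(t.keys)-1] }

// seek returns the first key >= target (forward) or the last key <= target
// (reversed). tbls must be ordered by key range with no overlap.
func seek(tbls []table, target string, reversed bool) (string, bool) {
	if !reversed {
		// First table whose biggest key is >= target.
		i := sort.Search(len(tbls), func(i int) bool { return tbls[i].biggest() >= target })
		if i == len(tbls) {
			return "", false
		}
		k := tbls[i].keys
		return k[sort.SearchStrings(k, target)], true
	}
	// Last table whose smallest key is <= target.
	i := sort.Search(len(tbls), func(i int) bool { return tbls[i].smallest() > target }) - 1
	if i < 0 {
		return "", false
	}
	k := tbls[i].keys
	j := sort.Search(len(k), func(j int) bool { return k[j] > target }) - 1
	return k[j], true
}

func main() {
	tbls := []table{
		{keys: []string{"a", "c"}},
		{keys: []string{"e", "g"}},
		{keys: []string{"k", "m"}},
	}
	fmt.Println(seek(tbls, "d", false))
	fmt.Println(seek(tbls, "d", true))
}
```

Searching the table boundaries first means only one table has to be consulted per Seek, which is one plausible reason to specialise on tables rather than accept arbitrary iterators.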

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator is an iterator for a Table.

func (*Iterator) Close

func (itr *Iterator) Close() error

Close closes the iterator (and it must be called).

func (*Iterator) Key

func (itr *Iterator) Key() []byte

Key follows the y.Iterator interface. Returns the key with timestamp.

func (*Iterator) Next

func (itr *Iterator) Next()

Next follows the y.Iterator interface

func (*Iterator) Rewind

func (itr *Iterator) Rewind()

Rewind follows the y.Iterator interface

func (*Iterator) Seek

func (itr *Iterator) Seek(key []byte)

Seek follows the y.Iterator interface

func (*Iterator) Valid

func (itr *Iterator) Valid() bool

Valid follows the y.Iterator interface

func (*Iterator) Value

func (itr *Iterator) Value() (ret y.ValueStruct)

Value follows the y.Iterator interface

func (*Iterator) ValueCopy

func (itr *Iterator) ValueCopy() (ret y.ValueStruct)

ValueCopy copies the current value and returns it as decoded ValueStruct.

type MergeIterator

type MergeIterator struct {
	// contains filtered or unexported fields
}

MergeIterator merges multiple iterators. NOTE: MergeIterator owns the array of iterators and is responsible for closing them.

func (*MergeIterator) Close

func (mi *MergeIterator) Close() error

Close implements y.Iterator.

func (*MergeIterator) Key

func (mi *MergeIterator) Key() []byte

Key returns the key associated with the current iterator.

func (*MergeIterator) Next

func (mi *MergeIterator) Next()

Next advances to the next element, skipping any element whose key is the same as the current key.

func (*MergeIterator) Rewind

func (mi *MergeIterator) Rewind()

Rewind seeks to first element (or last element for reverse iterator).

func (*MergeIterator) Seek

func (mi *MergeIterator) Seek(key []byte)

Seek brings us to element with key >= given key.

func (*MergeIterator) Valid

func (mi *MergeIterator) Valid() bool

Valid returns whether the MergeIterator is at a valid element.

func (*MergeIterator) Value

func (mi *MergeIterator) Value() y.ValueStruct

Value returns the value associated with the iterator.
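
The merge behaviour described above (several sorted inputs combined into one sorted stream, with repeated keys surfaced once) can be sketched over plain slices. The kv type and mergeSorted function are hypothetical, and letting the earliest input win on a duplicate key is an assumption of this sketch rather than a documented guarantee.

```go
package main

import "fmt"

// kv is a hypothetical key-value pair used only for this sketch.
type kv struct{ key, val string }

// mergeSorted merges sorted inputs into one sorted slice, emitting each key
// once. For reverse=true the inputs must be sorted in descending order.
func mergeSorted(inputs [][]kv, reverse bool) []kv {
	pos := make([]int, len(inputs)) // read position per input
	before := func(a, b string) bool {
		if reverse {
			return a > b
		}
		return a < b
	}
	var out []kv
	for {
		// Pick the input whose current key comes first; ties keep the earlier input.
		best := -1
		for i, in := range inputs {
			if pos[i] == len(in) {
				continue
			}
			if best == -1 || before(in[pos[i]].key, inputs[best][pos[best]].key) {
				best = i
			}
		}
		if best == -1 {
			return out // every input is exhausted
		}
		cur := inputs[best][pos[best]]
		out = append(out, cur)
		// Advance every input past this key so duplicates are skipped.
		for i, in := range inputs {
			for pos[i] < len(in) && in[pos[i]].key == cur.key {
				pos[i]++
			}
		}
	}
}

func main() {
	newer := []kv{{"a", "new"}, {"c", "new"}}
	older := []kv{{"a", "old"}, {"b", "old"}}
	fmt.Println(mergeSorted([][]kv{newer, older}, false))
}
```

A linear scan over the inputs keeps the sketch short; with many inputs a heap or a tree of two-way merges would reduce the work per step.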

type Options

type Options struct {

	// ChkMode is the checksum verification mode for Table.
	ChkMode options.ChecksumVerificationMode

	// LoadingMode is the mode to be used for loading Table.
	LoadingMode options.FileLoadingMode

	// BloomFalsePositive is the false positive probability of the bloom filter.
	BloomFalsePositive float64

	// BlockSize is the size of each block inside SSTable in bytes.
	BlockSize int

	// DataKey is the key used to decrypt the encrypted text.
	DataKey *pb.DataKey

	// Compression indicates the compression algorithm used for block compression.
	Compression options.CompressionType

	Cache *ristretto.Cache
}

Options contains configurable options for Table/Builder.

type Table

type Table struct {
	sync.Mutex

	Checksum []byte

	IsInmemory bool // Set to true if the table is on level 0 and opened in memory.
	// contains filtered or unexported fields
}

Table represents a loaded table file with the info we have about it

func OpenInMemoryTable

func OpenInMemoryTable(data []byte, id uint64, opt *Options) (*Table, error)

OpenInMemoryTable is similar to OpenTable but it opens a new table from the provided data. OpenInMemoryTable is used for L0 tables.

func OpenTable

func OpenTable(fd *os.File, opts Options) (*Table, error)

OpenTable assumes the file has only one table and opens it. It takes ownership of fd upon function entry and returns a table with one reference count on it (decrementing which may delete the file! -- consider t.Close() instead). The fd has to be writeable because we call Truncate on it before deleting. The checksum for all blocks of the table is verified based on the value of opts.ChkMode.

func (*Table) Biggest

func (t *Table) Biggest() []byte

Biggest is its biggest key, or nil if there are none

func (*Table) Close

func (t *Table) Close() error

Close closes the open table. (Releases resources back to the OS.)

func (*Table) CompressionType

func (t *Table) CompressionType() options.CompressionType

CompressionType returns the compression algorithm used for block compression.

func (*Table) DecrRef

func (t *Table) DecrRef() error

DecrRef decrements the refcount and possibly deletes the table

func (*Table) DoesNotHave

func (t *Table) DoesNotHave(hash uint64) bool

DoesNotHave returns true only if the table definitely does not have the key hash; a false result means the key hash may or may not be present. It does a bloom filter lookup.
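
A bloom filter can prove that a key was never added, but it cannot prove that one was, which is why the lookup is phrased as a negative. The toy filter below illustrates that asymmetry. It is not the filter this package uses; the type, the double-hashing scheme and the FNV key hash are all assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy Bloom filter over 64-bit key hashes.
type bloom struct {
	bits   []uint64
	probes uint64 // number of bit positions checked per hash
}

func newBloom(nbits int, probes uint64) *bloom {
	return &bloom{bits: make([]uint64, (nbits+63)/64), probes: probes}
}

// position derives the i-th bit position from one hash (double hashing).
func (b *bloom) position(hash, i uint64) (word int, mask uint64) {
	step := (hash >> 32) | 1
	bit := ((hash & 0xffffffff) + i*step) % uint64(len(b.bits)*64)
	return int(bit / 64), uint64(1) << (bit % 64)
}

func (b *bloom) add(hash uint64) {
	for i := uint64(0); i < b.probes; i++ {
		w, m := b.position(hash, i)
		b.bits[w] |= m
	}
}

// doesNotHave returns true only when some probed bit is unset, which proves
// the hash was never added. A false result means "possibly present".
func (b *bloom) doesNotHave(hash uint64) bool {
	for i := uint64(0); i < b.probes; i++ {
		w, m := b.position(hash, i)
		if b.bits[w]&m == 0 {
			return true
		}
	}
	return false
}

// keyHash is an illustrative hash; the real package supplies its own.
func keyHash(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func main() {
	f := newBloom(1024, 4)
	f.add(keyHash("apple"))
	fmt.Println(f.doesNotHave(keyHash("apple")))  // false: the key was added
	fmt.Println(f.doesNotHave(keyHash("cherry"))) // usually true; a false positive is possible
}
```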

func (*Table) Filename

func (t *Table) Filename() string

Filename is NOT the file name. Just kidding, it is.

func (*Table) ID

func (t *Table) ID() uint64

ID is the table's ID number (used to make the file name).

func (*Table) IncrRef

func (t *Table) IncrRef()

IncrRef increments the refcount (having to do with whether the file should be deleted)
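
The IncrRef and DecrRef pair follows a common reference-counting pattern: the object starts with one reference, and whoever drops the count to zero triggers cleanup. A minimal sketch, assuming an atomic counter and a hypothetical onDelete callback in place of the real file deletion:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCounted is a hypothetical stand-in for a table that deletes its file
// once nobody references it.
type refCounted struct {
	ref      int32
	onDelete func() error // assumed cleanup hook, for example removing the file
}

// newRefCounted returns an object holding one reference, as OpenTable is
// documented to do for tables.
func newRefCounted(onDelete func() error) *refCounted {
	return &refCounted{ref: 1, onDelete: onDelete}
}

func (r *refCounted) IncrRef() { atomic.AddInt32(&r.ref, 1) }

// DecrRef drops one reference and runs cleanup when the count reaches zero.
func (r *refCounted) DecrRef() error {
	if atomic.AddInt32(&r.ref, -1) == 0 {
		return r.onDelete()
	}
	return nil
}

func main() {
	t := newRefCounted(func() error {
		fmt.Println("deleting table file")
		return nil
	})
	t.IncrRef()              // an iterator starts using the table
	fmt.Println(t.DecrRef()) // the iterator is done; the table stays
	fmt.Println(t.DecrRef()) // the owner lets go; cleanup runs
}
```

This is also why Close is suggested when the file should be kept: dropping the last reference is what may delete it.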

func (*Table) KeyID

func (t *Table) KeyID() uint64

KeyID returns data key id.

func (*Table) NewIterator

func (t *Table) NewIterator(reversed bool) *Iterator

NewIterator returns a new iterator of the Table

func (*Table) Size

func (t *Table) Size() int64

Size is its file size in bytes

func (*Table) Smallest

func (t *Table) Smallest() []byte

Smallest is its smallest key, or nil if there are none

func (*Table) VerifyChecksum

func (t *Table) VerifyChecksum() error

VerifyChecksum verifies the checksum for all blocks of the table. It is called by OpenTable() and also inside levelsController.VerifyChecksum().

type TableInterface

type TableInterface interface {
	Smallest() []byte
	Biggest() []byte
	DoesNotHave(hash uint64) bool
}

TableInterface is useful for testing.
