README
¶
Size of table is 123,217,667 bytes for all benchmarks.
BenchmarkRead
$ go test -bench ^BenchmarkRead$ -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkRead-16 10 154074944 ns/op
BenchmarkRead-16 10 154340411 ns/op
BenchmarkRead-16 10 151914489 ns/op
PASS
ok github.com/dgraph-io/badger/table 22.467s
Size of table is 123,217,667 bytes, which is ~118MB.
The rate is ~762MB/s using LoadToRAM (when table is in RAM).
To read a 64MB table, this would take ~0.084s, which is negligible.
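For reference, this benchmark boils down to a full forward scan with a table Iterator. A minimal sketch, assuming a hypothetical buildTestTable helper that returns an opened *Table:

func BenchmarkRead(b *testing.B) {
	tbl := buildTestTable(b) // hypothetical helper returning an opened *Table
	defer tbl.DecrRef()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		it := tbl.NewIterator(0) // 0: forward iteration, default caching
		for it.Rewind(); it.Valid(); it.Next() {
			_ = it.Value()
		}
		_ = it.Close()
	}
}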
BenchmarkReadAndBuild
$ go test -bench BenchmarkReadAndBuild -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkReadAndBuild-16 1 1026755231 ns/op
BenchmarkReadAndBuild-16 1 1009543316 ns/op
BenchmarkReadAndBuild-16 1 1039920546 ns/op
PASS
ok github.com/dgraph-io/badger/table 12.081s
The rate is ~123MB/s. To build a 64MB table, this would take ~0.56s. Note that this does NOT include flushing the table to disk. All we are doing above is reading one table (which is in RAM) and writing one table in memory.
The table building alone therefore takes ~0.56s - 0.084s ≈ 0.48s.
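The read-and-build loop is essentially the scan above feeding a Builder. A hedged sketch; the NewTableBuilder signature and the valueLen argument of 0 are assumptions:

func readAndBuild(tbl *table.Table, opts table.Options) []byte {
	b := table.NewTableBuilder(opts) // signature assumed
	defer b.Close()
	it := tbl.NewIterator(0)
	defer it.Close()
	for it.Rewind(); it.Valid(); it.Next() {
		b.Add(it.Key(), it.Value(), 0) // valueLen of 0 is an assumption
	}
	return b.Finish() // serialized table bytes; nothing is flushed to disk
}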
BenchmarkReadMerged
Below, we merge 5 tables. The total size remains unchanged at ~118MB.
$ go test -bench ReadMerged -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkReadMerged-16 2 977588975 ns/op
BenchmarkReadMerged-16 2 982140738 ns/op
BenchmarkReadMerged-16 2 962046017 ns/op
PASS
ok github.com/dgraph-io/badger/table 27.433s
The rate is ~120MB/s. To read a 64MB table using a merge iterator, this would take ~0.53s.
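Reading through a merge iterator follows the same pattern, with one iterator per table handed to NewMergeIterator. A sketch:

func readMerged(tbls []*table.Table) {
	var iters []y.Iterator
	for _, t := range tbls {
		iters = append(iters, t.NewIterator(0))
	}
	mi := table.NewMergeIterator(iters, false) // false: forward iteration
	defer mi.Close()                           // MergeIterator closes the iterators it owns
	for mi.Rewind(); mi.Valid(); mi.Next() {
		_ = mi.Value()
	}
}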
BenchmarkRandomRead
$ go test -bench BenchmarkRandomRead$ -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkRandomRead-16 500000 2645 ns/op
BenchmarkRandomRead-16 500000 2648 ns/op
BenchmarkRandomRead-16 500000 2614 ns/op
PASS
ok github.com/dgraph-io/badger/table 50.850s
For random read benchmarking, we are randomly reading a key and verifying its value.
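In code, that is a Seek to a known key followed by a value comparison. A sketch; keys and wantValues are hypothetical test fixtures (imports: bytes, fmt, math/rand):

func randomRead(tbl *table.Table, keys, wantValues [][]byte) error {
	it := tbl.NewIterator(0)
	defer it.Close()
	i := rand.Intn(len(keys))
	it.Seek(keys[i])
	if !it.Valid() || !bytes.Equal(it.Key(), keys[i]) {
		return fmt.Errorf("key %q not found", keys[i])
	}
	if got := it.Value(); !bytes.Equal(got.Value, wantValues[i]) {
		return fmt.Errorf("value mismatch for key %q", keys[i])
	}
	return nil
}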
DB Open benchmark
- Create badger DB with 2 billion key-value pairs (about 380GB of data)
badger fill -m 2000 --dir="/tmp/data" --sorted
- Clear buffers and swap memory
free -mh && sync && echo 3 | sudo tee /proc/sys/vm/drop_caches && sudo swapoff -a && sudo swapon -a && free -mh
Also flush disk buffers
blockdev --flushbufs /dev/nvme0n1p4
- Run the benchmark
go test -run=^$ github.com/dgraph-io/badger -bench ^BenchmarkDBOpen$ -benchdir="/tmp/data" -v
badger 2019/06/04 17:15:56 INFO: 126 tables out of 1028 opened in 3.017s
badger 2019/06/04 17:15:59 INFO: 257 tables out of 1028 opened in 6.014s
badger 2019/06/04 17:16:02 INFO: 387 tables out of 1028 opened in 9.017s
badger 2019/06/04 17:16:05 INFO: 516 tables out of 1028 opened in 12.025s
badger 2019/06/04 17:16:08 INFO: 645 tables out of 1028 opened in 15.013s
badger 2019/06/04 17:16:11 INFO: 775 tables out of 1028 opened in 18.008s
badger 2019/06/04 17:16:14 INFO: 906 tables out of 1028 opened in 21.003s
badger 2019/06/04 17:16:17 INFO: All 1028 tables opened in 23.851s
badger 2019/06/04 17:16:17 INFO: Replaying file id: 1998 at offset: 332000
badger 2019/06/04 17:16:17 INFO: Replay took: 9.81µs
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger
BenchmarkDBOpen-16 1 23930082140 ns/op
PASS
ok github.com/dgraph-io/badger 24.076s
It takes about 23.851s to open a DB with 2 billion sorted key-value entries.
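The benchmark itself is little more than a timed badger.Open against the pre-built directory. A hedged sketch; the -benchdir flag wiring is an assumption:

var benchDir = flag.String("benchdir", "", "directory of a pre-built Badger DB") // assumed flag

func BenchmarkDBOpen(b *testing.B) {
	for i := 0; i < b.N; i++ {
		db, err := badger.Open(badger.DefaultOptions(*benchDir))
		if err != nil {
			b.Fatal(err)
		}
		if err := db.Close(); err != nil {
			b.Fatal(err)
		}
	}
}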
Documentation
¶
Index ¶
- Constants
- Variables
- func BlockEvictHandler(value interface{})
- func IDToFilename(id uint64) string
- func NewFilename(id uint64, dir string) string
- func NewMergeIterator(iters []y.Iterator, reverse bool) y.Iterator
- func ParseFileID(name string) (uint64, bool)
- type Builder
- func (b *Builder) Add(key []byte, value y.ValueStruct, valueLen uint32)
- func (b *Builder) AddStaleKey(key []byte, v y.ValueStruct, valueLen uint32)
- func (b *Builder) Close()
- func (b *Builder) DataKey() *pb.DataKey
- func (b *Builder) Done() buildData
- func (b *Builder) Empty() bool
- func (b *Builder) Finish() []byte
- func (b *Builder) Opts() *Options
- func (b *Builder) ReachedCapacity() bool
- type ConcatIterator
- type Iterator
- func (itr *Iterator) Close() error
- func (itr *Iterator) Key() []byte
- func (itr *Iterator) Next()
- func (itr *Iterator) Rewind()
- func (itr *Iterator) Seek(key []byte)
- func (itr *Iterator) Valid() bool
- func (itr *Iterator) Value() (ret y.ValueStruct)
- func (itr *Iterator) ValueCopy() (ret y.ValueStruct)
- type MergeIterator
- type Options
- type Table
- func (t *Table) Biggest() []byte
- func (t *Table) BloomFilterSize() int
- func (t *Table) CompressionType() options.CompressionType
- func (t *Table) DecrRef() error
- func (t *Table) DoesNotHave(hash uint32) bool
- func (t *Table) Filename() string
- func (t *Table) ID() uint64
- func (t *Table) IncrRef()
- func (t *Table) IndexSize() int
- func (t *Table) KeyCount() uint32
- func (t *Table) KeyID() uint64
- func (t *Table) KeySplits(n int, prefix []byte) []string
- func (t *Table) MaxVersion() uint64
- func (t *Table) NewIterator(opt int) *Iterator
- func (t *Table) OnDiskSize() uint32
- func (t *Table) Size() int64
- func (t *Table) Smallest() []byte
- func (t *Table) StaleDataSize() uint32
- func (t *Table) UncompressedSize() uint32
- func (t *Table) VerifyChecksum() error
- type TableInterface
Constants ¶
const (
	KB = 1024
	MB = KB * 1024
)
Variables ¶
var (
	REVERSED int = 2
	NOCACHE  int = 4
)
var NumBlocks int32
Functions ¶
func BlockEvictHandler ¶
func BlockEvictHandler(value interface{})
BlockEvictHandler is used to reuse the byte slice stored in the block on cache eviction.
func IDToFilename ¶
IDToFilename does the inverse of ParseFileID
func NewFilename ¶
NewFilename should be named TableFilepath -- it combines the dir with the ID to make a table filepath.
func NewMergeIterator ¶
NewMergeIterator creates a merge iterator.
func ParseFileID ¶
ParseFileID reads the file id out of a filename.
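These functions round-trip between table IDs and filenames. For illustration; the exact on-disk name format (zero padding, .sst suffix) is an assumption:

name := table.IDToFilename(7)         // e.g. "000007.sst" (format assumed)
path := table.NewFilename(7, "/data") // joins the dir with the generated name
id, ok := table.ParseFileID(name)     // id == 7, ok == true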
Types ¶
type Builder ¶
type Builder struct {
// contains filtered or unexported fields
}
Builder is used in building a table.
func NewTableBuilder ¶
NewTableBuilder makes a new TableBuilder.
func (*Builder) Add ¶
func (b *Builder) Add(key []byte, value y.ValueStruct, valueLen uint32)
Add adds a key-value pair to the block.
func (*Builder) AddStaleKey ¶ added in v3.2103.0
func (b *Builder) AddStaleKey(key []byte, v y.ValueStruct, valueLen uint32)
AddStaleKey is the same as the Add function, but it also increments the internal staleDataSize counter. This value is used to prioritize the table for compaction.
func (*Builder) Finish ¶
Finish finishes the table by appending the index.
The table structure looks like:

+---------+------------+-----------+---------------+
| Block 1 | Block 2    | Block 3   | Block 4       |
+---------+------------+-----------+---------------+
| Block 5 | Block 6    | Block ... | Block N       |
+---------+------------+-----------+---------------+
| Index   | Index Size | Checksum  | Checksum Size |
+---------+------------+-----------+---------------+
In case the data is encrypted, the "IV" is added to the end of the index.
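The byte slice returned by Finish is the complete layout above, ready to be written out as an .sst file. A sketch, with b a *Builder and id/dir supplied by the caller:

data := b.Finish()
fname := table.NewFilename(id, dir)
if err := os.WriteFile(fname, data, 0o600); err != nil {
	return err
}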
func (*Builder) ReachedCapacity ¶
ReachedCapacity returns true if the builder has (approximately) reached its target capacity.
type ConcatIterator ¶
type ConcatIterator struct {
// contains filtered or unexported fields
}
ConcatIterator concatenates the sequences defined by several iterators. (It only works with TableIterators, probably just because it's faster to not be so generic.)
func NewConcatIterator ¶
func NewConcatIterator(tbls []*Table, opt int) *ConcatIterator
NewConcatIterator creates a new concatenated iterator
func (*ConcatIterator) Seek ¶
func (s *ConcatIterator) Seek(key []byte)
Seek brings us to element >= key if reversed is false. Otherwise, <= key.
func (*ConcatIterator) Value ¶
func (s *ConcatIterator) Value() y.ValueStruct
Value implements the y.Iterator interface.
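Putting it together, and assuming ConcatIterator satisfies the full y.Iterator interface (only Seek and Value are documented above), usage looks like:

ci := table.NewConcatIterator(tbls, 0) // tbls assumed sorted and non-overlapping
defer ci.Close()
for ci.Rewind(); ci.Valid(); ci.Next() {
	_ = ci.Value()
}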
type Iterator ¶
type Iterator struct {
// contains filtered or unexported fields
}
Iterator is an iterator for a Table.
func (*Iterator) Value ¶
func (itr *Iterator) Value() (ret y.ValueStruct)
Value follows the y.Iterator interface
func (*Iterator) ValueCopy ¶
func (itr *Iterator) ValueCopy() (ret y.ValueStruct)
ValueCopy copies the current value and returns it as decoded ValueStruct.
type MergeIterator ¶
type MergeIterator struct {
// contains filtered or unexported fields
}
MergeIterator merges multiple iterators. NOTE: MergeIterator owns the array of iterators and is responsible for closing them.
func (*MergeIterator) Key ¶
func (mi *MergeIterator) Key() []byte
Key returns the key associated with the current iterator.
func (*MergeIterator) Next ¶
func (mi *MergeIterator) Next()
Next advances the iterator to the next element. Entries with the same key as the current one are skipped.
func (*MergeIterator) Rewind ¶
func (mi *MergeIterator) Rewind()
Rewind seeks to first element (or last element for reverse iterator).
func (*MergeIterator) Seek ¶
func (mi *MergeIterator) Seek(key []byte)
Seek brings us to element with key >= given key.
func (*MergeIterator) Valid ¶
func (mi *MergeIterator) Valid() bool
Valid returns whether the MergeIterator is at a valid element.
func (*MergeIterator) Value ¶
func (mi *MergeIterator) Value() y.ValueStruct
Value returns the value associated with the iterator.
type Options ¶
type Options struct {
	// Open tables in read only mode.
	ReadOnly       bool
	MetricsEnabled bool

	// Maximum size of the table.
	TableSize uint64

	// ChkMode is the checksum verification mode for Table.
	ChkMode options.ChecksumVerificationMode

	// BloomFalsePositive is the false positive probability of bloom filter.
	BloomFalsePositive float64

	// BlockSize is the size of each block inside SSTable in bytes.
	BlockSize int

	// DataKey is the key used to decrypt the encrypted text.
	DataKey *pb.DataKey

	// Compression indicates the compression algorithm used for block compression.
	Compression options.CompressionType

	// Block cache is used to cache decompressed and decrypted blocks.
	BlockCache *ristretto.Cache
	IndexCache *ristretto.Cache
	AllocPool  *z.AllocatorPool

	// ZSTDCompressionLevel is the ZSTD compression level used for compressing blocks.
	ZSTDCompressionLevel int
	// contains filtered or unexported fields
}
Options contains configurable options for Table/Builder.
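A plausible Options value for an uncompressed, unencrypted table; the specific choices are illustrative, not recommended defaults:

opts := table.Options{
	TableSize:          64 << 20, // 64MB, matching the estimates above
	BlockSize:          4 * table.KB,
	BloomFalsePositive: 0.01,
	ChkMode:            options.OnTableAndBlockRead,
	Compression:        options.None,
}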
type Table ¶
type Table struct {
	sync.Mutex
	*z.MmapFile

	Checksum   []byte
	CreatedAt  time.Time
	IsInmemory bool // Set to true if the table is on level 0 and opened in memory.
	// contains filtered or unexported fields
}
Table represents a loaded table file with the info we have about it.
func OpenInMemoryTable ¶
OpenInMemoryTable is similar to OpenTable but it opens a new table from the provided data. OpenInMemoryTable is used for L0 tables.
func OpenTable ¶
OpenTable assumes the file has only one table and opens it. It takes ownership of fd upon function entry. It returns a table with one reference count on it (decrementing which may delete the file! -- consider t.Close() instead). The fd has to be writeable because we call Truncate on it before deleting. Checksums for all blocks of the table are verified based on the value of chkMode.
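A hedged sketch of the reference-counting contract described above; the exact OpenTable parameters are assumed:

t, err := table.OpenTable(mf, opts) // mf: the table's file; OpenTable takes ownership
if err != nil {
	return err
}
defer func() { _ = t.DecrRef() }() // dropping the last reference may delete the file

t.IncrRef() // take an extra reference before sharing t
// ... use t ...
_ = t.DecrRef() // release the extra reference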
func (*Table) BloomFilterSize ¶
BloomFilterSize returns the size of the bloom filter in bytes stored in memory.
func (*Table) CompressionType ¶
func (t *Table) CompressionType() options.CompressionType
CompressionType returns the compression algorithm used for block compression.
func (*Table) DoesNotHave ¶
DoesNotHave returns true if and only if the table does not have the key hash. It does a bloom filter lookup.
func (*Table) IncrRef ¶
func (t *Table) IncrRef()
IncrRef increments the refcount (having to do with whether the file should be deleted)
func (*Table) KeySplits ¶
KeySplits splits the table into at least n ranges based on the block offsets.
func (*Table) MaxVersion ¶
MaxVersion returns the maximum version across all keys stored in this table.
func (*Table) NewIterator ¶
NewIterator returns a new iterator of the Table
func (*Table) OnDiskSize ¶
OnDiskSize returns the total size of key-values stored in this table (including the disk space occupied on the value log).
func (*Table) StaleDataSize ¶ added in v3.2103.0
StaleDataSize is the amount of stale data (that can be dropped by a compaction) in this SST.
func (*Table) UncompressedSize ¶
UncompressedSize is the size of the uncompressed data stored in this file.
func (*Table) VerifyChecksum ¶
VerifyChecksum verifies the checksum for all blocks of the table. This function is called by OpenTable(), and also inside levelsController.VerifyChecksum().