README ¶
BenchmarkRead
$ go test -bench Read$ -count 3
Size of table: 105843444
BenchmarkRead-8 3 343846914 ns/op
BenchmarkRead-8 3 351790907 ns/op
BenchmarkRead-8 3 351762823 ns/op
Size of table is 105,843,444 bytes, which is ~101M.
The rate is ~287M/s which matches our read speed. This is using mmap.
To read a 64M table, this would take ~0.22s, which is negligible.
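The arithmetic behind these figures can be reproduced with a small sketch. It assumes "M" means MiB (2^20 bytes), which is what makes 105,843,444 bytes come out at ~101M. The helper names `mibPerSec` and `secondsFor` are made up for this illustration and are not part of the package.

```go
package main

import "fmt"

// mibPerSec converts a benchmark result (bytes processed per op and
// nanoseconds per op) into a throughput in MiB/s.
func mibPerSec(bytesPerOp int64, nsPerOp float64) float64 {
	const mib = 1 << 20
	return (float64(bytesPerOp) / mib) / (nsPerOp / 1e9)
}

// secondsFor estimates how long sizeMiB of data takes at rate MiB/s.
func secondsFor(sizeMiB, rate float64) float64 {
	return sizeMiB / rate
}

func main() {
	const tableBytes = 105843444 // "Size of table" from the benchmark output
	runs := []float64{343846914, 351790907, 351762823} // ns/op of the three runs

	var sum float64
	for _, ns := range runs {
		sum += ns
	}
	rate := mibPerSec(tableBytes, sum/float64(len(runs)))

	fmt.Printf("read rate: %.0f MiB/s\n", rate)
	fmt.Printf("64 MiB table: %.2f s\n", secondsFor(64, rate))
}
```

Averaging the three runs gives a rate within a few MiB/s of the rounded ~287M/s quoted above.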
$ go test -bench BenchmarkReadAndBuild -count 3
BenchmarkReadAndBuild-8 1 2341034225 ns/op
BenchmarkReadAndBuild-8 1 2346349671 ns/op
BenchmarkReadAndBuild-8 1 2364064576 ns/op
The rate is ~43M/s. To build a ~64M table, this would take ~1.5s. Note that this does NOT include flushing the table to disk. All we do above is read one table (mmapped) and write one table in memory.
Subtracting the read time, the table building itself takes 1.5-0.22 ~ 1.3s.
If we are writing out up to 10 tables, this would take 1.5*10 ~ 15s, and ~13s is spent building the tables.
When running populate, building one table in memory tends to take ~1.5s to ~2.5s on my system. Where does this overhead come from? Let's investigate the merging.
Below, we merge 5 tables. The total size remains unchanged at ~101M.
$ go test -bench ReadMerged -count 3
BenchmarkReadMerged-8 1 1321190264 ns/op
BenchmarkReadMerged-8 1 1296958737 ns/op
BenchmarkReadMerged-8 1 1314381178 ns/op
The rate is ~76M/s. To read and merge 64M worth of data, this would take ~0.84s. The writing (building the table in memory) takes ~1.3s as we saw above. So in total, we expect around 0.84+1.3 ~ 2.1s. This roughly matches what we observe when running populate. There might be some additional overhead from the concurrent writes that flush tables to disk. Also, the tables tend to be slightly bigger than 64M.
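As a rough sketch of how the per-table estimate is composed, the snippet below takes the three rounded rates from the text as inputs and splits the time into a merge-read part and a build part. The function name `estimate` and its parameters are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// estimate splits the expected time for one table of tableMiB into the
// in-memory build time and the merged-read time. All rates are in MiB/s.
func estimate(tableMiB, readRate, readBuildRate, mergeRate float64) (build, merge, total float64) {
	// Read-and-build time minus plain read time leaves the build time.
	build = tableMiB/readBuildRate - tableMiB/readRate
	// Reading through the merged iterator replaces the plain read.
	merge = tableMiB / mergeRate
	return build, merge, build + merge
}

func main() {
	// Rates quoted in the text: BenchmarkRead, BenchmarkReadAndBuild,
	// BenchmarkReadMerged.
	build, merge, total := estimate(64, 287, 43, 76)
	fmt.Printf("build: %.2fs merge: %.2fs total: %.2fs\n", build, merge, total)
}
```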
Documentation ¶
Index ¶
- func IDToFilename(id uint64) string
- func NewFilename(id uint64, dir string) string
- func ParseFileID(name string) (uint64, bool)
- type Builder
- type ConcatIterator
- type Iterator
- type Table
- func (t *Table) Biggest() []byte
- func (t *Table) Close() error
- func (t *Table) DecrRef() error
- func (t *Table) DoesNotHave(key []byte) bool
- func (t *Table) Filename() string
- func (t *Table) ID() uint64
- func (t *Table) IncrRef()
- func (t *Table) NewIterator(reversed bool) *Iterator
- func (t *Table) Size() int64
- func (t *Table) Smallest() []byte
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func IDToFilename ¶
IDToFilename does the inverse of ParseFileID: it turns a file ID into the corresponding table filename.
func NewFilename ¶
NewFilename combines dir with the ID to make a table file path. (TableFilepath would be a more accurate name.)
func ParseFileID ¶
ParseFileID reads the file id out of a filename.
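The relationship between the three functions can be illustrated with self-contained stand-ins. This is only a sketch: the lowercase names are hypothetical re-implementations, and the zero-padded six-digit name and the `.sst` suffix are assumptions that this page does not confirm.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// fileSuffix is an assumed table file suffix (not confirmed by this page).
const fileSuffix = ".sst"

// idToFilename formats an ID as a file name (assumed zero-padded format).
func idToFilename(id uint64) string {
	return fmt.Sprintf("%06d", id) + fileSuffix
}

// parseFileID is the inverse: it extracts the ID from a file name or path
// and reports false when the name does not look like a table file.
func parseFileID(name string) (uint64, bool) {
	name = filepath.Base(name)
	if !strings.HasSuffix(name, fileSuffix) {
		return 0, false
	}
	id, err := strconv.ParseUint(strings.TrimSuffix(name, fileSuffix), 10, 64)
	if err != nil {
		return 0, false
	}
	return id, true
}

// newFilename joins a directory with the file name for the given ID.
func newFilename(id uint64, dir string) string {
	return filepath.Join(dir, idToFilename(id))
}

func main() {
	path := newFilename(42, "data")
	id, ok := parseFileID(path)
	fmt.Println(path, id, ok)
}
```

The property worth preserving in any such scheme is the round trip: parsing the name produced for an ID yields that same ID.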
Types ¶
type Builder ¶
type Builder struct {
// contains filtered or unexported fields
}
Builder is used in building a table.
func (*Builder) Add ¶
func (b *Builder) Add(key []byte, value y.ValueStruct) error
Add adds a key-value pair to the block. The comment also mentions a doNotRestart flag (when true, no restart happens even if b.counter >= restartInterval), but the signature above takes no such parameter.
func (*Builder) ReachedCapacity ¶
ReachedCapacity reports whether the builder has roughly reached its capacity. The check is approximate.
type ConcatIterator ¶
type ConcatIterator struct {
// contains filtered or unexported fields
}
ConcatIterator concatenates the sequences defined by several iterators. (It only works with TableIterators, probably just because it's faster to not be so generic.)
func NewConcatIterator ¶
func NewConcatIterator(tbls []*Table, reversed bool) *ConcatIterator
NewConcatIterator creates a new concatenated iterator over the given tables.
func (*ConcatIterator) Seek ¶
func (s *ConcatIterator) Seek(key []byte)
Seek brings us to element >= key if reversed is false. Otherwise, <= key.
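The Seek contract can be sketched on a plain sorted slice of keys. This is an illustration of the semantics only, not the package's implementation. The function `seek` is hypothetical and returns an index, or -1 when no element qualifies.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// seek returns the index an iterator would land on after Seek(key):
// the first element >= key when reversed is false, or the last element
// <= key when reversed is true. It returns -1 if no such element exists.
// keys must be sorted in ascending byte order.
func seek(keys [][]byte, key []byte, reversed bool) int {
	// idx is the first position whose key is >= key (len(keys) if none).
	idx := sort.Search(len(keys), func(i int) bool {
		return bytes.Compare(keys[i], key) >= 0
	})
	if !reversed {
		if idx == len(keys) {
			return -1
		}
		return idx
	}
	// Reversed: an exact match qualifies, otherwise step back by one.
	if idx < len(keys) && bytes.Equal(keys[idx], key) {
		return idx
	}
	return idx - 1 // -1 when every key is greater than key
}

func main() {
	keys := [][]byte{[]byte("a"), []byte("c"), []byte("e")}
	fmt.Println(seek(keys, []byte("b"), false), seek(keys, []byte("b"), true))
}
```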
func (*ConcatIterator) Value ¶
func (s *ConcatIterator) Value() y.ValueStruct
Value implements the y.Iterator interface.
type Iterator ¶
type Iterator struct {
// contains filtered or unexported fields
}
Iterator is an iterator for a Table.
func (*Iterator) Value ¶
func (itr *Iterator) Value() (ret y.ValueStruct)
Value follows the y.Iterator interface
type Table ¶
Table represents a loaded table file with the info we have about it
func OpenTable ¶
OpenTable assumes file has only one table and opens it. Takes ownership of fd upon function entry. Returns a table with one reference count on it (decrementing which may delete the file! -- consider t.Close() instead). The fd has to be writable because we call Truncate on it before deleting.
func (*Table) DoesNotHave ¶
DoesNotHave returns true if (but not "only if") the table does not have the key. It does a bloom filter lookup.
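The "if, but not only if" wording reflects the one-sided error of a bloom filter: a negative lookup is definitive, while a positive lookup may be a false positive. The toy filter below is a sketch of that property. The `bloom` type, its sizing, and the FNV-based hashing are assumptions for illustration and say nothing about the filter the package actually uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy bloom filter with m bits and k probes per key.
type bloom struct {
	bits []bool
	k    int
}

func newBloom(m, k int) *bloom { return &bloom{bits: make([]bool, m), k: k} }

// positions derives k bit positions from one 64-bit FNV hash by
// combining its two 32-bit halves (double hashing).
func (b *bloom) positions(key []byte) []int {
	h := fnv.New64a()
	h.Write(key)
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	pos := make([]int, b.k)
	for i := range pos {
		pos[i] = int((h1 + uint32(i)*h2) % uint32(len(b.bits)))
	}
	return pos
}

func (b *bloom) add(key []byte) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// doesNotHave returns true only when key was definitely never added.
// A false result means "maybe present", so the caller must still look.
func (b *bloom) doesNotHave(key []byte) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return true
		}
	}
	return false
}

func main() {
	f := newBloom(1024, 4)
	f.add([]byte("apple"))
	fmt.Println(f.doesNotHave([]byte("apple"))) // an added key is never reported absent
}
```

A caller would typically skip the table when the lookup returns true and fall back to a real search when it returns false.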
func (*Table) IncrRef ¶
func (t *Table) IncrRef()
IncrRef increments the refcount, which determines when the file should be deleted.
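The general pattern behind IncrRef and DecrRef can be sketched with an atomic counter that runs a cleanup when it drops to zero. The `refCounted` type and its `release` callback are hypothetical. In particular, what the real Table does on the last DecrRef is only described above as possibly deleting the file.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCounted is a hypothetical reference-counted resource.
type refCounted struct {
	ref     int32
	release func() error // cleanup to run when the last reference is dropped
}

// incrRef records one more holder of the resource.
func (r *refCounted) incrRef() { atomic.AddInt32(&r.ref, 1) }

// decrRef drops one reference and runs release when the count hits zero.
func (r *refCounted) decrRef() error {
	if atomic.AddInt32(&r.ref, -1) == 0 {
		return r.release()
	}
	return nil
}

func main() {
	r := &refCounted{ref: 1, release: func() error {
		fmt.Println("released")
		return nil
	}}
	r.incrRef()    // a second holder, e.g. an iterator
	_ = r.decrRef() // count is 1, nothing happens
	_ = r.decrRef() // count reaches 0, release runs
}
```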
func (*Table) NewIterator ¶
NewIterator returns a new iterator of the Table