Documentation ¶
Index ¶
- type DiskSorter
- func (d *DiskSorter) Close() error
- func (d *DiskSorter) CloseAndCleanup() error
- func (d *DiskSorter) IsSorted() bool
- func (d *DiskSorter) NewIterator(_ context.Context) (Iterator, error)
- func (d *DiskSorter) NewWriter(_ context.Context) (Writer, error)
- func (d *DiskSorter) Sort(ctx context.Context) error
- type DiskSorterOptions
- type ExternalSorter
- type Iterator
- type Writer
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type DiskSorter ¶
type DiskSorter struct {
// contains filtered or unexported fields
}
DiskSorter is an external sorter that sorts data on disk.
func OpenDiskSorter ¶
func OpenDiskSorter(dirname string, opts *DiskSorterOptions) (*DiskSorter, error)
OpenDiskSorter opens a DiskSorter with the given directory.
func (*DiskSorter) Close ¶
func (d *DiskSorter) Close() error
Close implements the ExternalSorter.Close.
func (*DiskSorter) CloseAndCleanup ¶
func (d *DiskSorter) CloseAndCleanup() error
CloseAndCleanup implements the ExternalSorter.CloseAndCleanup.
func (*DiskSorter) IsSorted ¶
func (d *DiskSorter) IsSorted() bool
IsSorted implements the ExternalSorter.IsSorted.
func (*DiskSorter) NewIterator ¶
func (d *DiskSorter) NewIterator(_ context.Context) (Iterator, error)
NewIterator implements the ExternalSorter.NewIterator.
type DiskSorterOptions ¶
type DiskSorterOptions struct { // Concurrency is the maximum number of goroutines that can be used to // sort data in parallel. // // The default value is runtime.GOMAXPROCS(0). Concurrency int // WriterBufferSize is the size of the buffer used by the writer. // Larger buffer size can improve the write and sort performance, // and reduce the number of disk operations. // // The default value is 128MB. WriterBufferSize int // CompactionThreshold is maximum overlap depth necessary to trigger a // compaction. The overlap depth is the number of files that overlap at // same interval. // // For example, consider the following files: // // file 0: a-----d // file 1: b-----e // file 2: c-------g // file 3: d---f // // The overlap depth of these files is 3, because file 0, 1, 2 overlap at // the interval [c, d), and file 1, 2, 3 overlap at the interval [d, e). // // If the overlap depth reached CompactionThreshold, the sorter will compact // files to reduce the overlap depth during sorting. The larger the overlap // depth, the larger read amplification will be during iteration. This is a // trade-off between read amplification and sorting cost. Setting this value // to math.MaxInt will disable the compaction. // // The default value is 16. CompactionThreshold int // MaxCompactionDepth is the maximum files involved in a single compaction. // The minimum value is 2. Any value less than 2 will be treated as not set. // // The default value is 64. MaxCompactionDepth int // MaxCompactionSize is the maximum size of key-value pairs involved in a // single compaction. // // The default value is 512MB. MaxCompactionSize int // Logger is used to write log messages. // // The default value is log.L(). Logger *zap.Logger }
DiskSorterOptions holds the optional parameters for DiskSorter.
type ExternalSorter ¶
type ExternalSorter interface { // NewWriter creates a new writer for writing key-value pairs before sorting. // If the sorter starts sorting or is already sorted, it will return an error. // // Multiple writers can be opened and used concurrently. NewWriter(ctx context.Context) (Writer, error) // Sort sorts the key-value pairs written by the writer. // It should be called after all open writers are closed. // // Implementations should guarantee that Sort() is idempotent and atomic. // If it returns an error, or process is killed during Sort(), the sorter should be able // to recover from the error or crash, and the external storage should not be corrupted. Sort(ctx context.Context) error // IsSorted returns true if the sorter is already sorted, iterators are ready to create. IsSorted() bool // NewIterator creates a new iterator for iterating over the key-value pairs after sorting. // If the sorter is not sorted yet, it will return an error. // // Multiple iterators can be opened and used concurrently. NewIterator(ctx context.Context) (Iterator, error) // Close releases all resources held by the sorter. It will not clean up the external storage, // so the sorter can recover from a crash. Close() error // CloseAndCleanup closes the external sorter and cleans up all resources created by the sorter. CloseAndCleanup() error }
ExternalSorter is an interface for sorting key-value pairs in external storage. The key-value pairs are sorted by the key, duplicate keys are automatically removed.
type Iterator ¶
type Iterator interface { // Seek moves the iterator to the first key-value pair whose key is greater // than or equal to the given key. Seek(key []byte) bool // First moves the iterator to the first key-value pair. First() bool // Next moves the iterator to the next key-value pair. // // Implementations must guarantee the next key is greater than the current key. Next() bool // Last moves the iterator to the last key-value pair. Last() bool // Valid returns true if the iterator is positioned at a valid key-value pair. Valid() bool // Error returns the error, if any, that was encountered during iteration. Error() error // UnsafeKey returns the key of the current key-value pair, without copying. // The memory is only valid until the next call to change the iterator. UnsafeKey() []byte // UnsafeValue returns the value of the current key-value pair, without copying. // The memory is only valid until the next call to change the iterator. UnsafeValue() []byte // Close releases all resources held by the iterator. Close() error }
Iterator is an interface for iterating over the key-value pairs after sorting.
type Writer ¶
type Writer interface { // Put adds a key-value pair to the writer. // // Implementations must not modify or retain slices passed to Put(). Put(key, value []byte) error // Flush flushes all buffered key-value pairs to the external sorter. // the writer can be reused after calling Flush(). Flush() error // Close flushes all buffered key-value pairs to the external sorter, // and releases all resources held by the writer. Close() error }
Writer is an interface for writing key-value pairs to the external sorter.