Documentation
¶
Overview ¶
Package moss stands for "memory-oriented sorted segments", and provides a data structure that manages an ordered Collection of key-val entries, with optional persistence.
The design is similar to a simplified LSM tree (log structured merge tree), but is more like a "LSM array", in that a stack of immutable, sorted key-val arrays or "segments" is maintained. When there's an incoming Batch of key-val mutations (see: ExecuteBatch()), the Batch, which is an array of key-val mutations, is sorted in-place and becomes an immutable "segment". Then, the segment is atomically pushed onto a stack of segment pointers. A higher segment in the stack will shadow mutations of the same key from lower segments.
Separately, an asynchronous goroutine (the "merger") will continuously merge N sorted segments to keep stack height low.
In the best case, a remaining, single, large sorted segment will be efficient in memory usage and efficient for binary search and range iteration.
Iterations when the stack height is > 1 are implementing using a N-way heap merge.
A Batch and a segment is actually two arrays: a byte array of contiguous key-val entries; and an uint64 array of entry offsets and key-val lengths that refer to the previous key-val entries byte array.
In this design, stacks are treated as immutable via a copy-on-write approach whenever a stack is "modified". So, readers and writers essentially don't block each other, and taking a Snapshot is also a relatively simple operation of atomically cloning the stack of segment pointers.
Of note: mutations are only supported through Batch operations, which acknowledges the common practice of using batching to achieve higher write performance and embraces it. Additionally, higher performance can be attained by using the batch memory pre-allocation parameters and the Batch.Alloc() API, allowing applications to serialize keys and vals directly into memory maintained by a batch, which can avoid extra memory copying.
IMPORTANT: The keys in a Batch must be unique. That is, myBatch.Set("x", "foo"); myBatch.Set("x", "bar") is not supported. Applications that do not naturally meet this requirement might maintain their own map[key]val data structures to ensure this uniqueness constraint.
An optional, asynchronous persistence goroutine (the "persister") can drain mutations to a lower level, ordered key-value storage layer. An optional, built-in storage layer ("mossStore") is available, that will asynchronously write segments to the end of a file (append only design), with reads performed using mmap(), and with user controllable compaction configuration. See: OpenStoreCollection().
NOTE: the mossStore persistence design does not currently support moving files created on one machine endian'ness type to another machine with a different endian'ness type.
Index ¶
- Constants
- Variables
- func ByteSliceToUint64Slice(in []byte) ([]uint64, error)
- func FormatFName(seq int64) string
- func HeaderLength() uint64
- func OpenStoreCollection(dir string, options StoreOptions, persistOptions StorePersistOptions) (*Store, Collection, error)
- func ParseFNameSeq(fname string) (int64, error)
- func ToOsFile(f File) *os.File
- func Uint64SliceToByteSlice(in []uint64) ([]byte, error)
- type Batch
- type BatchOptions
- type Collection
- type CollectionOptions
- type CollectionStats
- type CompactionConcern
- type EntryEx
- type Event
- type EventKind
- type File
- type FileRef
- type Footer
- func (f *Footer) AddRef()
- func (f *Footer) ChildCollectionNames() ([]string, error)
- func (f *Footer) ChildCollectionSnapshot(childCollectionName string) (Snapshot, error)
- func (f *Footer) Close() error
- func (f *Footer) DecRef()
- func (f *Footer) Get(key []byte, readOptions ReadOptions) ([]byte, error)
- func (f *Footer) Length() uint64
- func (f *Footer) StartIterator(startKeyIncl, endKeyExcl []byte, iteratorOptions IteratorOptions) (Iterator, error)
- type Header
- type InitCloser
- type Iterator
- type IteratorOptions
- type LowerLevelUpdate
- type MergeOperator
- type MergeOperatorStringAppend
- type OpenFile
- type OsFile
- type ReadOptions
- type Segment
- type SegmentCursor
- type SegmentLoaderFunc
- type SegmentLoc
- type SegmentLocs
- type SegmentMutator
- type SegmentPersister
- type SegmentPersisterFunc
- type SegmentStackStats
- type SegmentValidater
- type Snapshot
- type SnapshotWrapper
- func (w *SnapshotWrapper) ChildCollectionNames() ([]string, error)
- func (w *SnapshotWrapper) ChildCollectionSnapshot(childCollectionName string) (Snapshot, error)
- func (w *SnapshotWrapper) Close() (err error)
- func (w *SnapshotWrapper) Get(key []byte, readOptions ReadOptions) ([]byte, error)
- func (w *SnapshotWrapper) StartIterator(startKeyInclusive, endKeyExclusive []byte, iteratorOptions IteratorOptions) (Iterator, error)
- type Store
- func (s *Store) AddRef()
- func (s *Store) Close() error
- func (s *Store) CloseEx(options StoreCloseExOptions) error
- func (s *Store) Dir() string
- func (s *Store) Histograms() ghistogram.Histograms
- func (s *Store) IsAborted() bool
- func (s *Store) OpenCollection(options StoreOptions, persistOptions StorePersistOptions) (Collection, error)
- func (s *Store) Options() StoreOptions
- func (s *Store) Persist(higher Snapshot, persistOptions StorePersistOptions) (Snapshot, error)
- func (s *Store) Snapshot() (Snapshot, error)
- func (s *Store) SnapshotPrevious(ss Snapshot) (Snapshot, error)
- func (s *Store) SnapshotRevert(revertTo Snapshot) error
- func (s *Store) Stats() (map[string]interface{}, error)
- type StoreCloseExOptions
- type StoreOptions
- type StorePersistOptions
- type WriteOptions
Constants ¶
const OperationDel = uint64(0x0200000000000000)
OperationDel removes the value associated with the key.
const OperationMerge = uint64(0x0300000000000000)
OperationMerge merges the new value with the existing value associated with the key, as described by the configured MergeOperator.
const OperationSet = uint64(0x0100000000000000)
OperationSet replaces the value associated with the key.
Variables ¶
var AllocationGranularity = 65536 // 64kiB.
Windows MapViewOfFile() API (rough equivalent of mmap()), requires offsets to be multiples of an allocation granularity, which is up to 64kiB (or, larger than the usual 4KB page size).
var CompactionAllow = CompactionConcern(1)
CompactionAllow means compaction decision is automated and based on the configed policy and parameters, such as CompactionPercentage.
var CompactionDisable = CompactionConcern(0)
CompactionDisable means no compaction.
var CompactionForce = CompactionConcern(2)
CompactionForce means compaction should be performed immediately.
var DefaultCollectionOptions = CollectionOptions{ MergeOperator: nil, MinMergePercentage: 0.8, MaxPreMergerBatches: 10, MergerCancelCheckEvery: 10000, MergerIdleRunTimeoutMS: 0, Debug: 0, Log: nil, }
DefaultCollectionOptions are the default configuration options.
var DefaultNaiveSeekToMaxTries = 100
DefaultNaiveSeekToMaxTries is the max number of attempts a forward iterator.SeekTo() will loop using simple Next()'s before giving up and starting a binary search for a given, forward seekToKey.
var DefaultPersistKind = SegmentKindBasic
DefaultPersistKind determines which persistence Kind to choose when none is specified in StoreOptions.
var DefaultStoreOptions = StoreOptions{
CompactionPercentage: 0.65,
CompactionLevelMaxSegments: 4,
CompactionLevelMultiplier: 9,
CompactionBufferPages: 512,
CompactionSyncAfterBytes: 16000000,
SegmentKeysIndexMaxBytes: 100000,
SegmentKeysIndexMinKeyBytes: 10000000,
}
DefaultStoreOptions are the default store options when the application hasn't provided a meaningful configuration value. Advanced applications can use these to fine tune performance.
var ErrAborted = errors.New("operation-aborted")
ErrAborted is returned when any operations are aborted.
var ErrAllocTooLarge = errors.New("alloc-too-large")
ErrAllocTooLarge is returned when the requested allocation cannot be satisfied by the pre-allocated buffer.
var ErrAlreadyInitialized = errors.New("already-initialized")
ErrAlreadyInitialized is returned when initialization was attempted on an already initialized object.
var ErrBadCollectionName = errors.New("bad-collection-name")
ErrBadCollectionName is returned when the child collection name is invalid, for example "".
var ErrCanceled = errors.New("canceled")
ErrCanceled is used when an operation has been canceled.
var ErrClosed = errors.New("closed")
ErrClosed is returned when the collection is already closed.
var ErrIteratorDone = errors.New("iterator-done")
ErrIteratorDone is returned when the iterator has reached the end range of the iterator or the end of the collection.
var ErrKeyTooLarge = errors.New("key-too-large")
ErrKeyTooLarge is returned when the length of the key exceeds the limit of 2^24.
var ErrMaxTries = errors.New("max-tries")
ErrMaxTries is returned when a max number of tries or attempts for some operation has been reached.
var ErrMergeOperatorFullMergeFailed = errors.New("merge-operator-full-merge-failed")
ErrMergeOperatorFullMergeFailed is returned when the provided MergeOperator fails during the FullMerge operations.
var ErrMergeOperatorNil = errors.New("merge-operator-nil")
ErrMergeOperatorNil is returned if a merge operation is performed without specifying a MergeOperator in the CollectionOptions.
var ErrNoSuchCollection = errors.New("no-such-collection")
ErrNoSuchCollection is returned when attempting to access or delete an unknown child collection.
ErrNoValidFooter is returned when a valid footer could not be found in a file.
var ErrNothingToCompact = errors.New("nothing-to-compact")
ErrNothingToCompact is an internal error returned when compact is called on a store that is already compacted.
var ErrSegmentCorrupted = errors.New("segment-corrupted")
ErrSegmentCorrupted is returned upon any segment corruptions.
var ErrUnexpected = errors.New("unexpected")
ErrUnexpected is returned on an unexpected situation.
var ErrUnimplemented = errors.New("unimplemented")
ErrUnimplemented is returned when an unimplemented feature has been used.
var ErrValueTooLarge = errors.New("value-too-large")
ErrValueTooLarge is returned when the length of the value exceeds the limit of 2^28.
var EventKindBatchExecute = EventKind(6)
EventKindBatchExecute is fired when a collection has finished executing a batch.
var EventKindBatchExecuteStart = EventKind(5)
EventKindBatchExecuteStart is fired when a collection is starting to execute a batch.
var EventKindClose = EventKind(2)
EventKindClose is fired when a collection has been fully closed.
var EventKindCloseStart = EventKind(1)
EventKindCloseStart is fired when a collection.Close() has begun. The closing might take awhile to complete and an EventKindClose will follow later.
var EventKindMergerProgress = EventKind(3)
EventKindMergerProgress is fired when the merger has completed a round of merge processing.
var EventKindPersisterProgress = EventKind(4)
EventKindPersisterProgress is fired when the persister has completed a round of persistence processing.
var IsTimingCoarse = true
IsTimingCoarse is true on Windows because of the granularity of the time resolution can be as large as 15ms. So this variable can help false failures in unit tests. See https://stackoverflow.com/a/4019164.
var SegmentKindBasic = "a"
SegmentKindBasic is the code for a basic, persistable segment implementation, which represents a segment as two arrays: an array of contiguous key-val bytes [key0, val0, key1, val1, ... keyN, valN], and an array of offsets plus lengths into the first array.
var SegmentLoaders = map[string]SegmentLoaderFunc{}
SegmentLoaders is a registry of available segment loaders, which should be immutable after process init()'ialization. It is keyed by SegmentLoc.Kind.
var SegmentPersisters = map[string]SegmentPersisterFunc{}
SegmentPersisters is a registry of available segment persisters, which should be immutable after process init()'ialization. It is keyed by SegmentLoc.Kind.
var SkipStats bool
SkipStats allows advanced applications that don't care about correct stats to avoid some stats maintenance overhead. Defaults to false (stats are correctly maintained).
var StoreEndian = binary.LittleEndian
StoreEndian is the preferred endianness used by moss
var StoreMagicBeg = []byte("0m1o2s")
StoreMagicBeg is the magic byte sequence at the start of a footer
var StoreMagicEnd = []byte("3s4p5s")
StoreMagicEnd is the magic byte sequence at the end of a footer
var StorePageSize = 4096
StorePageSize is the page size used by moss
var StorePrefix = "data-"
StorePrefix is the file name prefix
var StoreSuffix = ".moss"
StoreSuffix is the file name suffix
var StoreVersion = uint32(4)
StoreVersion must be bumped whenever the file format changes.
Functions ¶
func ByteSliceToUint64Slice ¶
ByteSliceToUint64Slice gives access to []byte as []uint64. By default, an efficient O(1) implementation of this function is used, but which requires the unsafe package. See the "safe" build tag to use an O(N) implementation that does not need the unsafe package.
func FormatFName ¶
FormatFName returns a file name like "data-000123.moss" given a seq of 123.
func OpenStoreCollection ¶
func OpenStoreCollection(dir string, options StoreOptions, persistOptions StorePersistOptions) (*Store, Collection, error)
OpenStoreCollection returns collection based on a persisted store in a directory. Updates to the collection will be persisted. An empty directory starts an empty collection. Both the store and collection should be closed by the caller when done.
func ParseFNameSeq ¶
ParseFNameSeq parses a file name like "data-000123.moss" into 123.
func Uint64SliceToByteSlice ¶
Uint64SliceToByteSlice gives access to []uint64 as []byte. By default, an efficient O(1) implementation of this function is used, but which requires the unsafe package. See the "safe" build tag to use an O(N) implementation that does not need the unsafe package.
Types ¶
type Batch ¶
type Batch interface { // Close must be invoked to release resources. Close() error // Set creates or updates an key-val entry in the Collection. The // key must be unique (not repeated) within the Batch. Set() // copies the key and val bytes into the Batch, so the memory // bytes of the key and val may be reused by the caller. Set(key, val []byte) error // Del deletes a key-val entry from the Collection. The key must // be unique (not repeated) within the Batch. Del copies the key // bytes into the Batch, so the memory bytes of the key may be // reused by the caller. Del() on a non-existent key results in a // nil error. Del(key []byte) error // Merge creates or updates a key-val entry in the Collection via // the MergeOperator defined in the CollectionOptions. The key // must be unique (not repeated) within the Batch. Merge() copies // the key and val bytes into the Batch, so the memory bytes of // the key and val may be reused by the caller. Merge(key, val []byte) error // Alloc provides a slice of bytes "owned" by the Batch, to reduce // extra copying of memory. See the Collection.NewBatch() method. Alloc(numBytes int) ([]byte, error) // AllocSet is like Set(), but the caller must provide []byte // parameters that came from Alloc(). AllocSet(keyFromAlloc, valFromAlloc []byte) error // AllocDel is like Del(), but the caller must provide []byte // parameters that came from Alloc(). AllocDel(keyFromAlloc []byte) error // AllocMerge is like Merge(), but the caller must provide []byte // parameters that came from Alloc(). AllocMerge(keyFromAlloc, valFromAlloc []byte) error // NewChildCollectionBatch returns a new Batch instance with preallocated // resources for a specific child collection given its unique name. // The child Batch will be executed atomically along with any // other child batches and with the top-level Batch // when the top-level Batch is executed. // The child collection name should not start with a '.' (period) // as those are reserved for future moss usage. NewChildCollectionBatch(collectionName string, options BatchOptions) (Batch, error) // DelChildCollection records a child collection deletion given the name. // It only takes effect when the top-level batch is executed. DelChildCollection(collectionName string) error }
A Batch is a set of mutations that will be incorporated atomically into a Collection. NOTE: the keys in a Batch must be unique.
Concurrent Batch's are allowed, but to avoid races, concurrent Batches should only be used by concurrent goroutines that can ensure the mutation keys are partitioned or non-overlapping between Batch instances.
type BatchOptions ¶
BatchOptions are provided to NewChildCollectionBatch().
type Collection ¶
type Collection interface { // Start kicks off required background tasks. Start() error // Close synchronously stops background tasks and releases // resources. Close() error // Options returns the options currently being used. Options() CollectionOptions // Snapshot returns a stable Snapshot of the key-value entries. Snapshot() (Snapshot, error) // Get retrieves a value from the collection for a given key // and returns nil if the key is not found. Get(key []byte, readOptions ReadOptions) ([]byte, error) // NewBatch returns a new Batch instance with preallocated // resources. See the Batch.Alloc() method. NewBatch(totalOps, totalKeyValBytes int) (Batch, error) // ExecuteBatch atomically incorporates the provided Batch into // the Collection. The Batch instance should be Close()'ed and // not reused after ExecuteBatch() returns. ExecuteBatch(b Batch, writeOptions WriteOptions) error // Stats returns stats for this collection. Note that stats might // be updated asynchronously. Stats() (*CollectionStats, error) // Histograms returns a snapshot of the histograms for this // collection. Note that histograms might be updated // asynchronously. Histograms() ghistogram.Histograms }
A Collection represents an ordered mapping of key-val entries, where a Collection is snapshot'able and atomically updatable.
func NewCollection ¶
func NewCollection(options CollectionOptions) ( Collection, error)
NewCollection returns a new, unstarted Collection instance.
type CollectionOptions ¶
type CollectionOptions struct { // MergeOperator is an optional func provided by an application // that wants to use Batch.Merge()'ing. MergeOperator MergeOperator `json:"-"` // DeferredSort allows ExecuteBatch() to operate more quickly by // deferring the sorting of an incoming batch until it is needed // by a reader. The tradeoff is that later read operations can // take longer as the sorting is finally performed. DeferredSort bool // MinMergePercentage allows the merger to avoid premature merging // of segments that are too small, where a segment X has to reach // a certain size percentage compared to the next lower segment // before segment X (and all segments above X) will be merged. MinMergePercentage float64 // MaxPreMergerBatches is the max number of batches that can be // accepted into the collection through ExecuteBatch() and held // for merging but that have not been actually processed by the // merger yet. When the number of held but unprocessed batches // reaches MaxPreMergerBatches, then ExecuteBatch() will block to // allow the merger to catch up. MaxPreMergerBatches int // MergerCancelCheckEvery is the number of ops the merger will // perform before it checks to see if a merger operation was // canceled. MergerCancelCheckEvery int // MergerIdleRunTimeoutMS is the idle time in milliseconds after which the // background merger will perform an "idle run" which can trigger // incremental compactions to speed up queries. MergerIdleRunTimeoutMS int64 // MaxDirtyOps, when greater than zero, is the max number of dirty // (unpersisted) ops allowed before ExecuteBatch() blocks to allow // the persister to catch up. It only has effect with a non-nil // LowerLevelUpdate. MaxDirtyOps uint64 // MaxDirtyKeyValBytes, when greater than zero, is the max number // of dirty (unpersisted) key-val bytes allowed before // ExecuteBatch() blocks to allow the persister to catch up. It // only has effect with a non-nil LowerLevelUpdate. MaxDirtyKeyValBytes uint64 // CachePersisted allows the collection to cache clean, persisted // key-val's, and is considered when LowerLevelUpdate is used. CachePersisted bool // LowerLevelInit is an optional Snapshot implementation that // initializes the lower-level storage of a Collection. This // might be used, for example, for having a Collection be a // write-back cache in front of a persistent implementation. LowerLevelInit Snapshot `json:"-"` // LowerLevelUpdate is an optional func that is invoked when the // lower-level storage should be updated. LowerLevelUpdate LowerLevelUpdate `json:"-"` Debug int // Higher means more logging, when Log != nil. // Log is a callback invoked when the Collection needs to log a // debug message. Optional, may be nil. Log func(format string, a ...interface{}) `json:"-"` // OnError is an optional callback invoked when the Collection // encounters an error. This might happen when the background // goroutines of moss encounter errors, such as during segment // merging or optional persistence operations. OnError func(error) `json:"-"` // OnEvent is an optional callback invoked on Collection related // processing events. If the application's callback // implementation blocks, it may pause processing and progress, // depending on the type of callback event kind. OnEvent func(event Event) `json:"-"` // ReadOnly means that persisted data and storage files if any, // will remain unchanged. ReadOnly bool }
CollectionOptions allows applications to specify config settings.
type CollectionStats ¶
type CollectionStats struct { TotOnError uint64 TotCloseBeg uint64 TotCloseMergerDone uint64 TotClosePersisterDone uint64 TotCloseLowerLevelBeg uint64 TotCloseLowerLevelEnd uint64 TotCloseEnd uint64 TotSnapshotBeg uint64 TotSnapshotEnd uint64 TotSnapshotInternalBeg uint64 TotSnapshotInternalEnd uint64 TotSnapshotInternalClose uint64 TotGet uint64 TotGetErr uint64 TotNewBatch uint64 TotNewBatchTotalOps uint64 TotNewBatchTotalKeyValBytes uint64 TotExecuteBatchBeg uint64 TotExecuteBatchErr uint64 TotExecuteBatchEmpty uint64 TotExecuteBatchWaitBeg uint64 TotExecuteBatchWaitEnd uint64 TotExecuteBatchAwakeMergerBeg uint64 TotExecuteBatchAwakeMergerEnd uint64 TotExecuteBatchEnd uint64 TotNotifyMergerBeg uint64 TotNotifyMergerEnd uint64 TotMergerEnd uint64 TotMergerLoop uint64 TotMergerLoopRepeat uint64 TotMergerAll uint64 TotMergerInternalBeg uint64 TotMergerInternalErr uint64 TotMergerInternalEnd uint64 TotMergerInternalSkip uint64 TotMergerLowerLevelNotify uint64 TotMergerLowerLevelNotifySkip uint64 TotMergerEmptyDirtyMid uint64 TotMergerWaitIncomingBeg uint64 TotMergerWaitIncomingStop uint64 TotMergerWaitIncomingEnd uint64 TotMergerWaitIncomingSkip uint64 TotMergerIdleSleeps uint64 TotMergerIdleRuns uint64 TotMergerWaitOutgoingBeg uint64 TotMergerWaitOutgoingStop uint64 TotMergerWaitOutgoingEnd uint64 TotMergerWaitOutgoingSkip uint64 TotPersisterLoop uint64 TotPersisterLoopRepeat uint64 TotPersisterWaitBeg uint64 TotPersisterWaitEnd uint64 TotPersisterEnd uint64 TotPersisterLowerLevelUpdateBeg uint64 TotPersisterLowerLevelUpdateErr uint64 TotPersisterLowerLevelUpdateEnd uint64 CurDirtyOps uint64 CurDirtyBytes uint64 CurDirtySegments uint64 CurDirtyTopOps uint64 CurDirtyTopBytes uint64 CurDirtyTopSegments uint64 CurDirtyMidOps uint64 CurDirtyMidBytes uint64 CurDirtyMidSegments uint64 CurDirtyBaseOps uint64 CurDirtyBaseBytes uint64 CurDirtyBaseSegments uint64 CurCleanOps uint64 CurCleanBytes uint64 CurCleanSegments uint64 }
CollectionStats fields that are prefixed like CurXxxx are gauges (can go up and down), and fields that are prefixed like TotXxxx are monotonically increasing counters.
func (*CollectionStats) AtomicCopyTo ¶
func (s *CollectionStats) AtomicCopyTo(r *CollectionStats)
AtomicCopyTo copies stats from s to r (from source to result).
type CompactionConcern ¶
type CompactionConcern int
CompactionConcern is a type representing various possible compaction behaviors associated with persistence.
type EntryEx ¶
type EntryEx struct { // Operation is an OperationXxx const. Operation uint64 }
EntryEx provides extra, advanced information about an entry from the Iterator.CurrentEx() method.
type Event ¶
type Event struct { Kind EventKind Collection Collection Duration time.Duration }
Event represents the information provided in an OnEvent() callback.
type File ¶
type File interface { io.ReaderAt io.WriterAt io.Closer Stat() (os.FileInfo, error) Sync() error Truncate(size int64) error }
The File interface is implemented by os.File. App specific implementations may add concurrency, caching, stats, fuzzing, etc.
type FileRef ¶
type FileRef struct {
// contains filtered or unexported fields
}
FileRef provides a ref-counting wrapper around a File.
func (*FileRef) Close ¶
Close allows the FileRef to implement the io.Closer interface. It actually just performs what should be the final DecRef() call which takes the reference count to 0. Once 0, it allows the file to actually be closed.
func (*FileRef) DecRef ¶
DecRef decreases the ref-count on the file ref, and closing the underlying file when the ref-count reaches zero.
func (*FileRef) FetchRefCount ¶
FetchRefCount fetches the ref-count on the file ref.
func (*FileRef) OnAfterClose ¶
func (r *FileRef) OnAfterClose(cb func())
OnAfterClose registers event callback func's that are invoked after the file is closed.
func (*FileRef) OnBeforeClose ¶
func (r *FileRef) OnBeforeClose(cb func())
OnBeforeClose registers event callback func's that are invoked before the file is closed.
type Footer ¶
type Footer struct { // contains filtered or unexported fields }
Footer represents a footer record persisted in a file, and also implements the moss.Snapshot interface.
func ReadFooter ¶
func ReadFooter(options *StoreOptions, file File) (*Footer, error)
ReadFooter reads the last valid Footer from a file.
func ScanFooter ¶
ScanFooter scans a file backwards from the given pos for a valid Footer, adding ref-counts to fref on success.
func (*Footer) ChildCollectionNames ¶
ChildCollectionNames returns an array of child collection name strings.
func (*Footer) ChildCollectionSnapshot ¶
ChildCollectionSnapshot returns a Snapshot on a given child collection by its name.
func (*Footer) Get ¶
func (f *Footer) Get(key []byte, readOptions ReadOptions) ([]byte, error)
Get retrieves a val from the footer, and will return nil val if the entry does not exist in the footer.
func (*Footer) StartIterator ¶
func (f *Footer) StartIterator(startKeyIncl, endKeyExcl []byte, iteratorOptions IteratorOptions) (Iterator, error)
StartIterator returns a new Iterator instance on this footer.
On success, the returned Iterator will be positioned so that Iterator.Current() will either provide the first entry in the range or ErrIteratorDone.
A startKeyIncl of nil means the logical "bottom-most" possible key and an endKeyExcl of nil means the logical "top-most" possible key.
type Header ¶
type Header struct { Version uint32 // The file format / StoreVersion. CreatedAt string CreatedEndian string // The endian() of the file creator. }
Header represents the JSON stored at the head of a file, where the file header bytes should be less than StorePageSize length.
type InitCloser ¶
An InitCloser holds onto an io.Closer, and is used for chaining io.Closer's. That is, we often want the closing of one resource to close related resources.
type Iterator ¶
type Iterator interface { // Close must be invoked to release resources. Close() error // Next moves the Iterator to the next key-val entry and will // return ErrIteratorDone if the Iterator is done. Next() error // SeekTo moves the Iterator to the lowest key-val entry whose key // is >= the given seekToKey, and will return ErrIteratorDone if // the Iterator is done. SeekTo() will respect the // startKeyInclusive/endKeyExclusive bounds, if any, that were // specified with StartIterator(). Seeking to before the // startKeyInclusive will end up on the first key. Seeking to or // after the endKeyExclusive will result in ErrIteratorDone. SeekTo(seekToKey []byte) error // Current returns ErrIteratorDone if the iterator is done. // Otherwise, Current() returns the current key and val, which // should be treated as immutable or read-only. The key and val // bytes will remain available until the next call to Next() or // Close(). Current() (key, val []byte, err error) // CurrentEx is a more advanced form of Current() that returns // more metadata for each entry. It is more useful when used with // IteratorOptions.IncludeDeletions of true. It returns // ErrIteratorDone if the iterator is done. Otherwise, the // current EntryEx, key, val are returned, which should be treated // as immutable or read-only. CurrentEx() (entryEx EntryEx, key, val []byte, err error) }
An Iterator allows enumeration of key-val entries.
type IteratorOptions ¶
type IteratorOptions struct { // IncludeDeletions is an advanced flag that specifies that an // Iterator should include deletion operations in its enuemration. // See also the Iterator.CurrentEx() method. IncludeDeletions bool // SkipLowerLevel is an advanced flag that specifies that an // Iterator should not enumerate key-val entries from the // optional, chained, lower-level iterator. See // CollectionOptions.LowerLevelInit/LowerLevelUpdate. SkipLowerLevel bool // MinSegmentLevel is an advanced parameter that specifies that an // Iterator should skip segments at a level less than // MinSegmentLevel. MinSegmentLevel is 0-based level, like an // array index. MinSegmentLevel int // MaxSegmentHeight is an advanced parameter that specifies that // an Iterator should skip segments at a level >= than // MaxSegmentHeight. MaxSegmentHeight is 1-based height, like an // array length. MaxSegmentHeight int // contains filtered or unexported fields }
IteratorOptions are provided to StartIterator().
type LowerLevelUpdate ¶
LowerLevelUpdate is the func callback signature used when a Collection wants to update its optional, lower-level storage.
type MergeOperator ¶
type MergeOperator interface { // Name returns an identifier for this merge operator, which might // be used for logging / debugging. Name() string // FullMerge the full sequence of operands on top of an // existingValue and returns the merged value. The existingValue // may be nil if no value currently exists. If full merge cannot // be done, return (nil, false). FullMerge(key, existingValue []byte, operands [][]byte) ([]byte, bool) // Partially merge two operands. If partial merge cannot be done, // return (nil, false), which will defer processing until a later // FullMerge(). PartialMerge(key, leftOperand, rightOperand []byte) ([]byte, bool) }
A MergeOperator may be implemented by applications that wish to optimize their read-compute-write use cases. Write-heavy counters, for example, could be implemented efficiently by using the MergeOperator functionality.
type MergeOperatorStringAppend ¶
type MergeOperatorStringAppend struct { Sep string // The separator string between operands. // contains filtered or unexported fields }
MergeOperatorStringAppend implements a simple merger that appends strings. It was originally built for testing and sample purposes.
func (*MergeOperatorStringAppend) FullMerge ¶
func (mo *MergeOperatorStringAppend) FullMerge(key, existingValue []byte, operands [][]byte) ([]byte, bool)
FullMerge performs the full merge of a string append operation
func (*MergeOperatorStringAppend) Name ¶
func (mo *MergeOperatorStringAppend) Name() string
Name returns the name of this merge operator implemenation
func (*MergeOperatorStringAppend) PartialMerge ¶
func (mo *MergeOperatorStringAppend) PartialMerge(key, leftOperand, rightOperand []byte) ([]byte, bool)
PartialMerge performs the partial merge of a string append operation
type ReadOptions ¶
type ReadOptions struct { // By default, the value returned during lookups or Get()'s are // copied. Specifying true for NoCopyValue means don't copy the // value bytes, where the caller should copy the value themselves // if they need the value after the lifetime of the enclosing // snapshot. When true, the caller must treat the value returned // by a lookup/Get() as immutable. NoCopyValue bool // SkipLowerLevel is an advanced flag that specifies that a // point lookup should fail on a cache-miss and not attempt to access // key-val entries from the optional, chained, // lower-level snapshot (disk based). See // CollectionOptions.LowerLevelInit/LowerLevelUpdate. SkipLowerLevel bool }
ReadOptions are provided to Snapshot.Get().
type Segment ¶
type Segment interface { // Returns the kind of segment, used for persistence. Kind() string // Len returns the number of ops in the segment. Len() int // NumKeyValBytes returns the number of bytes used for key-val data. NumKeyValBytes() (uint64, uint64) // Get returns the operation and value associated with the given key. // If the key does not exist, the operation is 0, and the val is nil. // If an error occurs it is returned instead of the operation and value. Get(key []byte) (operation uint64, val []byte, err error) // Cursor returns an SegmentCursor that will iterate over entries // from the given (inclusive) start key, through the given (exclusive) // end key. Cursor(startKeyInclusive []byte, endKeyExclusive []byte) (SegmentCursor, error) // Returns true if the segment is already sorted, and returns // false if the sorting is only asynchronously scheduled. RequestSort(synchronous bool) bool }
A Segment represents the read-oriented interface for a segment.
type SegmentCursor ¶
type SegmentCursor interface { // Current returns the operation/key/value pointed to by the cursor. Current() (operation uint64, key []byte, val []byte) // Seek advances current to point to specified key. // If the seek key is less than the original startKeyInclusive // used to create this cursor, it will seek to that startKeyInclusive // instead. // If the cursor is not pointing at a valid entry ErrIteratorDone // is returned. Seek(startKeyInclusive []byte) error // Next moves the cursor to the next entry. If there is no Next // entry, ErrIteratorDone is returned. Next() error }
A SegmentCursor represents a handle for iterating through consecutive op/key/value tuples.
type SegmentLoaderFunc ¶
type SegmentLoaderFunc func( sloc *SegmentLoc) (Segment, error)
A SegmentLoaderFunc is able to load a segment from a SegmentLoc.
type SegmentLoc ¶
type SegmentLoc struct { Kind string // Used as the key for SegmentLoaders. KvsOffset uint64 // Byte offset within the file. KvsBytes uint64 // Number of bytes for the persisted segment.kvs. BufOffset uint64 // Byte offset within the file. BufBytes uint64 // Number of bytes for the persisted segment.buf. TotOpsSet uint64 TotOpsDel uint64 TotKeyByte uint64 TotValByte uint64 // contains filtered or unexported fields }
SegmentLoc represents a persisted segment.
func (*SegmentLoc) TotOps ¶
func (sloc *SegmentLoc) TotOps() int
TotOps returns number of ops in a segment loc.
type SegmentLocs ¶
type SegmentLocs []SegmentLoc
SegmentLocs represents a slice of SegmentLoc
func (SegmentLocs) AddRef ¶
func (slocs SegmentLocs) AddRef()
AddRef increases the ref count on each SegmentLoc in this SegmentLocs
func (SegmentLocs) Close ¶
func (slocs SegmentLocs) Close() error
Close allows the SegmentLocs to implement the io.Closer interface. It actually just performs what should be the final DecRef() call which takes the reference count to 0.
func (SegmentLocs) DecRef ¶
func (slocs SegmentLocs) DecRef()
DecRef decreases the ref count on each SegmentLoc in this SegmentLocs
type SegmentMutator ¶
A SegmentMutator represents the mutation methods of a segment.
type SegmentPersister ¶
type SegmentPersister interface {
Persist(file File, options *StoreOptions) (SegmentLoc, error)
}
A SegmentPersister represents a segment that can be persisted.
type SegmentPersisterFunc ¶
type SegmentPersisterFunc func( s Segment, f File, pos int64, options *StoreOptions) (SegmentLoc, error)
A SegmentPersisterFunc is able to persist a segment to a file, and return a SegmentLoc describing it.
type SegmentStackStats ¶
type SegmentStackStats struct { CurOps uint64 CurBytes uint64 // Counts key-val bytes only, not metadata. CurSegments uint64 }
SegmentStackStats represents the stats for a segmentStack.
func (*SegmentStackStats) AddTo ¶
func (sss *SegmentStackStats) AddTo(dest *SegmentStackStats)
AddTo adds the values from this SegmentStackStats to the dest SegmentStackStats.
type SegmentValidater ¶
type SegmentValidater interface { // Valid examines the state of the segment, any problem is returned // as an error. Valid() error }
SegmentValidater is an optional interface that can be implemented by any Segment to allow additional validation in test cases. The method of this interface is NOT invoked during the normal runtime usage of a Segment.
type Snapshot ¶
type Snapshot interface { // Close must be invoked to release resources. Close() error // Get retrieves a val from the Snapshot, and will return nil val // if the entry does not exist in the Snapshot. Get(key []byte, readOptions ReadOptions) ([]byte, error) // StartIterator returns a new Iterator instance on this Snapshot. // // On success, the returned Iterator will be positioned so that // Iterator.Current() will either provide the first entry in the // range or ErrIteratorDone. // // A startKeyInclusive of nil means the logical "bottom-most" // possible key and an endKeyExclusive of nil means the logical // key that's above the "top-most" possible key. StartIterator(startKeyInclusive, endKeyExclusive []byte, iteratorOptions IteratorOptions) (Iterator, error) // ChildCollectionNames returns an array of child collection name strings. ChildCollectionNames() ([]string, error) // ChildCollectionSnapshot returns a Snapshot on a given child // collection by its name. ChildCollectionSnapshot(childCollectionName string) (Snapshot, error) }
A Snapshot is a stable view of a Collection for readers, isolated from concurrent mutation activity.
type SnapshotWrapper ¶
type SnapshotWrapper struct {
// contains filtered or unexported fields
}
SnapshotWrapper implements the moss.Snapshot interface.
func NewSnapshotWrapper ¶
func NewSnapshotWrapper(ss Snapshot, closer io.Closer) *SnapshotWrapper
NewSnapshotWrapper creates a wrapper which provides ref-counting around a snapshot. The snapshot (and an optional io.Closer) will be closed when the ref-count reaches zero.
func (*SnapshotWrapper) ChildCollectionNames ¶
func (w *SnapshotWrapper) ChildCollectionNames() ([]string, error)
ChildCollectionNames returns an array of child collection name strings.
func (*SnapshotWrapper) ChildCollectionSnapshot ¶
func (w *SnapshotWrapper) ChildCollectionSnapshot(childCollectionName string) ( Snapshot, error)
ChildCollectionSnapshot returns a Snapshot on a given child collection by its name.
func (*SnapshotWrapper) Close ¶
func (w *SnapshotWrapper) Close() (err error)
Close will decRef the underlying snapshot.
func (*SnapshotWrapper) Get ¶
func (w *SnapshotWrapper) Get(key []byte, readOptions ReadOptions) ( []byte, error)
Get returns the key from the underlying snapshot.
func (*SnapshotWrapper) StartIterator ¶
func (w *SnapshotWrapper) StartIterator( startKeyInclusive, endKeyExclusive []byte, iteratorOptions IteratorOptions, ) (Iterator, error)
StartIterator initiates a start iterator over the underlying snapshot.
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store represents data persisted in a directory.
func OpenStore ¶
func OpenStore(dir string, options StoreOptions) (*Store, error)
OpenStore returns a store instance for a directory. An empty directory results in an empty store.
func (*Store) Close ¶
Close decreases the ref count on this store, and if the count is 0 proceeds to actually close the store.
func (*Store) CloseEx ¶
func (s *Store) CloseEx(options StoreCloseExOptions) error
CloseEx provides more advanced closing options.
func (*Store) Histograms ¶
func (s *Store) Histograms() ghistogram.Histograms
Histograms returns a snapshot of the histograms for this store.
func (*Store) OpenCollection ¶
func (s *Store) OpenCollection(options StoreOptions, persistOptions StorePersistOptions) (Collection, error)
OpenCollection opens a collection based on a store. Applications should open at most a single collection per store for performing read/write work.
func (*Store) Options ¶
func (s *Store) Options() StoreOptions
Options a copy of this Store's StoreOptions
func (*Store) Persist ¶
func (s *Store) Persist(higher Snapshot, persistOptions StorePersistOptions) ( Snapshot, error)
Persist helps the store implement the lower-level-update func, and normally is not called directly by applications. The higher snapshot may be nil. Advanced users who wish to call Persist() directly MUST invoke it in single threaded manner only.
func (*Store) SnapshotPrevious ¶
SnapshotPrevious returns the next older, previous snapshot based on a given snapshot, allowing the application to walk backwards into the history of a store at previous points in time. The given snapshot must come from the same store. A nil returned snapshot means no previous snapshot is available. Of note, store compactions will trim previous history from a store.
func (*Store) SnapshotRevert ¶
SnapshotRevert atomically and durably brings the store back to the point-in-time as represented by the revertTo snapshot. SnapshotRevert() should only be passed a snapshot that came from the same store, such as from using Store.Snapshot() or Store.SnapshotPrevious().
SnapshotRevert() must not be invoked concurrently with Store.Persist(), so it is recommended that SnapshotRevert() should be invoked only after the collection has been Close()'ed, which helps ensure that you are not racing with concurrent, background persistence goroutines.
SnapshotRevert() can fail if the given snapshot is too old, especially w.r.t. compactions. For example, navigate back to an older snapshot X via SnapshotPrevious(). Then, do a full compaction. Then, SnapshotRevert(X) will give an error.
type StoreCloseExOptions ¶
type StoreCloseExOptions struct { // Abort means stop as soon as possible, even if data might be lost, // such as any mutations not yet persisted. Abort bool }
StoreCloseExOptions represents store CloseEx options.
type StoreOptions ¶
type StoreOptions struct { // CollectionOptions should be the same as used with // NewCollection(). CollectionOptions CollectionOptions // CompactionPercentage determines when a compaction will run when // CompactionConcern is CompactionAllow. When the percentage of // ops between the non-base level and the base level is greater // than CompactionPercentage, then compaction will be run. CompactionPercentage float64 // CompactionLevelMaxSegments determines the number of segments // per level exceeding which partial or full compaction will run. CompactionLevelMaxSegments int // CompactionLevelMultiplier is the factor which determines the // next level in terms of segment sizes. CompactionLevelMultiplier int // CompactionBufferPages is the number of pages to use for // compaction, where writes are buffered before flushing to disk. CompactionBufferPages int // CompactionSync of true means perform a file sync at the end of // compaction for additional safety. CompactionSync bool // CompactionSyncAfterBytes controls the number of bytes after // which compaction is allowed to invoke an file sync, followed // by an additional file sync at the end of compaction. A value // that is < 0 annulls this behavior. CompactionSyncAfterBytes int // OpenFile allows apps to optionally provide their own file // opening implementation. When nil, os.OpenFile() is used. OpenFile OpenFile `json:"-"` // Log is a callback invoked when store needs to log a debug // message. Optional, may be nil. Log func(format string, a ...interface{}) `json:"-"` // KeepFiles means that unused, obsoleted files will not be // removed during OpenStore(). Keeping old files might be useful // when diagnosing file corruption cases. KeepFiles bool // Choose which Kind of segment to persist, if unspecified defaults // to the value of DefaultPersistKind. PersistKind string // SegmentKeysIndexMaxBytes is the maximum size in bytes allowed for // the segmentKeysIndex. Also, an index will not be built if the // segment's total key bytes is less than this parameter. SegmentKeysIndexMaxBytes int // SegmentKeysIndexMinKeyBytes is the minimum size in bytes that the // keys of a segment must reach before a segment key index is built. SegmentKeysIndexMinKeyBytes int }
StoreOptions are provided to OpenStore().
type StorePersistOptions ¶
type StorePersistOptions struct { // NoSync means do not perform a file sync at the end of // persistence (before returning from the Store.Persist() method). // Using NoSync of true might provide better performance, but at // the cost of data safety. NoSync bool // CompactionConcern controls whether compaction is allowed or // forced as part of persistence. CompactionConcern CompactionConcern }
StorePersistOptions are provided to Store.Persist().
type WriteOptions ¶
type WriteOptions struct { }
WriteOptions are provided to Collection.ExecuteBatch().
Source Files
¶
- api.go
- collection.go
- collection_merger.go
- collection_stats.go
- file.go
- iterator.go
- iterator_single.go
- mmap.go
- mmap_windows.go
- persister.go
- ping.go
- segment.go
- segment_index.go
- segment_stack.go
- segment_stack_merge.go
- slice_util.go
- store.go
- store_api.go
- store_compact.go
- store_footer.go
- store_previous.go
- store_revert.go
- store_stats.go
- util_merge.go
- wrap.go