storage

package
v0.7.1 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 17, 2015 License: BSD-3-Clause Imports: 9 Imported by: 0

Documentation

Overview

Package storage provides a unified interface to a number of storage engines. Since each storage engine has different capabilities, this package defines a number of interfaces in addition to the core Engine interface, which all storage engines should satisfy.

Keys are specified as a combination of a Context and a datatype-specific byte slice, typically called a "type-specific key" (TKey) in DVID docs and code. The Context provides DVID-wide namespacing and as such, must use one of the Context implementations within the storage package. (This is enforced by making Context a Go opaque interface.) The type-specific key formatting is entirely up to the datatype designer, although use of dvid.Index is suggested.

Initially we are concentrating on key-value backends but expect to support graph and perhaps relational databases, either using specialized databases or software layers on top of an ordered key-value store.

Although we assume lexicographic ordering for range queries, there is some variation in how variable-size keys are treated. We assume all storage engines, after appropriate DVID drivers, use the following definition of ordering:

A string s precedes a string t in lexicographic order if:

* s is a proper prefix of t (s and t agree for all of s, but t is longer), or
* if c and d are respectively the first characters of s and t at which s and t differ,
  then c precedes d in character order.

Note: For the characters that are alphabetical letters, the character order coincides
with the alphabetical order. Digits precede letters, and uppercase letters precede
lowercase ones.

Examples:

composer precedes computer
house precedes household
Household precedes house
H2O precedes HOTEL
mydex precedes mydexterity

Note that the above is different than shortlex order, which would group strings
based on length first.

The above lexicographical ordering is used by default for leveldb variants.
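
The ordering defined above matches bytewise comparison, so it can be checked with the standard library. The following is a minimal sketch, not code from this package; the helper name precedes is made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// precedes is a hypothetical helper. It reports whether s sorts strictly
// before t under bytewise lexicographic ordering.
func precedes(s, t string) bool {
	return bytes.Compare([]byte(s), []byte(t)) < 0
}

func main() {
	// The example pairs listed in the overview.
	pairs := [][2]string{
		{"composer", "computer"},
		{"house", "household"},
		{"Household", "house"},
		{"H2O", "HOTEL"},
		{"mydex", "mydexterity"},
	}
	for _, p := range pairs {
		fmt.Printf("%q precedes %q: %v\n", p[0], p[1], precedes(p[0], p[1]))
	}
}
```

Under shortlex order, by contrast, "house" would sort before "Household" simply because it is shorter.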

Index

Constants

const (
	// MarkData is a byte indicating real data stored and should be the last byte of any
	// versioned key.
	MarkData = 0x03

	// MarkTombstone is a byte indicating a tombstone -- a marker for deleted data.
	MarkTombstone = 0x4F
)
const (
	TKeyMinClass = 0x00
	TKeyMaxClass = 0xFF
)
const MonitorBuffer = 10000

Variables

var (
	// Number of key bytes read in last second from storage engine.
	StoreKeyBytesReadPerSec int

	// Number of key bytes written in last second to storage engine.
	StoreKeyBytesWrittenPerSec int

	// Number of value bytes read in last second from storage engine.
	StoreValueBytesReadPerSec int

	// Number of value bytes written in last second to storage engine.
	StoreValueBytesWrittenPerSec int

	// Number of bytes read in last second from file system.
	FileBytesReadPerSec int

	// Number of bytes written in last second to file system.
	FileBytesWrittenPerSec int

	// Number of key-value GET calls in last second.
	GetsPerSec int

	// Number of key-value PUT calls in last second.
	PutsPerSec int

	// Channel to notify key bytes read from a storage engine.
	StoreKeyBytesRead chan int

	// Channel to notify key bytes written to a storage engine.
	StoreKeyBytesWritten chan int

	// Channel to notify value bytes read from a storage engine.
	StoreValueBytesRead chan int

	// Channel to notify value bytes written to a storage engine.
	StoreValueBytesWritten chan int

	// Channel to notify bytes read from file system.
	FileBytesRead chan int

	// Channel to notify bytes written to file system.
	FileBytesWritten chan int
)

Functions

func Close

func Close()

Close handles any storage-specific shutdown procedures.

func DataFromFile

func DataFromFile(filename string) ([]byte, error)

DataFromFile returns data from a file.

func DataKeyToLocalIDs

func DataKeyToLocalIDs(k Key) (dvid.InstanceID, dvid.VersionID, dvid.ClientID, error)

DataKeyToLocalIDs parses a key under a DataContext and returns instance, version and client ids.

func DeleteDataInstance

func DeleteDataInstance(data dvid.Data) error

DeleteDataInstance removes a data instance across all versions and tiers of storage.

func EnginesAvailable

func EnginesAvailable() string

EnginesAvailable returns a description of the available storage engines.

func Initialize

func Initialize(cmdline dvid.Config, sc *dvid.StoreConfig) (created bool, err error)

Initialize the storage systems. Returns a bool + error where the bool is true if the metadata store is newly created and needs initialization.

func RegisterEngine

func RegisterEngine(e Engine)

RegisterEngine registers an Engine for DVID use.

func Repair

func Repair(name, path string) error

Repair repairs a named engine's store at given path.

func UpdateDataKey

func UpdateDataKey(k Key, instance dvid.InstanceID, version dvid.VersionID, client dvid.ClientID) error

Types

type Batch

type Batch interface {
	// Delete removes from the batch a put using the given key.
	Delete(TKey)

	// Put adds to the batch a put using the given key-value.
	Put(k TKey, v []byte)

	// Commits a batch of operations and closes the write batch.
	Commit() error
}

Batch groups operations into a transaction. Clear() and Close() were removed due to how other key-value stores implement batches. It's easier to implement cross-database handling of a simple write/delete batch that commits then closes rather than something that clears.
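
To make the commit-then-close contract concrete, here is a minimal in-memory sketch. TKey and Batch are redeclared locally so the example compiles on its own; memBatch and newMemBatch are hypothetical names, and the choice to reject a second Commit is an assumption about what "closes the write batch" implies, not behavior confirmed by this package.

```go
package main

import (
	"errors"
	"fmt"
)

// TKey and Batch mirror the declarations above so the sketch is self-contained.
type TKey []byte

type Batch interface {
	Delete(TKey)
	Put(k TKey, v []byte)
	Commit() error
}

// memBatch is a hypothetical Batch that buffers operations and applies them
// to a backing map only when Commit is called.
type memBatch struct {
	store   map[string][]byte // backing store, modified only by Commit
	puts    map[string][]byte // pending writes
	deletes map[string]bool   // pending deletions
	closed  bool
}

var _ Batch = (*memBatch)(nil)

func newMemBatch(store map[string][]byte) *memBatch {
	return &memBatch{store: store, puts: map[string][]byte{}, deletes: map[string]bool{}}
}

func (b *memBatch) Put(k TKey, v []byte) {
	delete(b.deletes, string(k))
	b.puts[string(k)] = append([]byte(nil), v...) // copy so callers may reuse v
}

func (b *memBatch) Delete(k TKey) {
	delete(b.puts, string(k))
	b.deletes[string(k)] = true
}

// Commit applies all buffered operations and closes the batch.
// Assumption: a closed batch cannot be committed again.
func (b *memBatch) Commit() error {
	if b.closed {
		return errors.New("batch already committed")
	}
	for k := range b.deletes {
		delete(b.store, k)
	}
	for k, v := range b.puts {
		b.store[k] = v
	}
	b.closed = true
	return nil
}

func main() {
	store := map[string][]byte{"stale": []byte("x")}
	b := newMemBatch(store)
	b.Put(TKey("a"), []byte("1"))
	b.Delete(TKey("stale"))
	if err := b.Commit(); err != nil {
		fmt.Println("commit failed:", err)
	}
	fmt.Println(len(store), string(store["a"]))
}
```

Because nothing like Clear() exists, a caller that wants to abandon a batch simply drops it without calling Commit.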

type Chunk

type Chunk struct {
	*ChunkOp
	*TKeyValue
}

Chunk is the unit passed down channels to chunk handlers. Chunks can be passed from lower-level database access functions to type-specific chunk processing.

type ChunkFunc

type ChunkFunc func(*Chunk) error

ChunkFunc is a function that accepts a Chunk.

type ChunkOp

type ChunkOp struct {
	Op interface{}
	Wg *sync.WaitGroup
}

ChunkOp is a type-specific operation with an optional WaitGroup to sync mapping before reduce.
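
The interplay of Chunk, ChunkOp and ChunkFunc can be sketched as a small map-then-reduce. The types are redeclared locally so the example stands alone; sumOp, countBytes and totalBytes are hypothetical names, and the convention that the handler calls Wg.Done() is an assumption for this sketch rather than a documented rule.

```go
package main

import (
	"fmt"
	"sync"
)

// Local copies of the package types so the sketch compiles on its own.
type TKey []byte

type TKeyValue struct {
	K TKey
	V []byte
}

type ChunkOp struct {
	Op interface{}
	Wg *sync.WaitGroup
}

type Chunk struct {
	*ChunkOp
	*TKeyValue
}

type ChunkFunc func(*Chunk) error

// sumOp is a hypothetical type-specific operation that accumulates value sizes.
type sumOp struct {
	mu    sync.Mutex
	total int
}

// countBytes is a ChunkFunc. It adds the chunk's value length to the sumOp
// and signals completion on the optional WaitGroup.
func countBytes(c *Chunk) error {
	if c.Wg != nil {
		defer c.Wg.Done()
	}
	op, ok := c.Op.(*sumOp)
	if !ok {
		return fmt.Errorf("unexpected op type %T", c.Op)
	}
	op.mu.Lock()
	op.total += len(c.V)
	op.mu.Unlock()
	return nil
}

var _ ChunkFunc = countBytes

// totalBytes maps countBytes over all pairs concurrently, waits for every
// handler to finish, then reads the reduced result.
func totalBytes(kvs []TKeyValue) int {
	op := &sumOp{}
	chunkOp := &ChunkOp{Op: op, Wg: new(sync.WaitGroup)}
	for i := range kvs {
		chunkOp.Wg.Add(1)
		// Handler errors are ignored here to keep the sketch short.
		go countBytes(&Chunk{ChunkOp: chunkOp, TKeyValue: &kvs[i]})
	}
	chunkOp.Wg.Wait()
	return op.total
}

func main() {
	kvs := []TKeyValue{
		{K: TKey("a"), V: []byte("123")},
		{K: TKey("b"), V: []byte("45")},
	}
	fmt.Println(totalBytes(kvs))
}
```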

type Closer

type Closer interface {
	Close()
}

type Context

type Context interface {
	// VersionID returns the local version ID of the DAG node being operated on.
	// If not versioned, the version is the root ID.
	VersionID() dvid.VersionID

	// ConstructKey takes a type-specific key component, and generates a
	// namespaced key that fits with the DVID-wide key space partitioning.
	ConstructKey(TKey) Key

	// TKeyFromKey returns the type-specific component of the key.
	TKeyFromKey(Key) (TKey, error)

	// KeyRange returns the minimum and maximum keys for this context.
	KeyRange() (min, max Key)

	// String prints a description of the Context
	String() string

	// Returns a sync.Mutex specific to this context.
	Mutex() *sync.Mutex

	// Versioned is true if this Context is also a VersionedCtx.
	Versioned() bool
	// contains filtered or unexported methods
}

Context allows encapsulation of data that defines the partitioning of the DVID key space. To prevent conflicting implementations, Context is an opaque interface type that requires use of an implementation from the storage package, either directly or through embedding.

For a description of Go language opaque types, see the following:

http://www.onebigfluke.com/2014/04/gos-power-is-in-emergent-behavior.html
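
As a rough illustration of the opaque-interface technique, the sketch below uses a trimmed-down, hypothetical Context with a single unexported method. All names here (sealed, baseContext, labelContext, describe) are invented for the example and do not reflect the package's actual unexported methods.

```go
package main

import "fmt"

// Context is a hypothetical, trimmed-down opaque interface. Because sealed
// is unexported, a type declared in another package cannot implement it
// directly; it can only satisfy Context by embedding a type from this package.
type Context interface {
	String() string
	sealed()
}

// baseContext plays the role of the package-provided implementation.
type baseContext struct{ name string }

func (c *baseContext) String() string { return c.name }
func (c *baseContext) sealed()        {}

// labelContext shows the embedding route: it inherits sealed() from
// *baseContext and overrides only the exported behavior.
type labelContext struct {
	*baseContext
	label string
}

func (c *labelContext) String() string { return c.name + "/" + c.label }

// describe accepts any Context, including embedders.
func describe(ctx Context) string { return "context " + ctx.String() }

func main() {
	base := &baseContext{name: "data"}
	fmt.Println(describe(base))
	fmt.Println(describe(&labelContext{baseContext: base, label: "seg"}))
}
```

In a single file everything lives in one package, so the restriction is not enforced here; it takes effect once Context and baseContext are exported from a separate package.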

type DataContext

type DataContext struct {
	// contains filtered or unexported fields
}

DataContext supports both unversioned and versioned data persistence.

func NewDataContext

func NewDataContext(data dvid.Data, versionID dvid.VersionID) *DataContext

NewDataContext provides a way for datatypes to create a Context that adheres to DVID key space partitioning. Since Context and VersionedCtx interfaces are opaque, i.e., can only be implemented within package storage, we force compatible implementations to embed DataContext and initialize it via this function.

func (*DataContext) ClientFromKey

func (ctx *DataContext) ClientFromKey(key Key) (dvid.ClientID, error)

func (*DataContext) ConstructKey

func (ctx *DataContext) ConstructKey(tk TKey) Key

func (*DataContext) DataName

func (ctx *DataContext) DataName() dvid.InstanceName

func (*DataContext) InstanceID

func (ctx *DataContext) InstanceID() dvid.InstanceID

func (*DataContext) InstanceVersion

func (ctx *DataContext) InstanceVersion() dvid.InstanceVersion

func (*DataContext) KeyRange

func (ctx *DataContext) KeyRange() (min, max Key)

KeyRange returns the min and max full keys. The DataContext can have any version since min/max keys for a data instance is independent of the current context's version.

func (*DataContext) MaxVersionKey

func (ctx *DataContext) MaxVersionKey(tk TKey) (Key, error)

Returns upper bound key for versions of given byte slice key representation.

func (*DataContext) MinVersionKey

func (ctx *DataContext) MinVersionKey(tk TKey) (Key, error)

Returns lower bound key for versions of given byte slice key representation.

func (*DataContext) Mutex

func (ctx *DataContext) Mutex() *sync.Mutex

func (*DataContext) String

func (ctx *DataContext) String() string

func (*DataContext) TKeyFromKey

func (ctx *DataContext) TKeyFromKey(key Key) (TKey, error)

TKeyFromKey returns a type-specific key from a full key. Any DataContext is sufficient as receiver.

func (*DataContext) TombstoneKey

func (ctx *DataContext) TombstoneKey(tk TKey) Key

func (*DataContext) UnversionedKey added in v0.7.1

func (ctx *DataContext) UnversionedKey(tk TKey) (Key, dvid.VersionID, error)

UnversionedKey returns an unversioned Key and the version id as separate components. This can be useful for storage systems like column stores where the row key is the unversioned Key and the column qualifier is the version id.

func (*DataContext) VersionFromKey

func (ctx *DataContext) VersionFromKey(key Key) (dvid.VersionID, error)

VersionFromKey returns a version ID from a full key. Any DataContext is sufficient as receiver.

func (*DataContext) VersionID

func (ctx *DataContext) VersionID() dvid.VersionID

func (*DataContext) Versioned

func (ctx *DataContext) Versioned() bool

Versioned returns false. This can be overridden by embedding DataContext in structures that will support the VersionedCtx interface.

type DataStoreType

type DataStoreType uint8

DataStoreType describes the semantics of a particular data store.

const (
	UnknownData DataStoreType = iota
	MetaData
	Mutable
	Immutable
)

type Engine

type Engine interface {
	fmt.Stringer

	// GetName returns a simple identifier like "basholeveldb", "kvautobus" or "bigtable".
	GetName() string

	// GetSemVer returns the semantic versioning info.
	GetSemVer() semver.Version
}

Engine implementations can fulfill a variety of interfaces, which can be checked at runtime with type assertions, e.g., myGetter, ok := myEngine.(OrderedKeyValueGetter). Data types can issue a warning at init time if the backend doesn't support required interfaces, or they can choose to implement multiple ways of handling data. Each Engine implementation should call storage.RegisterEngine() to register its availability.
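
A capability check of this kind might look like the sketch below. Engine is trimmed to avoid the semver dependency, RepairableEngine follows the declaration in this package, and plainEngine, fixableEngine and the repair helper are hypothetical stand-ins.

```go
package main

import "fmt"

// Engine is trimmed for this sketch; the real interface also has GetSemVer.
type Engine interface {
	fmt.Stringer
	GetName() string
}

// RepairableEngine mirrors the optional interface declared in this package.
type RepairableEngine interface {
	Engine
	Repair(path string) error
}

// plainEngine is a hypothetical engine without repair support.
type plainEngine struct{}

func (plainEngine) String() string  { return "plain in-memory engine" }
func (plainEngine) GetName() string { return "plain" }

// fixableEngine is a hypothetical engine that records repair requests.
type fixableEngine struct{ repaired []string }

func (e *fixableEngine) String() string  { return "fixable engine" }
func (e *fixableEngine) GetName() string { return "fixable" }
func (e *fixableEngine) Repair(path string) error {
	e.repaired = append(e.repaired, path)
	return nil
}

// repair uses a type assertion to find out whether the engine supports the
// optional interface, and returns an error if it does not.
func repair(e Engine, path string) error {
	r, ok := e.(RepairableEngine)
	if !ok {
		return fmt.Errorf("engine %q does not support repair", e.GetName())
	}
	return r.Repair(path)
}

func main() {
	fmt.Println(repair(plainEngine{}, "/tmp/db"))
	fmt.Println(repair(&fixableEngine{}, "/tmp/db"))
}
```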

func GetEngine

func GetEngine(name string) Engine

GetEngine returns an Engine of the given name.

type GraphDB

type GraphDB interface {
	GraphSetter
	GraphGetter

	Close()
}

GraphDB defines the entire interface that a graph database should support

func GraphStore

func GraphStore() (GraphDB, error)

type GraphGetter

type GraphGetter interface {
	// GetVertices retrieves a list of all vertices in the graph
	GetVertices(ctx Context) ([]dvid.GraphVertex, error)

	// GetEdges retrieves a list of all edges in the graph
	GetEdges(ctx Context) ([]dvid.GraphEdge, error)

	// GetVertex retrieves a vertex given a vertex id
	GetVertex(ctx Context, id dvid.VertexID) (dvid.GraphVertex, error)

	// GetEdge retrieves the edge between two vertex IDs
	GetEdge(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID) (dvid.GraphEdge, error)

	// GetVertexProperty retrieves a property as a byte array given a vertex id
	GetVertexProperty(ctx Context, id dvid.VertexID, key string) ([]byte, error)

	// GetEdgeProperty retrieves a property as a byte array given an edge defined by id1 and id2
	GetEdgeProperty(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID, key string) ([]byte, error)
}

GraphGetter defines operations that retrieve information from a graph

type GraphSetter

type GraphSetter interface {
	// CreateGraph creates a graph with the given context.
	CreateGraph(ctx Context) error

	// AddVertex inserts an id of a given weight into the graph
	AddVertex(ctx Context, id dvid.VertexID, weight float64) error

	// AddEdge adds an edge between vertex id1 and id2 with the provided weight
	AddEdge(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID, weight float64) error

	// SetVertexWeight modifies the weight of vertex id
	SetVertexWeight(ctx Context, id dvid.VertexID, weight float64) error

	// SetEdgeWeight modifies the weight of the edge defined by id1 and id2
	SetEdgeWeight(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID, weight float64) error

	// SetVertexProperty adds arbitrary data to a vertex using a string key
	SetVertexProperty(ctx Context, id dvid.VertexID, key string, value []byte) error

	// SetEdgeProperty adds arbitrary data to an edge using a string key
	SetEdgeProperty(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID, key string, value []byte) error

	// RemoveVertex removes the vertex and its properties and edges
	RemoveVertex(ctx Context, id dvid.VertexID) error

	// RemoveEdge removes the edge defined by id1 and id2 and its properties
	RemoveEdge(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID) error

	// RemoveGraph removes the entire graph including all vertices, edges, and properties
	RemoveGraph(ctx Context) error

	// RemoveVertexProperty removes the property data for vertex id at the key
	RemoveVertexProperty(ctx Context, id dvid.VertexID, key string) error

	// RemoveEdgeProperty removes the property data for edge at the key
	RemoveEdgeProperty(ctx Context, id1 dvid.VertexID, id2 dvid.VertexID, key string) error
}

GraphSetter defines operations that modify a graph

type ImmutableEngine

type ImmutableEngine interface {
	Engine
	NewImmutableStore(dvid.EngineConfig) (db ImmutableStorer, created bool, err error)
}

type ImmutableStorer

type ImmutableStorer interface {
	OrderedKeyValueDB
}

ImmutableStorer is the interface for immutable data storage, i.e., data stored in interior nodes of the DAG or from datatypes known to operate with immutable data, particularly during ingestion (e.g., grayscale image data). The implementation of an ImmutableStorer benefits from knowing its data is immutable, allowing better caching and handling of distributed data without worry of coordination.

NOTE: Although the interface is identical to a mutable store, its use requires an immutable pattern, e.g., calling a second Put() on the same key should return an error.
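
The write-once pattern in the note can be sketched with a small in-memory store. This is an illustration only: writeOnceStore is a hypothetical type, and its Put and Get omit the Context argument that the real interfaces take.

```go
package main

import "fmt"

// writeOnceStore is a hypothetical in-memory store that rejects overwrites.
type writeOnceStore struct {
	data map[string][]byte
}

func newWriteOnceStore() *writeOnceStore {
	return &writeOnceStore{data: make(map[string][]byte)}
}

// Put stores v under k, or returns an error if k was already written.
func (s *writeOnceStore) Put(k, v []byte) error {
	if _, found := s.data[string(k)]; found {
		return fmt.Errorf("key %q already written to immutable store", k)
	}
	s.data[string(k)] = append([]byte(nil), v...)
	return nil
}

// Get returns the value for k, or nil if k has not been written.
func (s *writeOnceStore) Get(k []byte) ([]byte, error) {
	return s.data[string(k)], nil
}

func main() {
	s := newWriteOnceStore()
	fmt.Println(s.Put([]byte("block-1"), []byte("grayscale")))
	fmt.Println(s.Put([]byte("block-1"), []byte("changed")))
}
```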

func ImmutableStore

func ImmutableStore() (ImmutableStorer, error)

type Key

type Key []byte

Key is the slice of bytes used to store a value in a storage engine. It internally represents a number of DVID features like a data instance ID, version, and a type-specific key component.

func UnversionedKey added in v0.7.1

func UnversionedKey(k Key) (isMetadata bool, unversioned Key, v dvid.VersionID, err error)

UnversionedKey returns key components depending on whether the passed Key is a metadata or data key. If metadata, it returns the key and a 0 version id. If it is a data key, it returns the unversioned portion of the Key and the version id.

func (Key) IsTombstone

func (k Key) IsTombstone() bool

IsTombstone returns true if the given key is a tombstone key.
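
Given that versioned keys end in MarkData or MarkTombstone, a tombstone test plausibly reduces to inspecting the final byte. The sketch below assumes exactly that; isTombstone is a hypothetical free function, and the real method may perform additional checks.

```go
package main

import "fmt"

// Marker bytes as declared in this package's constants.
const (
	MarkData      = 0x03
	MarkTombstone = 0x4F
)

type Key []byte

// isTombstone reports whether the last byte of k is MarkTombstone.
// Assumption: this is all the check needs; empty keys are never tombstones.
func isTombstone(k Key) bool {
	return len(k) > 0 && k[len(k)-1] == MarkTombstone
}

func main() {
	live := Key{0x01, 0x02, MarkData}
	dead := Key{0x01, 0x02, MarkTombstone}
	fmt.Println(isTombstone(live), isTombstone(dead))
}
```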

type KeyChan

type KeyChan chan Key

KeyChan is a channel of full (not type-specific) keys.

type KeyValue

type KeyValue struct {
	K Key
	V []byte
}

KeyValue stores a full storage key-value pair.

type KeyValueBatcher

type KeyValueBatcher interface {
	NewBatch(ctx Context) Batch
}

KeyValueBatcher allows batching operations into an atomic update or transaction. For example: "Atomic Updates" in http://leveldb.googlecode.com/svn/trunk/doc/index.html

type KeyValueDB

type KeyValueDB interface {
	fmt.Stringer
	KeyValueGetter
	KeyValueSetter
	Closer
}

KeyValueDB provides an interface to the simplest storage API: a key-value store.

type KeyValueGetter

type KeyValueGetter interface {
	// Get returns a value given a key.
	Get(ctx Context, k TKey) ([]byte, error)
}

type KeyValueSetter

type KeyValueSetter interface {
	// Put writes a value with given key in a possibly versioned context.
	Put(Context, TKey, []byte) error

	// Delete deletes a key-value pair so that subsequent Get on the key returns nil.
	// For versioned data in mutable stores, Delete() will create a tombstone for the version
	// unlike RawDelete or DeleteAll.
	Delete(Context, TKey) error

	// RawPut is a low-level function that puts a key-value pair using full keys.
	// This can be used in conjunction with RawRangeQuery.
	RawPut(Key, []byte) error

	// RawDelete is a low-level function.  It deletes a key-value pair using full keys
	// without any context.  This can be used in conjunction with RawRangeQuery.
	RawDelete(Key) error
}

type MetaDataEngine

type MetaDataEngine interface {
	Engine
	NewMetaDataStore(dvid.EngineConfig) (db MetaDataStorer, created bool, err error)
}

type MetaDataStorer

type MetaDataStorer interface {
	OrderedKeyValueDB
}

MetaDataStorer is the interface for storing DVID datastore metadata like the repositories, associated DAGs, and datatype-specific data that needs to be coordinated across front-end DVID servers. It is characterized by the following: (1) not big data, (2) ideally in memory, (3) strongly consistent across all DVID processes, e.g., all front-end DVID apps. Of all types of persistence, it should have lowest latency and smallest storage capacity.

func MetaDataStore

func MetaDataStore() (MetaDataStorer, error)

type MetadataContext

type MetadataContext struct{}

MetadataContext is an implementation of Context for metadata persistence.

func NewMetadataContext

func NewMetadataContext() MetadataContext

func (MetadataContext) ConstructKey

func (ctx MetadataContext) ConstructKey(tk TKey) Key

func (MetadataContext) KeyRange

func (ctx MetadataContext) KeyRange() (min, max Key)

func (MetadataContext) Mutex

func (ctx MetadataContext) Mutex() *sync.Mutex

func (MetadataContext) String

func (ctx MetadataContext) String() string

func (MetadataContext) TKeyFromKey

func (ctx MetadataContext) TKeyFromKey(key Key) (TKey, error)

func (MetadataContext) VersionID

func (ctx MetadataContext) VersionID() dvid.VersionID

func (MetadataContext) Versioned

func (ctx MetadataContext) Versioned() bool

type MutableEngine

type MutableEngine interface {
	Engine
	NewMutableStore(dvid.EngineConfig) (db MutableStorer, created bool, err error)
}

func GetMutableEngine

func GetMutableEngine() MutableEngine

GetMutableEngine returns a Mutable engine if one has been compiled in. Returns nil if none are available.

type MutableStorer

type MutableStorer interface {
	OrderedKeyValueDB
}

MutableStorer is the interface for mutable data storage, i.e., data stored in uncommitted leaves of the DAG. The presumption is that the Mutable store will be smaller than an Immutable store, trading off $$/TB for speed to handle distributed transactions and other thorny issues when dealing with distributed, mutable data.

func MutableStore

func MutableStore() (MutableStorer, error)

type Op

type Op uint8

Op enumerates the types of single key-value operations that can be performed for storage engines.

const (
	GetOp Op = iota
	PutOp
	DeleteOp
	CommitOp
)

type OrderedKeyValueDB

type OrderedKeyValueDB interface {
	fmt.Stringer
	OrderedKeyValueGetter
	OrderedKeyValueSetter
	Closer
}

OrderedKeyValueDB adds range queries and range puts to a base KeyValueDB.

type OrderedKeyValueGetter

type OrderedKeyValueGetter interface {
	KeyValueGetter

	// GetRange returns a range of values spanning (kStart, kEnd) keys.
	GetRange(ctx Context, kStart, kEnd TKey) ([]*TKeyValue, error)

	// KeysInRange returns a range of type-specific key components spanning (kStart, kEnd).
	KeysInRange(ctx Context, kStart, kEnd TKey) ([]TKey, error)

	// SendKeysInRange sends a range of keys down a key channel.
	SendKeysInRange(ctx Context, kStart, kEnd TKey, ch KeyChan) error

	// ProcessRange sends a range of type key-value pairs to type-specific chunk handlers,
	// allowing chunk processing to be concurrent with key-value sequential reads.
	// Since the chunks are typically sent during sequential read iteration, the
	// receiving function can be organized as a pool of chunk handling goroutines.
	// See datatype/imageblk.ProcessChunk() for an example.
	ProcessRange(ctx Context, kStart, kEnd TKey, op *ChunkOp, f ChunkFunc) error

	// RawRangeQuery sends a range of full keys.  This is to be used for low-level data
	// retrieval like DVID-to-DVID communication and should not be used by data type
	// implementations if possible because each version's key-value pairs are sent
	// without filtering by the current version and its ancestor graph.  A nil is sent
	// down the channel when the range is complete.
	RawRangeQuery(kStart, kEnd Key, keysOnly bool, out chan *KeyValue) error
}
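
The nil-terminated channel protocol described for RawRangeQuery can be sketched with an in-memory stand-in. The function rawRangeQuery below is hypothetical and takes the data as an extra argument; treating the range as inclusive of both kStart and kEnd is an assumption, as is sending pairs in key order.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

type Key []byte

type KeyValue struct {
	K Key
	V []byte
}

// rawRangeQuery is a hypothetical stand-in for an engine's RawRangeQuery.
// It sends every pair with kStart <= K <= kEnd in key order, then a nil.
func rawRangeQuery(kvs []KeyValue, kStart, kEnd Key, keysOnly bool, out chan *KeyValue) error {
	sorted := append([]KeyValue(nil), kvs...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i].K, sorted[j].K) < 0
	})
	for i := range sorted {
		kv := sorted[i] // fresh copy per iteration, safe to send by pointer
		if bytes.Compare(kv.K, kStart) < 0 || bytes.Compare(kv.K, kEnd) > 0 {
			continue
		}
		if keysOnly {
			kv.V = nil
		}
		out <- &kv
	}
	out <- nil // signals that the range is complete
	return nil
}

// collect drains the channel until the terminating nil arrives.
func collect(kvs []KeyValue, kStart, kEnd Key) []string {
	out := make(chan *KeyValue, 4)
	go rawRangeQuery(kvs, kStart, kEnd, false, out)
	var keys []string
	for {
		kv := <-out
		if kv == nil {
			return keys
		}
		keys = append(keys, string(kv.K))
	}
}

func main() {
	kvs := []KeyValue{{K: Key("a")}, {K: Key("c")}, {K: Key("b")}, {K: Key("d")}}
	fmt.Println(collect(kvs, Key("b"), Key("c")))
}
```

Note that the consumer stops on the nil value rather than on channel close, so the same channel could in principle be reused for a later query.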

type OrderedKeyValueSetter

type OrderedKeyValueSetter interface {
	KeyValueSetter

	// Put key-value pairs.  Note that it could be more efficient to use the Batcher
	// interface so you don't have to create and keep a slice of KeyValue.  Some
	// databases like leveldb will copy on batch put anyway.
	PutRange(Context, []TKeyValue) error

	// DeleteRange removes all key-value pairs with keys in the given range.
	// For versioned data in mutable stores, this will create tombstones in the version
	// unlike RawDelete or DeleteAll.
	DeleteRange(ctx Context, kStart, kEnd TKey) error

	// DeleteAll removes all key-value pairs for the context.  If allVersions is true,
	// then all versions of the data instance are deleted.
	DeleteAll(ctx Context, allVersions bool) error
}

type RepairableEngine

type RepairableEngine interface {
	Engine
	Repair(path string) error
}

type Requirements

type Requirements struct {
	BulkIniter bool
	BulkWriter bool
	Batcher    bool
	GraphDB    bool
}

Requirements lists required backend interfaces for a type.

type TKey

type TKey []byte

TKey is the type-specific component of a key. Each data instance will insert key components into a class of TKey.

func MaxTKey

func MaxTKey(class TKeyClass) TKey

MaxTKey returns the lexicographically largest TKey for this class.

func MinTKey

func MinTKey(class TKeyClass) TKey

MinTKey returns the lexicographically smallest TKey for this class.

func NewTKey

func NewTKey(class TKeyClass, tkey []byte) TKey

func (TKey) ClassBytes

func (tk TKey) ClassBytes(class TKeyClass) ([]byte, error)

ClassBytes returns the bytes for a class of TKey, suitable for decoding by each data instance.

type TKeyClass

type TKeyClass byte

TKeyClass partitions the TKey space into a maximum of 256 classes.
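
One way a single class byte could partition the TKey space is to store it as the first byte of every TKey. The sketch below assumes that layout, which is not confirmed by this documentation; newTKey and classBytes are hypothetical lowercase counterparts of NewTKey and ClassBytes.

```go
package main

import (
	"errors"
	"fmt"
)

type TKeyClass byte

type TKey []byte

// newTKey builds a TKey as the class byte followed by the payload.
// Assumption: the class occupies exactly the first byte.
func newTKey(class TKeyClass, payload []byte) TKey {
	tk := make(TKey, 1+len(payload))
	tk[0] = byte(class)
	copy(tk[1:], payload)
	return tk
}

// classBytes returns the payload of tk if it belongs to the given class.
func classBytes(tk TKey, class TKeyClass) ([]byte, error) {
	if len(tk) == 0 {
		return nil, errors.New("empty type-specific key")
	}
	if TKeyClass(tk[0]) != class {
		return nil, fmt.Errorf("key has class %d, expected %d", tk[0], class)
	}
	return tk[1:], nil
}

func main() {
	const blockClass TKeyClass = 7 // hypothetical class chosen by a datatype
	tk := newTKey(blockClass, []byte("xyz"))
	payload, err := classBytes(tk, blockClass)
	fmt.Println(string(payload), err)
}
```

With this layout, all keys of one class are contiguous under the lexicographic ordering, which is what makes per-class range queries possible.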

type TKeyValue

type TKeyValue struct {
	K TKey
	V []byte
}

TKeyValue stores a type-specific key-value pair.

func (TKeyValue) Deserialize

func (kv TKeyValue) Deserialize(uncompress bool) (TKeyValue, error)

Deserialize returns a type key-value pair where the value has been deserialized.

type TKeyValues

type TKeyValues []TKeyValue

TKeyValues is a slice of type key-value pairs that can be sorted.

func (TKeyValues) Len

func (kv TKeyValues) Len() int

func (TKeyValues) Less

func (kv TKeyValues) Less(i, j int) bool

func (TKeyValues) Swap

func (kv TKeyValues) Swap(i, j int)

type TestableEngine

type TestableEngine interface {
	Engine
	Delete(dvid.EngineConfig) error
}

TestableEngine is an engine that allows creation and deletion of some data using a name.

func GetTestableEngine

func GetTestableEngine() TestableEngine

GetTestableEngine returns a Mutable engine that is also Testable (has ability to create and delete database).

type VersionedCtx

type VersionedCtx interface {
	Context

	// UnversionedKey returns an unversioned Key and the version id
	// as separate components.  This can be useful for storage systems
	// like column stores where the row key is the unversioned Key and
	// the column qualifier is the version id.
	UnversionedKey(TKey) (Key, dvid.VersionID, error)

	// TombstoneKey takes a type-specific key component and returns a key that
	// signals a deletion of any ancestor values.  The returned key must have
	// as its last byte storage.MarkTombstone.
	TombstoneKey(TKey) Key

	// Returns lower bound key for versions.
	MinVersionKey(TKey) (Key, error)

	// Returns upper bound key for versions.
	MaxVersionKey(TKey) (Key, error)

	// VersionedKeyValue returns the key-value pair corresponding to this key's version
	// given a list of key-value pairs across many versions.  If no suitable key-value
	// pair is found, nil is returned.
	VersionedKeyValue([]*KeyValue) (*KeyValue, error)
}

VersionedCtx extends a Context with the minimal functions necessary to handle versioning in storage engines.
