diskv

v0.0.0-...-a444aaf
Published: Oct 21, 2012 License: MIT Imports: 12 Imported by: 0

README

What is diskv?

Diskv (disk-vee) is a simple, persistent key-value store written in the Go language. It starts with an incredibly simple API for storing arbitrary data on a filesystem by key, and builds several layers of performance-enhancing abstraction on top. The end result is a conceptually simple, but highly performant, disk-backed storage system.

Installing

Install Go 1, either from source or with a prepackaged binary.

$ go get -v github.com/peterbourgon/diskv

Usage

package main

import (
	"fmt"
	"github.com/peterbourgon/diskv"
)

func main() {
	// Simple transform function to put all of the data files into the root directory.
	flatTransform := func(s string) []string { return []string{""} }
	
	// Initialize a new diskv store, rooted at "my-data-dir", with a 1MB cache.
	d := diskv.New(diskv.Options{
		BasePath:     "my-data-dir",
		Transform:    flatTransform,
		CacheSizeMax: 1024 * 1024, 
	})

	// Write three bytes to the key "alpha".
	key := "alpha"
	d.Write(key, []byte{'1', '2', '3'})

	// Read the value back out of the store.
	value, _ := d.Read(key)
	fmt.Printf("%v\n", value)

	// Erase the key+value from the store (and the disk).
	d.Erase(key)
}

More complex examples can be found in the "examples" subdirectory.

Basic idea

At its core, diskv is a map from a key (string) to arbitrary data ([]byte). Each value is written to its own file on disk, named after the key. Where that file is stored is determined by a user-provided TransformFunction, which takes a key and returns a slice ([]string) of path segments naming the directories, under the base path, where the key's file will be stored. The simplest TransformFunction,

func SimpleTransform(key string) []string {
    return []string{} // equivalent to []string{""}
}

places all keys in the same base directory. The design is inspired by Redis diskstore; a TransformFunction which emulates the default diskstore behavior is available in the content-addressable-storage example.
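As an illustration of a transform that does spread keys across subdirectories, here is a minimal sketch. The helper blockTransform is hypothetical and not part of diskv; the TransformFunction type is redeclared locally so the snippet runs without the package.

```go
package main

import "fmt"

// TransformFunction mirrors the documented diskv type:
// a key goes in, a list of directory segments comes out.
type TransformFunction func(key string) []string

// blockTransform is a hypothetical helper (not part of diskv). It returns a
// transform that cuts a key into directory segments of blockSize bytes and
// drops any shorter trailing remainder.
func blockTransform(blockSize int) TransformFunction {
	return func(key string) []string {
		segments := []string{}
		if blockSize <= 0 {
			return segments // degenerate size: keep everything in the base directory
		}
		for i := 0; i+blockSize <= len(key); i += blockSize {
			segments = append(segments, key[i:i+blockSize])
		}
		return segments
	}
}

func main() {
	transform := blockTransform(2)
	fmt.Println(transform("abcdef")) // [ab cd ef]
	fmt.Println(transform("abcde"))  // [ab cd]
}
```

With a transform like this, keys sharing a prefix end up in the same directory, which keeps any single directory from growing too large.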

Probably the most important design principle behind diskv is that your data is always flatly available on the disk. diskv will never do anything that would prevent you from accessing, copying, backing up, or otherwise interacting with your data via common UNIX command-line tools.

Adding a cache

An in-memory caching layer is provided by combining the BasicStore functionality with a simple map structure, and keeping it up-to-date as appropriate. Because Go maps are not safe for concurrent use, the map is guarded by a RWMutex to provide safe concurrent access.
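The map-plus-RWMutex pattern can be sketched as follows. This is a simplified, hypothetical stand-in (the cache type and its methods are invented for illustration) and does not reproduce diskv's internals or its cache size limit.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical in-memory layer: a plain map guarded by a RWMutex.
type cache struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newCache() *cache {
	return &cache{data: make(map[string][]byte)}
}

// get takes the read lock, so many readers may proceed concurrently.
func (c *cache) get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	val, ok := c.data[key]
	return val, ok
}

// set takes the write lock, excluding readers and other writers.
func (c *cache) set(key string, val []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = val
}

// remove drops a key, as would be needed when it is erased from disk.
func (c *cache) remove(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.data, key)
}

func main() {
	c := newCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			c.set(fmt.Sprintf("key-%d", n), []byte{byte(n)})
		}(i)
	}
	wg.Wait()

	val, ok := c.get("key-2")
	fmt.Println(val, ok) // [2] true
}
```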

Adding order

diskv is a key-value store and therefore inherently unordered. An ordering system can be injected into the store by passing something which satisfies the diskv.Index interface. (A default implementation, using Petar Maymounkov's LLRB tree, is provided.) Basically, diskv keeps an ordered (by a user-provided Less function) index of the keys, which can be queried.

Adding compression

Something which implements the diskv.Compression interface may be passed during store creation, so that all Writes and Reads are filtered through a compression/decompression pipeline. Several default implementations, using stdlib compression algorithms, are provided.
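A custom type only needs the two methods of the documented Compression interface. The sketch below uses the standard library's compress/gzip; the type name gzipCompression and the roundTrip helper are hypothetical, and this is not necessarily how the package's own helpers are written.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipCompression is a hypothetical type providing the Compress and
// Decompress methods that the documented Compression interface requires.
type gzipCompression struct{}

// Compress copies src into dst through a gzip writer.
func (gzipCompression) Compress(dst io.Writer, src io.Reader) error {
	zw := gzip.NewWriter(dst)
	if _, err := io.Copy(zw, src); err != nil {
		zw.Close()
		return err
	}
	return zw.Close() // Close flushes the remaining compressed data.
}

// Decompress copies gzip-compressed src into dst as plain bytes.
func (gzipCompression) Decompress(dst io.Writer, src io.Reader) error {
	zr, err := gzip.NewReader(src)
	if err != nil {
		return err
	}
	defer zr.Close()
	_, err = io.Copy(dst, zr)
	return err
}

// roundTrip compresses input and decompresses it again.
func roundTrip(input []byte) ([]byte, error) {
	var packed, unpacked bytes.Buffer
	c := gzipCompression{}
	if err := c.Compress(&packed, bytes.NewReader(input)); err != nil {
		return nil, err
	}
	if err := c.Decompress(&unpacked, &packed); err != nil {
		return nil, err
	}
	return unpacked.Bytes(), nil
}

func main() {
	out, err := roundTrip([]byte("hello, diskv"))
	fmt.Println(string(out), err) // hello, diskv <nil>
}
```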

Future plans

  • Needs plenty of robust testing: huge datasets, etc...
  • More thorough benchmarking
  • Your suggestions for use-cases I haven't thought of

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Compression

type Compression interface {
	Compress(dst io.Writer, src io.Reader) error
	Decompress(dst io.Writer, src io.Reader) error
}

Compression defines an interface that Diskv uses to implement compression of data. You may define these methods on your own type, or use one of the NewCompression helpers.

func NewGzipCompression

func NewGzipCompression() Compression

func NewGzipCompressionLevel

func NewGzipCompressionLevel(level int) Compression

func NewZlibCompression

func NewZlibCompression() Compression

func NewZlibCompressionLevel

func NewZlibCompressionLevel(level int) Compression

func NewZlibCompressionLevelDict

func NewZlibCompressionLevelDict(level int, dict []byte) Compression

type Diskv

type Diskv struct {
	sync.RWMutex
	Options
	// contains filtered or unexported fields
}

Diskv implements the Diskv interface. You shouldn't construct Diskv structures directly; instead, use the New constructor.

func New

func New(options Options) *Diskv

New returns an initialized Diskv structure, ready to use. If the directory named by Options.BasePath already contains data, it will be accessible, but not yet cached.

func (*Diskv) Erase

func (d *Diskv) Erase(key string) error

Erase synchronously erases the given key from the disk and the cache.

func (*Diskv) EraseAll

func (d *Diskv) EraseAll() error

EraseAll will delete all of the data from the store, both in the cache and on the disk. Note that EraseAll doesn't distinguish diskv-related data from non-diskv-related data. Care should be taken to always specify a diskv base directory that is exclusively for diskv data.

func (*Diskv) Keys

func (d *Diskv) Keys() <-chan string

Keys returns a channel that will yield every key accessible by the store in undefined order.

func (*Diskv) Read

func (d *Diskv) Read(key string) ([]byte, error)

Read reads the key and returns the value. If the key is available in the cache, Read won't touch the disk. If the key is not in the cache, Read will have the side-effect of lazily caching the value.

func (*Diskv) Write

func (d *Diskv) Write(key string, val []byte) error

Write synchronously writes the key-value pair to disk, making it immediately available for reads. Write relies on the filesystem to perform an eventual sync to physical media. If you need stronger guarantees, use WriteAndSync.

func (*Diskv) WriteAndSync

func (d *Diskv) WriteAndSync(key string, val []byte) error

WriteAndSync does the same thing as Write, but explicitly calls Sync on the relevant file descriptor.
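The difference between the two calls comes down to whether Sync is called on the file before it is closed. The following sketch shows that distinction with plain os calls; writeFile and demo are hypothetical helpers, not diskv's implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFile is a hypothetical helper. With sync == false the data may sit in
// the operating system's buffers for a while; with sync == true, File.Sync
// asks the OS to commit it to stable storage before returning.
func writeFile(path string, val []byte, sync bool) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o666)
	if err != nil {
		return err
	}
	if _, err := f.Write(val); err != nil {
		f.Close()
		return err
	}
	if sync {
		if err := f.Sync(); err != nil {
			f.Close()
			return err
		}
	}
	return f.Close()
}

// demo writes a value with an explicit sync into a temporary directory and
// reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "sync-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "alpha")
	if err := writeFile(path, []byte("123"), true); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	fmt.Println(demo())
}
```

Either way the value is readable immediately after the call returns; the sync only affects durability across a crash or power loss, at the cost of a slower write.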

type GenericCompression

type GenericCompression struct {
	// contains filtered or unexported fields
}

A GenericCompression implements Diskv's Compression interface. Users must supply it with two functions: a WriterFunc, which wraps an io.Writer with a compression layer, and a ReaderFunc, which wraps an io.Reader with a decompression layer.

func NewCompression

func NewCompression(wf WriterFunc, rf ReaderFunc) *GenericCompression

NewCompression returns a GenericCompression from the passed Writer and Reader functions, which you may supply directly.

You may also use one of the NewCompression helpers, which automatically provide Writer and Reader functions for some of the stdlib compression algorithms.
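As a sketch of what a hand-written pair of functions might look like, the example below wraps compress/flate. The WriterFunc and ReaderFunc types are redeclared locally to match the documented signatures, and flateWriter, flateReader, and roundTrip are hypothetical names; with the real package, the two functions would be handed to NewCompression.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// Local copies of the documented function types.
type WriterFunc func(w io.Writer) (io.WriteCloser, error)
type ReaderFunc func(r io.Reader) (io.ReadCloser, error)

// flateWriter wraps w with a DEFLATE compressor.
func flateWriter(w io.Writer) (io.WriteCloser, error) {
	zw, err := flate.NewWriter(w, flate.BestSpeed)
	if err != nil {
		return nil, err
	}
	return zw, nil
}

// flateReader wraps r with a DEFLATE decompressor.
func flateReader(r io.Reader) (io.ReadCloser, error) {
	return flate.NewReader(r), nil
}

// roundTrip pushes input through wf and pulls it back out through rf.
func roundTrip(wf WriterFunc, rf ReaderFunc, input []byte) ([]byte, error) {
	var packed bytes.Buffer
	zw, err := wf(&packed)
	if err != nil {
		return nil, err
	}
	if _, err := zw.Write(input); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // Close flushes pending output.
		return nil, err
	}

	zr, err := rf(&packed)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	out, err := roundTrip(flateWriter, flateReader, []byte("hello"))
	fmt.Println(string(out), err) // hello <nil>
}
```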

func (*GenericCompression) Compress

func (c *GenericCompression) Compress(dst io.Writer, src io.Reader) error

func (*GenericCompression) Decompress

func (c *GenericCompression) Decompress(dst io.Writer, src io.Reader) error

type Index

type Index interface {
	Initialize(less LessFunction, keys <-chan string)
	Insert(key string)
	Delete(key string)
	Keys(from string, n int) <-chan string
}

Index is a generic interface for things that can provide an ordered list of keys.
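To make the contract concrete, here is a minimal sketch of a type satisfying this interface with a sorted slice instead of a tree. The sliceIndex type and collect helper are hypothetical, the interface and LessFunction are redeclared locally, and the Keys semantics (start after 'from', at most n keys) are an assumption based on the LLRBIndex documentation.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type LessFunction func(string, string) bool

// Index is a local copy of the documented interface.
type Index interface {
	Initialize(less LessFunction, keys <-chan string)
	Insert(key string)
	Delete(key string)
	Keys(from string, n int) <-chan string
}

// sliceIndex is a hypothetical Index backed by a sorted slice.
// Initialize must be called before any other method.
type sliceIndex struct {
	mu   sync.RWMutex
	less LessFunction
	keys []string
}

var _ Index = (*sliceIndex)(nil) // compile-time interface check

// Initialize discards any existing state and loads all keys from the channel.
func (i *sliceIndex) Initialize(less LessFunction, keys <-chan string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	i.less, i.keys = less, nil
	for k := range keys {
		i.keys = append(i.keys, k)
	}
	sort.Slice(i.keys, func(a, b int) bool { return less(i.keys[a], i.keys[b]) })
}

// position returns the first slot whose key does not sort before key.
func (i *sliceIndex) position(key string) int {
	return sort.Search(len(i.keys), func(n int) bool { return !i.less(i.keys[n], key) })
}

// Insert adds key at its sorted position unless it is already present.
func (i *sliceIndex) Insert(key string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	pos := i.position(key)
	if pos < len(i.keys) && i.keys[pos] == key {
		return
	}
	i.keys = append(i.keys, "")
	copy(i.keys[pos+1:], i.keys[pos:])
	i.keys[pos] = key
}

// Delete removes key if it is present.
func (i *sliceIndex) Delete(key string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	pos := i.position(key)
	if pos < len(i.keys) && i.keys[pos] == key {
		i.keys = append(i.keys[:pos], i.keys[pos+1:]...)
	}
}

// Keys yields at most n keys in order, starting after from when it is non-empty.
func (i *sliceIndex) Keys(from string, n int) <-chan string {
	if n < 0 {
		n = 0
	}
	i.mu.RLock()
	start := 0
	if from != "" {
		start = sort.Search(len(i.keys), func(k int) bool { return i.less(from, i.keys[k]) })
	}
	end := start + n
	if end > len(i.keys) {
		end = len(i.keys)
	}
	page := append([]string(nil), i.keys[start:end]...)
	i.mu.RUnlock()

	out := make(chan string, len(page))
	for _, k := range page {
		out <- k
	}
	close(out)
	return out
}

// collect drains a key channel into a slice.
func collect(ch <-chan string) []string {
	var keys []string
	for k := range ch {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	seed := make(chan string, 3)
	for _, k := range []string{"c", "a", "b"} {
		seed <- k
	}
	close(seed)

	idx := &sliceIndex{}
	idx.Initialize(func(a, b string) bool { return a < b }, seed)
	idx.Insert("d")
	idx.Delete("a")
	fmt.Println(collect(idx.Keys("b", 2))) // [c d]
}
```

A sorted slice makes inserts and deletes linear in the number of keys, which is one reason a balanced tree is the more natural default for large stores.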

type LLRBIndex

type LLRBIndex struct {
	sync.RWMutex
	// contains filtered or unexported fields
}

LLRBIndex is an implementation of the Index interface using Petar Maymounkov's LLRB tree.

func (*LLRBIndex) Delete

func (i *LLRBIndex) Delete(key string)

Delete removes the given key (only) from the LLRB tree.

func (*LLRBIndex) Initialize

func (i *LLRBIndex) Initialize(less LessFunction, keys <-chan string)

Initialize populates the LLRB tree with data from the keys channel, according to the passed less function. It's destructive to the LLRBIndex.

func (*LLRBIndex) Insert

func (i *LLRBIndex) Insert(key string)

Insert inserts the given key (only) into the LLRB tree.

func (*LLRBIndex) Keys

func (i *LLRBIndex) Keys(from string, n int) <-chan string

Keys yields a maximum of n keys on the returned channel, in order. It's designed to effect a simple "pagination" of keys.

If the passed 'from' key is empty, Keys will return the first n keys. If the passed 'from' key is non-empty, the first key on the returned channel will be the key that immediately follows the passed key, in key order.

type LessFunction

type LessFunction func(string, string) bool

LessFunction is used to initialize an Index of keys in a specific order.

type Options

type Options struct {
	BasePath     string
	Transform    TransformFunction
	CacheSizeMax uint64 // bytes
	PathPerm     os.FileMode
	FilePerm     os.FileMode

	Index     Index
	IndexLess LessFunction

	Compression Compression
}

Options define a set of properties that dictate Diskv behavior. All values are optional.

type ReaderFunc

type ReaderFunc func(r io.Reader) (io.ReadCloser, error)

ReaderFunc yields an io.ReadCloser which should perform decompression from the passed io.Reader.

type TransformFunction

type TransformFunction func(s string) []string

A TransformFunction transforms a key into a slice of strings, with each element in the slice representing a directory in the file path where the key's entry will eventually be stored.

For example, if a TransformFunction transforms "abcdef" to ["ab", "cde", "f"], the final location of the data file will be <basedir>/ab/cde/f/abcdef
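The path construction described above can be sketched with path/filepath. The helper keyPath is hypothetical and only illustrates the documented layout; the printed paths assume a Unix-like system.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// keyPath is a hypothetical helper that joins the base directory, the
// segments produced by a transform, and the key itself into a file path.
// filepath.Join skips empty segments, so []string{""} and an empty slice
// both place the file directly in the base directory.
func keyPath(baseDir string, segments []string, key string) string {
	parts := append([]string{baseDir}, segments...)
	parts = append(parts, key)
	return filepath.Join(parts...)
}

func main() {
	fmt.Println(keyPath("my-data-dir", []string{"ab", "cde", "f"}, "abcdef")) // my-data-dir/ab/cde/f/abcdef
	fmt.Println(keyPath("my-data-dir", []string{""}, "alpha"))                // my-data-dir/alpha
}
```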

type WriterFunc

type WriterFunc func(w io.Writer) (io.WriteCloser, error)

WriterFunc yields an io.WriteCloser which should perform compression into the passed io.Writer.

Directories

Path Synopsis
examples
