storage

package
v0.14.4
Published: Jan 11, 2022 License: MIT Imports: 4 Imported by: 23

Documentation

Overview

The storage package contains interfaces for storage systems, and functions for using them.

These are very low-level storage primitives. The interfaces here deal only with raw keys and raw binary blob values.

In IPLD, you can often avoid dealing with storage directly yourself, and instead use linking.LinkSystem to handle serialization, hashing, and storage all at once. (You'll hand some values that match interfaces from this package to LinkSystem when configuring it.) It's probably best to work at that level and above as much as possible. If you do need to interact with storage more directly, then read on.

The most basic APIs are ReadableStorage and WritableStorage. When writing code that works with storage systems, you should expect to see these two interfaces in almost all situations: user code is recommended to think in terms of these types; functions provided by this package accept parameters of these types and work on them; implementations are expected to provide these types first; and any new library code is recommended to keep with the theme and use these interfaces preferentially.

Users should decide which actions they want to take using a storage system, find the appropriate function in this package (n.b., package function -- not a method on an interface! You will likely find one of each, with the same name: pick the package function!), and use that function, providing it the storage system (e.g. either ReadableStorage, WritableStorage, or sometimes just Storage) as a parameter. That function will then use feature-detection (checking for matches to the other, more advanced and more specific interfaces in this package) and choose the best way to satisfy the request; or, if it can't feature-detect any relevant features, the function will fall back to synthesizing the requested behavior out of the most basic API. Using the package functions, and letting them do the feature detection for you, should provide the most consistent user experience and minimize the amount of work you need to do. (Bonus: It also gives us a convenient place to smooth out any future library migrations for you!)
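
For example, a minimal sketch of that usage pattern might look like the following. The roundTrip helper and its parameters are hypothetical, and the import path is assumed to be that of this module; any store satisfying both base interfaces would work.

package example

import (
	"context"

	"github.com/ipld/go-ipld-prime/storage"
)

// roundTrip stores a value and immediately reads it back, going through the
// package-level functions rather than calling methods on the store directly,
// so that any fastpaths the store offers are detected and used automatically.
func roundTrip(ctx context.Context, store interface {
	storage.ReadableStorage
	storage.WritableStorage
}, key string, value []byte) ([]byte, error) {
	if err := storage.Put(ctx, store, key, value); err != nil {
		return nil, err
	}
	return storage.Get(ctx, store, key)
}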

If writing new APIs that are meant to work reusably for any storage implementation: APIs should usually be designed around accepting ReadableStorage or WritableStorage as parameters (depending on which direction of data flow the API concerns), and use the other interfaces (e.g. StreamingReadableStorage) internally thereafter for feature detection. For APIs which may relate to either a read or a write direction of data flow, the Storage interface may be used in order to define a function that should accept either ReadableStorage or WritableStorage. In other words: when writing reusable APIs, one should follow the same pattern as this package's own functions do.
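
A sketch of that pattern, under the assumption that the function below (openForRead is a hypothetical name) is part of your own reusable API:

package example

import (
	"bytes"
	"context"
	"io"

	"github.com/ipld/go-ipld-prime/storage"
)

// openForRead accepts the broad ReadableStorage type in its signature, and
// only internally feature-detects the more specific StreamingReadableStorage.
// (In practice, storage.GetStream already does exactly this for you.)
func openForRead(ctx context.Context, store storage.ReadableStorage, key string) (io.ReadCloser, error) {
	if streamer, ok := store.(storage.StreamingReadableStorage); ok {
		return streamer.GetStream(ctx, key) // fastpath: the store supports streaming reads.
	}
	data, err := store.Get(ctx, key) // fallback: load the whole value into memory.
	if err != nil {
		return nil, err
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}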

Similarly, implementers of storage systems should always implement either ReadableStorage or WritableStorage first. Only after satisfying one of those should the implementation then move on to further supporting additional interfaces in this package (all of which are meant to support feature-detection). Beyond one of the basic two, all the other interfaces are optional: you can implement them if you want to advertise additional features, or advertise fastpaths that your storage system supports; but you don't have to implement any of those additional interfaces if you don't want to, or if your implementation can't offer useful fastpaths for them.

Storage systems as described by this package are allowed to make some interesting trades. Generally, write operations are allowed to be first-write-wins. Furthermore, there is no requirement that the system return an error if a subsequent write to the same key has different content. These rules are reasonable for a content-addressed storage system, and allow great optimizations to be made.

Note that all of the interfaces in this package only use types that are present in the golang standard library. This is intentional, and was done very carefully. If implementing a storage system, you should find it possible to do so *without* importing this package. Because only standard library types are present in the interface contracts, it's possible to implement types that align with the interfaces without referring to them.
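
For illustration, here is a hypothetical in-memory implementation that satisfies Storage, ReadableStorage, and WritableStorage structurally while importing nothing beyond the standard library (the package and type names are placeholders):

package mystore

import (
	"context"
	"fmt"
)

// MemStore is a toy store; it satisfies the storage interfaces structurally
// without ever importing the storage package.
type MemStore struct {
	data map[string][]byte
}

func (s *MemStore) Has(ctx context.Context, key string) (bool, error) {
	_, ok := s.data[key]
	return ok, nil
}

func (s *MemStore) Get(ctx context.Context, key string) ([]byte, error) {
	v, ok := s.data[key]
	if !ok {
		return nil, fmt.Errorf("key not found: %q", key)
	}
	out := make([]byte, len(v)) // Get returns a safe copy.
	copy(out, v)
	return out, nil
}

func (s *MemStore) Put(ctx context.Context, key string, content []byte) error {
	if s.data == nil {
		s.data = make(map[string][]byte)
	}
	if _, exists := s.data[key]; exists {
		return nil // first-write-wins, as permitted by the storage contract.
	}
	cpy := make([]byte, len(content))
	copy(cpy, content)
	s.data[key] = cpy
	return nil
}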

Note that where keys are discussed in this package, they use the golang string type -- however, they may be binary. (The golang string type allows arbitrary bytes in general, and here we both use that and explicitly disavow the usual "norm" that the string type implies UTF-8. This is roughly the same as the practical truth that appears when using e.g. os.OpenFile and other similar functions.) If you are creating a storage implementation where the underlying medium does not support arbitrary binary keys, then it is strongly recommended that your storage implementation support being configured with an "escaping function", which should typically simply be of the form `func(string) string`. Additionally, your storage implementation's documentation should clearly describe its internal limitations, so that users have enough information to write an escaping function which maps their domain into the domain your storage implementation can handle.
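
As a hypothetical example of such an escaping function, a backend limited to filename-safe keys might be configured with a hex encoder:

package example

import "encoding/hex"

// escapeKey maps arbitrary binary keys into the hex alphabet, which is safe
// for most filesystems and databases. The inverse mapping is hex.DecodeString.
func escapeKey(raw string) string {
	return hex.EncodeToString([]byte(raw))
}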

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Get added in v0.14.0

func Get(ctx context.Context, store ReadableStorage, key string) ([]byte, error)

func GetStream added in v0.14.0

func GetStream(ctx context.Context, store ReadableStorage, key string) (io.ReadCloser, error)

GetStream returns a streaming reader. This function will feature-detect the StreamingReadableStorage interface, and use that if possible; otherwise it will fall back to using basic ReadableStorage methods transparently (at the cost of loading all the data into memory at once and up front).
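
A brief usage sketch (the copyValueTo helper is hypothetical):

package example

import (
	"context"
	"io"

	"github.com/ipld/go-ipld-prime/storage"
)

// copyValueTo streams a stored value into dst, without necessarily holding the
// whole value in memory if the store supports streaming reads.
func copyValueTo(ctx context.Context, store storage.ReadableStorage, key string, dst io.Writer) (int64, error) {
	r, err := storage.GetStream(ctx, store, key)
	if err != nil {
		return 0, err
	}
	defer r.Close()
	return io.Copy(dst, r)
}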

func Has added in v0.14.0

func Has(ctx context.Context, store Storage, key string) (bool, error)

func Peek added in v0.14.0

func Peek(ctx context.Context, store ReadableStorage, key string) ([]byte, io.Closer, error)

Peek accesses the same data as Get, but indicates that the caller promises not to mutate the returned byte slice. (By contrast, Get is expected to return a safe copy.) This function will feature-detect the PeekableStorage interface, and use that if possible; otherwise it will fall back to using basic ReadableStorage methods transparently (meaning that a no-copy fastpath simply wasn't available).

An io.Closer is returned along with the byte slice. The Close method on the Closer must be called when the caller is done with the byte slice; otherwise, memory leaks may result. (Implementers of this interface may be expecting to reuse the byte slice after Close is called.)
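
A usage sketch (the hashValue helper is hypothetical); note the deferred Close, and that the shared slice is only read, never mutated:

package example

import (
	"context"
	"crypto/sha256"

	"github.com/ipld/go-ipld-prime/storage"
)

// hashValue reads a value without copying it, when the store allows that.
func hashValue(ctx context.Context, store storage.ReadableStorage, key string) ([sha256.Size]byte, error) {
	data, closer, err := storage.Peek(ctx, store, key)
	if err != nil {
		return [sha256.Size]byte{}, err
	}
	defer closer.Close() // required; the store may want to reuse the buffer.
	return sha256.Sum256(data), nil // read-only use of the shared slice.
}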

func Put added in v0.14.0

func Put(ctx context.Context, store WritableStorage, key string, content []byte) error

func PutStream added in v0.14.0

func PutStream(ctx context.Context, store WritableStorage) (io.Writer, func(key string) error, error)

PutStream returns an io.Writer and a WriteCommitter callback. (See the docs on StreamingWritableStorage.PutStream for details on what that means.) This function will feature-detect the StreamingWritableStorage interface, and use that if possible; otherwise it will fall back to using basic WritableStorage methods transparently (at the cost of needing to buffer all of the content in memory while the write is in progress).
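
A usage sketch, where the key is derived from a hash of the content; the putHashed helper and its keying scheme are illustrative assumptions, not part of this package:

package example

import (
	"context"
	"crypto/sha256"
	"io"

	"github.com/ipld/go-ipld-prime/storage"
)

// putHashed streams content into the store and commits it under a key derived
// from the content's sha256 hash. (Keys may be arbitrary binary strings.)
func putHashed(ctx context.Context, store storage.WritableStorage, content io.Reader) (string, error) {
	w, commit, err := storage.PutStream(ctx, store)
	if err != nil {
		return "", err
	}
	h := sha256.New()
	if _, err := io.Copy(io.MultiWriter(w, h), content); err != nil {
		_ = commit("") // zero-string key: abandon the write and clean up.
		return "", err
	}
	key := string(h.Sum(nil))
	return key, commit(key) // commit closes the stream and stores under key.
}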

func PutVec added in v0.14.0

func PutVec(ctx context.Context, store WritableStorage, key string, blobVec [][]byte) error

PutVec is an API for writing several slices of bytes at once into storage. This kind of API can be useful for maximizing performance in scenarios where data is already loaded completely into memory, but scattered across several non-contiguous regions. This function will feature-detect the VectorWritableStorage interface, and use that if possible; otherwise it will fall back to using StreamingWritableStorage, or failing that, fall further back to basic WritableStorage methods, transparently.
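
A brief usage sketch (the putParts helper is hypothetical):

package example

import (
	"context"

	"github.com/ipld/go-ipld-prime/storage"
)

// putParts stores a value assembled from several already-in-memory slices
// without first concatenating them into one contiguous buffer.
func putParts(ctx context.Context, store storage.WritableStorage, key string, header, body, footer []byte) error {
	return storage.PutVec(ctx, store, key, [][]byte{header, body, footer})
}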

Types

type PeekableStorage added in v0.14.0

type PeekableStorage interface {
	Peek(ctx context.Context, key string) ([]byte, io.Closer, error)
}

PeekableStorage is a feature-detection interface which a storage implementation can use to advertise the ability to look at a piece of data, and return it in shared memory. The PeekableStorage.Peek method is essentially the same as ReadableStorage.Get -- but by contrast, ReadableStorage is expected to return a safe copy. PeekableStorage can be used when the caller knows they will not mutate the returned slice.

An io.Closer is returned along with the byte slice. The Close method on the Closer must be called when the caller is done with the byte slice; otherwise, memory leaks may result. (Implementers of this interface may be expecting to reuse the byte slice after Close is called.)

Note that Peek does not mean the caller may use the returned byte slice freely: mutating it, or retaining it after calling Close, may result in storage corruption or other undefined behavior.

type ReadableStorage added in v0.14.0

type ReadableStorage interface {
	Storage
	Get(ctx context.Context, key string) ([]byte, error)
}

ReadableStorage is one of the base interfaces in the storage APIs; a storage system should implement at minimum either this, or WritableStorage, depending on whether it supports reading or writing. (One type may also implement both.)

ReadableStorage implementations must at minimum provide a way to ask the store whether it contains a key, and a way to ask it to return the value.

Library functions that work with storage systems should take either ReadableStorage, or WritableStorage, or Storage, as a parameter, depending on whether the function deals with the reading of data, or the writing of data, or may be found on either, respectively.

An implementation of ReadableStorage may also support many other methods -- for example, it may additionally match StreamingReadableStorage, or yet more interfaces. Usually, you should not need to check for this yourself; instead, you should use the storage package's functions to ask for the desired mode of interaction. Those functions will accept any ReadableStorage as an argument, detect the additional interfaces automatically and use them if present, or fall back to synthesizing equivalent behaviors from the basics. See the package-wide docs for more discussion of this design.

type Storage added in v0.14.0

type Storage interface {
	Has(ctx context.Context, key string) (bool, error)
}

Storage is one of the base interfaces in the storage APIs. This type is rarely seen by itself alone (and never useful to implement alone), but is included in both ReadableStorage and WritableStorage. Because it's included in both of the other two useful base interfaces, you can define functions that work on either one of them by using this type to describe your function's parameters.

Library functions that work with storage systems should take either ReadableStorage, or WritableStorage, or Storage, as a parameter, depending on whether the function deals with the reading of data, or the writing of data, or may be found on either, respectively.

An implementation of Storage may also support many other methods. At the very least, it should also support one of either ReadableStorage or WritableStorage. It may support even more interfaces beyond that for additional feature detection. See the package-wide docs for more discussion of this design.

The Storage interface does not include much of use in itself alone, because ReadableStorage and WritableStorage are meant to be the most used types in declarations. However, it does include the Has function, because that function is reasonable to require ubiquitously from all implementations, and it serves as a reasonable marker to make sure the Storage interface is not trivially satisfied.
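
For example, a helper that only needs existence checks can accept the Storage type and thereby work with readable and writable stores alike (allPresent is a hypothetical name):

package example

import (
	"context"

	"github.com/ipld/go-ipld-prime/storage"
)

// allPresent reports whether every key is already present in the store.
func allPresent(ctx context.Context, store storage.Storage, keys []string) (bool, error) {
	for _, k := range keys {
		ok, err := storage.Has(ctx, store, k)
		if err != nil || !ok {
			return false, err
		}
	}
	return true, nil
}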

type StreamingReadableStorage added in v0.14.0

type StreamingReadableStorage interface {
	GetStream(ctx context.Context, key string) (io.ReadCloser, error)
}

type StreamingWritableStorage added in v0.14.0

type StreamingWritableStorage interface {
	PutStream(ctx context.Context) (io.Writer, func(key string) error, error)
}

StreamingWritableStorage is a feature-detection interface that advertises support for streaming writes. It is normal for APIs to use WritableStorage in their exported API surface, and then internally check if that value implements StreamingWritableStorage if they wish to use streaming operations.

Streaming writes can be preferable to the all-in-one style of writing of WritableStorage.Put, because with streaming writes, the high water mark for memory usage can be kept lower. On the other hand, streaming writes can incur slightly higher allocation counts, which may cause some performance overhead when handling many small writes in sequence.

The PutStream function returns three parameters: an io.Writer (as you'd expect), another function, and an error. The function returned is called a "WriteCommitter". The final error value is as usual: it will contain an error value if the write could not be begun. ("WriteCommitter" will be referred to as such throughout the docs, but we don't give it a named type -- unfortunately, this is important, because we don't want to force implementers of storage systems to import this package just for a type name.)

The WriteCommitter function should be called when you're done writing, at which time you give it the key you want to commit the data as. It will close and flush any streams, and commit the data to its final location under this key. (If the io.Writer is also an io.WriteCloser, it is not necessary to call Close on it, because using the WriteCommitter will do this for you.)

Because these storage APIs are meant to work well for content-addressed systems, the key argument is not provided at the start of the write -- it's provided at the end. (This gives the opportunity to be computing a hash of the contents as they're written to the stream.)

As a special case, giving a key of the zero string to the WriteCommitter will instead close and remove any temp files, and store nothing. An error may still be returned from the WriteCommitter if there is an error cleaning up any temporary storage buffers that were created.

Continuing to write to the io.Writer after calling the WriteCommitter function will result in errors. Calling the WriteCommitter function more than once will result in errors.
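
As a rough illustration of these semantics from the implementer's side, the hypothetical MemStore sketched earlier in this overview could support streaming writes by buffering in memory. (A real implementation would more likely stream to a temporary file, and would also guard against writes made to the io.Writer after committing.)

package mystore

import (
	"bytes"
	"context"
	"errors"
	"io"
)

// PutStream buffers writes in memory; the returned WriteCommitter enforces
// single use, treats the zero-string key as "discard", and otherwise commits
// the buffered bytes via Put.
func (s *MemStore) PutStream(ctx context.Context) (io.Writer, func(string) error, error) {
	buf := &bytes.Buffer{}
	committed := false
	commit := func(key string) error {
		if committed {
			return errors.New("WriteCommitter called more than once")
		}
		committed = true
		if key == "" {
			return nil // zero-string key: discard the buffered data, store nothing.
		}
		return s.Put(ctx, key, buf.Bytes())
	}
	return buf, commit, nil
}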

type VectorWritableStorage added in v0.14.0

type VectorWritableStorage interface {
	PutVec(ctx context.Context, key string, blobVec [][]byte) error
}

VectorWritableStorage is an API for writing several slices of bytes at once into storage. It's meant as a feature-detection interface; not all storage implementations need to provide this feature. This kind of API can be useful for maximizing performance in scenarios where data is already loaded completely into memory, but scattered across several non-contiguous regions.

type WritableStorage added in v0.14.0

type WritableStorage interface {
	Storage
	Put(ctx context.Context, key string, content []byte) error
}

WritableStorage is one of the base interfaces in the storage APIs; a storage system should implement at minimum either this, or ReadableStorage, depending on whether it supports reading or writing. (One type may also implement both.)

WritableStorage implementations must at minimum provide a way to ask the store whether it contains a key, and a way to put a value into storage indexed by some key.

Library functions that work with storage systems should take either ReadableStorage, or WritableStorage, or Storage, as a parameter, depending on whether the function deals with the reading of data, or the writing of data, or may be found on either, respectively.

An implementation of WritableStorage may also support many other methods -- for example, it may additionally match StreamingWritableStorage, or yet more interfaces. Usually, you should not need to check for this yourself; instead, you should use the storage package's functions to ask for the desired mode of interaction. Those functions will accept any WritableStorage as an argument, detect the additional interfaces automatically and use them if present, or fall back to synthesizing equivalent behaviors from the basics. See the package-wide docs for more discussion of this design.

Directories

Path Synopsis
bsadapter module
bsrvadapter module
dsadapter module
This package contains several useful readymade sharding functions, which should plug nicely into most storage implementations.
