items

package
v0.0.0-...-bd9ca74
Published: Oct 27, 2021 License: Apache-2.0 Imports: 14 Imported by: 0

Documentation

Overview

Package items provides routines to manipulate and serialize items. An item is defined as a collection of blobs and versions. Blobs are immutable binary blocks of data. Blobs may be added to an item over time, and may be deleted, but they cannot be otherwise altered once added. Versions provide a way to associate labels with blobs. Versions may be added to an item over time, but cannot be deleted.

Items are serialized into a sequence of bundles. Each bundle file is immutable. However, like blobs, bundle files may be deleted.

Items do not share blobs between them. Bundle files do not contain information from more than one item.

A Store provides the logic to serialize items to bundles and to deserialize them again. It wraps a store.Store interface. Its operations will block. A cache may be added to hold item metadata. The data retrieval paths of a Store are safe to access from multiple goroutines. However, an open Writer for any given item should only be used by one goroutine at a time.

Both blobs and versions are numbered sequentially starting from 1.

An item's metadata and blobs are grouped into bundles, which are zip files. Each bundle contains the complete up-to-date metadata information on an item, as well as zero or more blobs. Bundles are numbered, but they should not be assumed to be numbered sequentially since deletions may remove some bundles. Bundle numbers for an item start from 1. The largest numbered bundle must contain the most up-to-date information on the item, including the (correct!) blob to bundle mapping.

There is no relationship between a bundle number and the versions of an item.
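As a small illustration of the bundle numbering rule, the following sketch picks the bundle that should hold the current metadata when the numbering has gaps. The helper latestBundle is hypothetical and is not part of this package.

```go
package main

import "fmt"

// latestBundle returns the largest bundle number in nums, or 0 if nums is
// empty. Bundle numbers start from 1, so 0 can signal "no bundles".
// This is an illustrative helper, not a function of the items package.
func latestBundle(nums []int) int {
	largest := 0
	for _, n := range nums {
		if n > largest {
			largest = n
		}
	}
	return largest
}

func main() {
	// Bundles 2 and 3 were removed by deletions, leaving gaps.
	fmt.Println(latestBundle([]int{1, 4, 5}))
}
```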

Constants

const (
	// MB is the number of bytes in one megabyte (we use base 10)
	MB = 1000000

	// IdealBundleSize is a cutoff, and new bundle files will be started
	// once the current one grows past this. (Only checked when starting
	// a new blob.)
	IdealBundleSize = 500 * MB
)
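Because the cutoff is only checked when a new blob starts, a bundle can end up larger than IdealBundleSize. The sketch below illustrates that behavior under the assumption that the comparison is a strict "greater than"; shouldStartNewBundle and planBundles are hypothetical helpers, not part of this package.

```go
package main

import "fmt"

const (
	MB              = 1000000  // base-10 megabyte, as documented
	IdealBundleSize = 500 * MB // same value as the documented constant
)

// shouldStartNewBundle reports whether a new bundle should be started
// before the next blob is written. The strict comparison is an assumption.
func shouldStartNewBundle(bytesInCurrent int64) bool {
	return bytesInCurrent > IdealBundleSize
}

// planBundles assigns each blob size to a bundle number. The cutoff is
// checked only when a blob begins, so a bundle may exceed the cutoff.
func planBundles(sizes []int64) []int {
	bundle, written := 1, int64(0)
	out := make([]int, 0, len(sizes))
	for _, sz := range sizes {
		if shouldStartNewBundle(written) {
			bundle++
			written = 0
		}
		out = append(out, bundle)
		written += sz
	}
	return out
}

func main() {
	// The second blob still lands in bundle 1, pushing it to 700 MB.
	fmt.Println(planBundles([]int64{400 * MB, 300 * MB, 10 * MB}))
}
```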

Variables

var (
	// ErrNoItem occurs when an item is requested for which no bundle
	// files could be found in the backing store.
	ErrNoItem = errors.New("no item, bad item id")
	// ErrNoStore occurs when useStore has been set to false, i.e. the
	// backing store is unavailable.
	ErrNoStore = errors.New("no item, item store unavailable")
	// ErrDeleted occurs when content that has been deleted is requested
	ErrDeleted = errors.New("Blob has been deleted")
)
var (
	// ErrNotFound means a stream inside a zip file with the given name
	// could not be found.
	ErrNotFound = errors.New("stream not found")
)
var Nullcache cache

The Nullcache is an ItemCache which does not store anything.

Functions

func OpenBundleStream

func OpenBundleStream(s store.Store, key, sname string) (io.ReadCloser, error)

OpenBundleStream returns an io.ReadCloser containing the contents of the stream sname inside the bundle having the given key in the given store.

func ValidateWriteBlob

func ValidateWriteBlob(itemID string, blob *Blob, result Results) error

ValidateWriteBlob checks that the correct number of bytes was written and that the written hashes match the expected hashes. Returns nil if everything is good. Otherwise returns an error.
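The check described above amounts to comparing a byte count and two hashes. A minimal sketch of that comparison follows; the types expected and written are local stand-ins for the relevant fields of Blob and Results, and the error messages are invented for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// expected and written are local stand-ins, not the package's types.
type expected struct {
	Size   int64
	MD5    []byte
	SHA256 []byte
}

type written struct {
	BytesWritten  int64
	WrittenMD5    []byte
	WrittenSHA256 []byte
}

// validate returns nil when the size and both hashes agree, and an
// error describing the first mismatch otherwise.
func validate(itemID string, want expected, got written) error {
	if got.BytesWritten != want.Size {
		return fmt.Errorf("item %s: wrote %d bytes, expected %d",
			itemID, got.BytesWritten, want.Size)
	}
	if !bytes.Equal(got.WrittenMD5, want.MD5) {
		return fmt.Errorf("item %s: MD5 mismatch", itemID)
	}
	if !bytes.Equal(got.WrittenSHA256, want.SHA256) {
		return fmt.Errorf("item %s: SHA256 mismatch", itemID)
	}
	return nil
}

func main() {
	want := expected{Size: 3, MD5: []byte{1}, SHA256: []byte{2}}
	good := written{BytesWritten: 3, WrittenMD5: []byte{1}, WrittenSHA256: []byte{2}}
	bad := written{BytesWritten: 2, WrittenMD5: []byte{1}, WrittenSHA256: []byte{2}}
	fmt.Println(validate("abc", want, good))
	fmt.Println(validate("abc", want, bad))
}
```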

Types

type BagreaderCloser

type BagreaderCloser struct {
	*bagit.Reader // the zip reader
	// contains filtered or unexported fields
}

A BagreaderCloser is a bagit.Reader which will also close the underlying file.

func OpenBundle

func OpenBundle(s store.Store, key string) (*BagreaderCloser, error)

OpenBundle opens the provided key in the given store, and wraps it in a bagit reader.

func (*BagreaderCloser) Close

func (bg *BagreaderCloser) Close() error

Close flushes the reader and closes the underlying io.Closer.

type Blob

type Blob struct {
	ID       BlobID
	SaveDate time.Time
	Creator  string
	Size     int64 // logical size of associated content (i.e. before compression)

	// following valid if blob is NOT deleted
	Bundle   int    // which bundle file this blob is stored in, 0 if deleted
	MD5      []byte // unused if deleted
	SHA256   []byte // unused if deleted
	MimeType string // either empty or the mime type of this blob

	// following valid if blob is deleted
	DeleteDate time.Time // zero iff not deleted
	Deleter    string    // empty iff not deleted
	DeleteNote string    // optional note for deletion event
}

Blob records metadata for each blob.

type BlobID

type BlobID int

BlobID identifies a single blob within an item.

type BundleWriter

type BundleWriter struct {
	// contains filtered or unexported fields
}

BundleWriter helps with saving blobs into bundles, and with repackaging blobs when doing deletions. It keeps a reference to its source item, and will use that to save the item-info.json file when needed.

It is not goroutine safe. Make sure to call Close when finished.

func NewBundler

func NewBundler(s store.Store, item *Item) *BundleWriter

NewBundler starts a new bundle writer for the given item. More than one bundle file may be written. The advancement to a new bundle file happens either when the current one grows larger than IdealBundleSize, or when Next() is called.

func (*BundleWriter) Close

func (bw *BundleWriter) Close() error

Close writes out any final metadata and closes the current bundle.

func (*BundleWriter) CopyBundleExcept

func (bw *BundleWriter) CopyBundleExcept(src int, except []BlobID) error

CopyBundleExcept copies all the blobs in the bundle src, except for those in the list, into the current place in the bundle writer.
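The selection step implied here, deciding which blobs survive the copy, can be sketched as a simple filter. blobsToCopy is a hypothetical helper using plain ints in place of BlobID; the actual copying of data between bundles is not shown.

```go
package main

import "fmt"

// blobsToCopy returns the ids in bundle that are not listed in except,
// preserving their original order.
func blobsToCopy(bundle, except []int) []int {
	skip := make(map[int]bool, len(except))
	for _, id := range except {
		skip[id] = true
	}
	var keep []int
	for _, id := range bundle {
		if !skip[id] {
			keep = append(keep, id)
		}
	}
	return keep
}

func main() {
	// Blobs 2 and 4 are being deleted; 1 and 3 must be carried over.
	fmt.Println(blobsToCopy([]int{1, 2, 3, 4}, []int{2, 4}))
}
```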

func (*BundleWriter) CurrentBundle

func (bw *BundleWriter) CurrentBundle() int

CurrentBundle returns the id of the bundle being written to.

func (*BundleWriter) Next

func (bw *BundleWriter) Next() error

Next closes the current bundle, if any, and starts a new bundle file.

func (*BundleWriter) WriteBlob

func (bw *BundleWriter) WriteBlob(blob *Blob, r io.Reader) (Results, error)

WriteBlob writes the given blob into the bundle.

WriteBlob first sees if it needs to start a new bundle file based on the number of bytes already written into the current bundle. At the end of the call, CurrentBundle() returns the bundle the blob was written into.

If WrittenMD5 is empty, then the file was not created in the bundle.

The *Blob is not modified and no validation of the write is performed. Use ValidateWriteBlob() to do validation of the returned Results with the expected values in the *Blob.

type Item

type Item struct {
	ID        string
	MaxBundle int        // largest bundle id used by this item
	Blobs     []*Blob    // list of blobs, sorted by id
	Versions  []*Version // list of versions, sorted by id
}

An Item contains the information for a single item.

func (Item) BlobByExtendedSlot

func (item Item) BlobByExtendedSlot(slot string) BlobID

BlobByExtendedSlot returns the blob identifier for the given extended slot name. An extended slot name is a slot name with an optional "@nnn/" prefix, where nnn is the version number of the item to use (in decimal). If a version prefix is not present, the most recent version of the item is used. Like BlobByVersionSlot, 0 is returned if the slot path does not resolve to anything.
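To make the "@nnn/" syntax concrete, here is a sketch of how such a name could be split into a version number and a plain slot name. splitExtendedSlot is hypothetical, and its handling of malformed prefixes (treating the whole string as the slot name) is an assumption, not documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitExtendedSlot separates an optional "@nnn/" prefix from a slot name.
// version is 0 when no valid prefix is present, which a caller can treat
// as "use the most recent version".
func splitExtendedSlot(slot string) (version int, name string) {
	if !strings.HasPrefix(slot, "@") {
		return 0, slot
	}
	i := strings.Index(slot, "/")
	if i < 0 {
		return 0, slot
	}
	n, err := strconv.ParseUint(slot[1:i], 10, 31)
	if err != nil || n == 0 {
		return 0, slot // assumption: malformed prefix is part of the name
	}
	return int(n), slot[i+1:]
}

func main() {
	fmt.Println(splitExtendedSlot("@12/dir/file.txt"))
	fmt.Println(splitExtendedSlot("dir/file.txt"))
}
```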

func (Item) BlobByVersionSlot

func (item Item) BlobByVersionSlot(vid VersionID, slot string) BlobID

BlobByVersionSlot returns the blob corresponding to the given version identifier and slot name. It returns 0 if the (version id, slot) pair do not resolve to anything.
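The lookup can be pictured as a search over the version list followed by a map access, where a missing entry naturally yields 0. The sketch uses a local version type with int ids rather than the package's types.

```go
package main

import "fmt"

// version is a local stand-in for the documented Version type.
type version struct {
	ID    int
	Slots map[string]int
}

// blobByVersionSlot returns the blob id for (vid, slot), or 0 if either
// the version or the slot does not exist.
func blobByVersionSlot(versions []version, vid int, slot string) int {
	for _, v := range versions {
		if v.ID == vid {
			return v.Slots[slot] // a missing key yields the zero value 0
		}
	}
	return 0
}

func main() {
	versions := []version{
		{ID: 1, Slots: map[string]int{"readme": 1}},
		{ID: 2, Slots: map[string]int{"readme": 3, "data": 2}},
	}
	fmt.Println(blobByVersionSlot(versions, 2, "readme"))
	fmt.Println(blobByVersionSlot(versions, 1, "data"))
}
```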

type ItemCache

type ItemCache interface {
	// try to return an item record with the given id.
	// return nil if there is nothing matching in the cache.
	Lookup(id string) *Item

	Set(id string, item *Item)
}

An ItemCache defines the methods a Store will use to interact with a cache.

func NewMemoryCache

func NewMemoryCache() ItemCache

NewMemoryCache returns an empty ItemCache that keeps everything in memory and never evicts anything. It is probably only useful in tests.
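For readers who want to supply their own cache, the following sketch shows one way a never-evicting in-memory cache could satisfy an interface of this shape. The item, itemCache, and memoryCache names are local stand-ins, and the mutex is a design choice of this sketch; it does not describe how NewMemoryCache is implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// item is a local stand-in for the package's Item type.
type item struct{ ID string }

// itemCache mirrors the two documented ItemCache methods.
type itemCache interface {
	Lookup(id string) *item
	Set(id string, it *item)
}

// memoryCache keeps every entry in a map and never evicts.
type memoryCache struct {
	mu sync.RWMutex
	m  map[string]*item
}

var _ itemCache = (*memoryCache)(nil) // compile-time interface check

func newMemoryCache() *memoryCache {
	return &memoryCache{m: make(map[string]*item)}
}

// Lookup returns nil when nothing is cached under id.
func (c *memoryCache) Lookup(id string) *item {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.m[id]
}

// Set stores it under id, replacing any previous entry.
func (c *memoryCache) Set(id string, it *item) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[id] = it
}

func main() {
	c := newMemoryCache()
	fmt.Println(c.Lookup("abc") == nil)
	c.Set("abc", &item{ID: "abc"})
	fmt.Println(c.Lookup("abc").ID)
}
```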

type NoBlobError

type NoBlobError struct {
	ID  string
	BID BlobID
}

func (NoBlobError) Error

func (err NoBlobError) Error() string

type Results

type Results struct {
	BytesWritten  int64
	Bundle        int
	WrittenMD5    []byte
	WrittenSHA256 []byte
}

Results is used to return info from BundleWriter.WriteBlob(). Both WrittenMD5 and WrittenSHA256 are empty if nothing was written.

type Store

type Store struct {
	S store.Store // the underlying bundle store
	// contains filtered or unexported fields
}

A Store holds a collection of items.

func New

func New(s store.Store) *Store

New creates a new item store which writes its bundles to the given store.Store.

func NewWithCache

func NewWithCache(s store.Store, cache ItemCache) *Store

NewWithCache creates a new item store which caches the item metadata in the given cache. (Should be deprecated??)

func (*Store) Blob

func (s *Store) Blob(id string, bid BlobID) (io.ReadCloser, int64, error)

Blob returns an io.ReadCloser containing the given blob's contents and the blob's size. It will block until the item and blob are loaded from the backing store.

TODO: perhaps this should be moved to be a method on an Item*

func (*Store) BlobInfo

func (s *Store) BlobInfo(id string, bid BlobID) (*Blob, error)

BlobInfo returns a pointer to a Blob structure containing information on the given blob. It is like Blob() but doesn't recall the content from tape. Unlike Blob(), though, it will not return an error if the blob is deleted.

func (*Store) Item

func (s *Store) Item(id string) (*Item, error)

Item loads and returns an item's metadata info. This will block until the item is loaded.

func (*Store) List

func (s *Store) List() <-chan string

List returns a channel which will contain all of the item ids in the current store.
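A receive-only channel like this is usually consumed with a range loop. The sketch below assumes the channel is closed once all ids have been sent, which the documentation does not state; listIDs is a local producer standing in for Store.List.

```go
package main

import "fmt"

// listIDs sends each id on a channel and closes it when done.
// Closing the channel is an assumption about how List behaves.
func listIDs(ids []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, id := range ids {
			out <- id
		}
	}()
	return out
}

func main() {
	// The range loop ends when the producer closes the channel.
	for id := range listIDs([]string{"item-a", "item-b"}) {
		fmt.Println(id)
	}
}
```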

func (*Store) Open

func (s *Store) Open(id string, creator string) (*Writer, error)

Open opens the item id for writing. This will add a single new version to the item. New blobs can be written. Blobs can also be deleted (but that is not a quick operation).

The creator is the name of the agent performing these updates.

It is an error for more than one goroutine to open the same item at a time. This does not perform any locking itself.

func (*Store) SetCache

func (s *Store) SetCache(cache ItemCache)

SetCache will set the metadata cache used. It is intended to be used during initialization. It will cause a race condition if used while others are accessing this item store.

func (*Store) SetUseStore

func (s *Store) SetUseStore(value bool)

SetUseStore enables or disables access to the underlying store: true enables access, false disables it.

func (*Store) Validate

func (s *Store) Validate(id string) (nb int64, problems []string, err error)

Validate checks the given item. It returns the total number of bytes checksummed, a list of problems found (empty if everything is fine), and an error. The error reports only system failures that occurred during validation; validation problems themselves appear in the list.

Things checked (not all are implemented yet):

* Each blob has the correct checksum
* Each blob appears in exactly one bundle
* Every blob is assigned to at least one slot in at least one version
* Each slot points to an existing (possibly deleted) blob
* Each bundle is readable and in the correct format
* There are no extra files in a bundle
* All required metadata fields are present for each blob
* All required metadata fields are present for each version

This is a method on the Store instead of an Item since it needs access to the underlying bundle files.
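As an illustration of one check from the list, the sketch below verifies that every slot points to a known blob and collects problems as strings, keeping them separate from system errors. The types, the checkSlots helper, and the message format are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a local stand-in for the documented Version type.
type version struct {
	ID    int
	Slots map[string]int
}

// checkSlots reports every slot whose blob id is not in blobIDs.
// The result is sorted because map iteration order is not stable.
func checkSlots(blobIDs []int, versions []version) []string {
	known := make(map[int]bool, len(blobIDs))
	for _, id := range blobIDs {
		known[id] = true
	}
	var problems []string
	for _, v := range versions {
		for slot, bid := range v.Slots {
			if !known[bid] {
				problems = append(problems,
					fmt.Sprintf("version %d: slot %q points to missing blob %d", v.ID, slot, bid))
			}
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	versions := []version{{ID: 1, Slots: map[string]int{"a": 1, "b": 9}}}
	for _, p := range checkSlots([]int{1, 2}, versions) {
		fmt.Println(p)
	}
}
```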

type Version

type Version struct {
	ID       VersionID
	SaveDate time.Time
	Creator  string
	Note     string
	Slots    map[string]BlobID
}

Version contains the metadata on a single item version.

type VersionID

type VersionID int

VersionID identifies a version of an item.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

A Writer implements an io.Writer with extra methods to save a new version of an Item.

func (*Writer) ClearSlots

func (wr *Writer) ClearSlots()

ClearSlots will remove all the slot information for the current version. Any slot entries made before calling this will be lost (but the blobs will still be around!).

func (*Writer) Close

func (wr *Writer) Close() error

Close closes the given Writer. The final metadata is written out, and any blobs marked for deletion are extracted and removed.

func (*Writer) DeleteBlob

func (wr *Writer) DeleteBlob(bid BlobID)

DeleteBlob marks the given blob for removal from the underlying storage. Blobs will be removed when Close() is called. Removal may take a while since every other blob in the bundle the blob is stored in will be copied into a new bundle.

This function should be used infrequently. What is probably desired is to make a new version with the given slot removed by calling SetSlot with a 0 as a blob id.

func (*Writer) SetCreator

func (wr *Writer) SetCreator(s string)

SetCreator sets the creator metadata field. (Remove?)

func (*Writer) SetMimeType

func (wr *Writer) SetMimeType(id BlobID, mimetype string)

SetMimeType sets the mime type for the given blob. Nothing is changed if no blob has the given id or if the blob has been deleted.

func (*Writer) SetNote

func (wr *Writer) SetNote(s string)

SetNote sets the note metadata field for this version.

func (*Writer) SetSlot

func (wr *Writer) SetSlot(s string, id BlobID)

SetSlot adds a slot mapping for this version. To explicitly remove a slot, set it to 0. The slot mapping is initialized to that of the previous version.
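The slot semantics can be sketched with plain maps: a new version starts from a copy of the previous mapping, and setting a slot to 0 removes it. Whether the package deletes the entry or stores a 0 internally is not documented; this sketch assumes deletion, and both helpers are hypothetical.

```go
package main

import "fmt"

// inheritSlots copies the previous version's mapping so that later
// changes do not affect the previous version.
func inheritSlots(prev map[string]int) map[string]int {
	next := make(map[string]int, len(prev))
	for name, id := range prev {
		next[name] = id
	}
	return next
}

// setSlot maps name to id, or removes name when id is 0.
func setSlot(slots map[string]int, name string, id int) {
	if id == 0 {
		delete(slots, name) // assumption: removal drops the entry
		return
	}
	slots[name] = id
}

func main() {
	v1 := map[string]int{"readme": 1, "data": 2}
	v2 := inheritSlots(v1)
	setSlot(v2, "data", 0)   // remove a slot in the new version
	setSlot(v2, "readme", 3) // point an existing slot at a new blob
	fmt.Println(len(v1), len(v2), v2["readme"])
}
```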

func (*Writer) WriteBlob

func (wr *Writer) WriteBlob(r io.Reader, size int64, md5, sha256 []byte) (BlobID, error)

WriteBlob signifies the intent to copy the given io.Reader into this item. If size and the hashes are provided, the item is checked to see if there is already a blob with them in this item. If there is, that blob id is returned and r is not read at all.

If such a blob is not already in the item, WriteBlob will copy the io.Reader into the item as a new blob. The hashes and size are compared with the data read from r and an error is triggered if there is a difference.

The hashes and size may be nil and 0 if unknown, in which case they will be calculated and stored as needed, and no mismatch error will be triggered.

If there is an error writing the blob, the blob is not added to the item's blob list, and the id of 0 is returned. There may be a remnant "blob/{id}" entry in the zip file, so it is best to close this Writer and reopen before retrying writing the blob.

type Zipwriter

type Zipwriter struct {
	*bagit.Writer // the zip interface over the bundle file
	// contains filtered or unexported fields
}

A Zipwriter wraps the zip.Writer object to track the underlying file stream holding the zip file's complete contents. Some utility methods are added to make our life easier.

func OpenZipWriter

func OpenZipWriter(s store.Store, id string, n int) (*Zipwriter, error)

OpenZipWriter creates a new bundle in the given store using the given id and bundle number. It returns a zip writer which is then saved into the store.

func (*Zipwriter) Close

func (zw *Zipwriter) Close() error

Close writes out the zip directory information and then closes the underlying file descriptor for this bundle file.

func (*Zipwriter) MakeStream

func (zw *Zipwriter) MakeStream(name string) (io.Writer, error)

MakeStream returns a writer which saves a file with the given name inside this zip file. The writer does not need to be closed when finished. Only one stream can be active at a time; call MakeStream again to start the next stream.
