Documentation ¶
Overview ¶
Package items provides routines to manipulate and serialize items. An item is defined as a collection of blobs and versions. Blobs are immutable binary blocks of data. Blobs may be added to an item over time, and may be deleted, but they cannot be otherwise altered once added. Versions provide a way to associate labels with blobs, and may be added to an item over time, but cannot be deleted.
Items are serialized into a sequence of bundles. Each bundle file is immutable. However, like blobs, bundle files may be deleted.
Items do not share blobs between them. Bundle files do not contain information from more than one item.
A Store provides the logic to serialize items to bundles and to deserialize them again. It wraps a store.Store interface. Its methods will block until the requested data has been loaded from the backing store. A cache may be added to hold item metadata. The data retrieval paths of a Store are safe to access from multiple goroutines. However, an open Writer for any given item should be used by only one goroutine at a time.
Both blobs and versions are numbered sequentially starting from 1.
An item's metadata and blobs are grouped into bundles, which are zip files. Each bundle contains the complete up-to-date metadata information on an item, as well as zero or more blobs. Bundles are numbered, but they should not be assumed to be numbered sequentially since deletions may remove some bundles. Bundle numbers for an item start from 1. The largest numbered bundle must contain the most up-to-date information on the item, including the (correct!) blob to bundle mapping.
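Because bundle numbers may have gaps, the newest bundle cannot be found by counting; it is the largest number actually present. A minimal sketch, assuming the bundle numbers for one item have already been listed from the backing store (the helper latestBundle is hypothetical and not part of this package):

```go
package main

import "fmt"

// latestBundle returns the largest bundle number in nums, or 0 if nums is
// empty. Bundle numbers start at 1, so 0 can stand for "no bundles found".
// This helper is illustrative only; it is not part of the items package.
func latestBundle(nums []int) int {
	largest := 0
	for _, n := range nums {
		if n > largest {
			largest = n
		}
	}
	return largest
}

func main() {
	// Bundles 2 and 4 are missing here, as could happen after deletions.
	fmt.Println(latestBundle([]int{1, 3, 5}))
}
```

The bundle returned by such a lookup is the one whose metadata should be trusted for the blob to bundle mapping.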
There is no relationship between a bundle number and the versions of an item.
Index ¶
- Constants
- Variables
- func OpenBundleStream(s store.Store, key, sname string) (io.ReadCloser, error)
- func ValidateWriteBlob(itemID string, blob *Blob, result Results) error
- type BagreaderCloser
- type Blob
- type BlobID
- type BundleWriter
- type Item
- type ItemCache
- type NoBlobError
- type Results
- type Store
- func (s *Store) Blob(id string, bid BlobID) (io.ReadCloser, int64, error)
- func (s *Store) BlobInfo(id string, bid BlobID) (*Blob, error)
- func (s *Store) Item(id string) (*Item, error)
- func (s *Store) List() <-chan string
- func (s *Store) Open(id string, creator string) (*Writer, error)
- func (s *Store) SetCache(cache ItemCache)
- func (s *Store) SetUseStore(value bool)
- func (s *Store) Validate(id string) (nb int64, problems []string, err error)
- type Version
- type VersionID
- type Writer
- func (wr *Writer) ClearSlots()
- func (wr *Writer) Close() error
- func (wr *Writer) DeleteBlob(bid BlobID)
- func (wr *Writer) SetCreator(s string)
- func (wr *Writer) SetMimeType(id BlobID, mimetype string)
- func (wr *Writer) SetNote(s string)
- func (wr *Writer) SetSlot(s string, id BlobID)
- func (wr *Writer) WriteBlob(r io.Reader, size int64, md5, sha256 []byte) (BlobID, error)
- type Zipwriter
Constants ¶
const (
	// MB is the number of bytes in one megabyte (we use base 10)
	MB = 1000000

	// IdealBundleSize is a cutoff, and new bundle files will be started
	// once the current one grows past this. (only checked when starting
	// a new blob.)
	IdealBundleSize = 500 * MB
)
Variables ¶
var (
	// ErrNoItem occurs when an item is requested for which no bundle
	// files could be found in the backing store.
	ErrNoItem = errors.New("no item, bad item id")

	// ErrNoStore occurs when useStore has been set to false, i.e. the
	// backing store is unavailable.
	ErrNoStore = errors.New("no item, item store unavailable")

	// ErrDeleted occurs when content that has been deleted is requested
	ErrDeleted = errors.New("Blob has been deleted")
)
var (
	// ErrNotFound means a stream inside a zip file with the given name
	// could not be found.
	ErrNotFound = errors.New("stream not found")
)
var Nullcache cache
The Nullcache is an ItemCache which does not store anything.
Functions ¶
func OpenBundleStream ¶
OpenBundleStream returns an io.ReadCloser containing the contents of the stream sname inside the bundle having the given key in the given store.
Types ¶
type BagreaderCloser ¶
type BagreaderCloser struct {
	*bagit.Reader // the zip reader
	// contains filtered or unexported fields
}
A BagreaderCloser is a bagit.Reader which will also close the underlying file.
func OpenBundle ¶
func OpenBundle(s store.Store, key string) (*BagreaderCloser, error)
OpenBundle opens the provided key in the given store, and wraps it in a bagit reader.
func (*BagreaderCloser) Close ¶
func (bg *BagreaderCloser) Close() error
Close flushes the reader and closes the underlying io.Closer.
type Blob ¶
type Blob struct {
	ID       BlobID
	SaveDate time.Time
	Creator  string
	Size     int64 // logical size of associated content (i.e. before compression)

	// following valid if blob is NOT deleted
	Bundle   int    // which bundle file this blob is stored in, 0 if deleted
	MD5      []byte // unused if deleted
	SHA256   []byte // unused if deleted
	MimeType string // either empty or the mime type of this blob

	// following valid if blob is deleted
	DeleteDate time.Time // zero iff not deleted
	Deleter    string    // empty iff not deleted
	DeleteNote string    // optional note for deletion event
}
Blob records metadata for each blob.
type BundleWriter ¶
type BundleWriter struct {
// contains filtered or unexported fields
}
BundleWriter helps with saving blobs into bundles, and with repackaging blobs when doing deletions. It keeps a reference to its source item, and will use that to save the item-info.json file when needed.
It is not goroutine safe. Make sure to call Close when finished.
func NewBundler ¶
func NewBundler(s store.Store, item *Item) *BundleWriter
NewBundler starts a new bundle writer for the given item. More than one bundle file may be written. The advancement to a new bundle file happens either when the current one grows larger than IdealBundleSize, or when Next() is called.
func (*BundleWriter) Close ¶
func (bw *BundleWriter) Close() error
Close writes out any final metadata and closes the current bundle.
func (*BundleWriter) CopyBundleExcept ¶
func (bw *BundleWriter) CopyBundleExcept(src int, except []BlobID) error
CopyBundleExcept copies all the blobs in the bundle src, except for those in the list, into the current place in the bundle writer.
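The selection step of such a copy can be sketched as a simple filter. This is only an illustration under assumptions: BlobID is modelled as an int (the real underlying type is not shown here), and blobsToCopy is a hypothetical helper, not a function of this package.

```go
package main

import "fmt"

// BlobID stands in for the package's blob identifier type (assumed integer).
type BlobID int

// blobsToCopy returns the ids from inBundle that do not appear in except,
// preserving their original order. Hypothetical helper for illustration.
func blobsToCopy(inBundle, except []BlobID) []BlobID {
	skip := make(map[BlobID]bool, len(except))
	for _, id := range except {
		skip[id] = true
	}
	var keep []BlobID
	for _, id := range inBundle {
		if !skip[id] {
			keep = append(keep, id)
		}
	}
	return keep
}

func main() {
	// Blobs 2 and 4 are being deleted, so only 1 and 3 are carried over.
	fmt.Println(blobsToCopy([]BlobID{1, 2, 3, 4}, []BlobID{2, 4}))
}
```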
func (*BundleWriter) CurrentBundle ¶
func (bw *BundleWriter) CurrentBundle() int
CurrentBundle returns the id of the bundle being written to.
func (*BundleWriter) Next ¶
func (bw *BundleWriter) Next() error
Next closes the current bundle, if any, and starts a new bundle file.
func (*BundleWriter) WriteBlob ¶
WriteBlob writes the given blob into the bundle.
WriteBlob first sees if it needs to start a new bundle file based on the number of bytes already written into the current bundle. At the end of the call, CurrentBundle() returns the bundle the blob was written into.
If WrittenMD5 is empty, then the file was not created in the bundle.
The *Blob is not modified and no validation of the write is performed. Use ValidateWriteBlob() to do validation of the returned Results with the expected values in the *Blob.
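Since the size cutoff is only consulted when a blob is about to be written, a bundle can end up larger than IdealBundleSize. The following simulation is a rough sketch of that rule, assuming the first bundle is numbered 1 and that the byte count is the sum of the blob sizes; assignBundles is a hypothetical helper and does not exist in this package.

```go
package main

import "fmt"

const (
	// MB and IdealBundleSize repeat the values documented in Constants.
	MB              = 1000000
	IdealBundleSize = 500 * MB
)

// assignBundles simulates writing blobs of the given sizes in order and
// returns the bundle number each blob would land in. The cutoff is checked
// only before a blob is started, never in the middle of one.
func assignBundles(sizes []int64) []int {
	bundle := 1
	var written int64
	out := make([]int, 0, len(sizes))
	for _, size := range sizes {
		if written > IdealBundleSize {
			// The current bundle has grown past the cutoff: start a new one.
			bundle++
			written = 0
		}
		out = append(out, bundle)
		written += size
	}
	return out
}

func main() {
	// The second blob still fits the rule (300 MB written so far), so the
	// first bundle ends up holding 600 MB before the third blob rolls over.
	fmt.Println(assignBundles([]int64{300 * MB, 300 * MB, 300 * MB}))
}
```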
type Item ¶
type Item struct {
	ID        string
	MaxBundle int        // largest bundle id used by this item
	Blobs     []*Blob    // list of blobs, sorted by id
	Versions  []*Version // list of versions, sorted by id
}
An Item contains the information for a single item.
func (Item) BlobByExtendedSlot ¶
BlobByExtendedSlot returns the blob identifier for the given extended slot name. An extended slot name is a slot name with an optional "@nnn/" prefix, where nnn is the version number of the item to use (in decimal). If a version prefix is not present, the most recent version of the item is used. Like BlobByVersionSlot, 0 is returned if the slot path does not resolve to anything.
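To make the "@nnn/" prefix concrete, here is a small parsing sketch. It is not the package's implementation: parseExtendedSlot is a hypothetical helper, the use of 0 to mean "most recent version" is an assumption of this sketch, and so is the choice to treat a malformed prefix as part of the slot name.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseExtendedSlot splits an extended slot name into a version number and
// a plain slot name. A returned version of 0 means no usable "@nnn/" prefix
// was found, so the caller should use the most recent version.
func parseExtendedSlot(name string) (version int, slot string) {
	if !strings.HasPrefix(name, "@") {
		return 0, name
	}
	i := strings.Index(name, "/")
	if i < 0 {
		return 0, name
	}
	// ParseUint rejects signs and non-decimal digits.
	n, err := strconv.ParseUint(name[1:i], 10, 31)
	if err != nil || n == 0 {
		return 0, name
	}
	return int(n), name[i+1:]
}

func main() {
	v, s := parseExtendedSlot("@3/data/file.txt")
	fmt.Println(v, s)
	v, s = parseExtendedSlot("data/file.txt")
	fmt.Println(v, s)
}
```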
type ItemCache ¶
type ItemCache interface {
	// try to return an item record with the given id.
	// return nil if there is nothing matching in the cache.
	Lookup(id string) *Item
	Set(id string, item *Item)
}
An ItemCache defines the methods a Store will use to interact with a cache.
func NewMemoryCache ¶
func NewMemoryCache() ItemCache
NewMemoryCache returns an empty ItemCache that keeps everything in memory and never evicts anything. It is probably only useful in tests.
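For readers who want to supply their own cache, the sketch below shows one way the two-method interface could be satisfied. The Item and ItemCache declarations are local stand-ins that mirror the documented shapes, and memCache is a hypothetical type; how NewMemoryCache is actually implemented is not shown in this documentation.

```go
package main

import (
	"fmt"
	"sync"
)

// Item and ItemCache are local stand-ins mirroring the documented types.
type Item struct {
	ID string
}

type ItemCache interface {
	Lookup(id string) *Item
	Set(id string, item *Item)
}

// memCache keeps every item in a map and never evicts. The mutex makes it
// usable from the concurrent read paths of a Store.
type memCache struct {
	mu    sync.RWMutex
	items map[string]*Item
}

func newMemCache() ItemCache {
	return &memCache{items: make(map[string]*Item)}
}

// Lookup returns the cached item, or nil if nothing matches.
func (c *memCache) Lookup(id string) *Item {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.items[id]
}

// Set stores or replaces the record for id.
func (c *memCache) Set(id string, item *Item) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[id] = item
}

func main() {
	c := newMemCache()
	fmt.Println(c.Lookup("abc") == nil)
	c.Set("abc", &Item{ID: "abc"})
	fmt.Println(c.Lookup("abc").ID)
}
```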
type NoBlobError ¶
func (NoBlobError) Error ¶
func (err NoBlobError) Error() string
type Results ¶
Results is used to return info from BundleWriter.WriteBlob(). Both WrittenMD5 and WrittenSHA256 are empty if nothing was written.
type Store ¶
type Store struct {
	S store.Store // the underlying bundle store
	// contains filtered or unexported fields
}
A Store holds a collection of items.
func NewWithCache ¶
NewWithCache creates a new item store which caches the item metadata in the given cache. (Should be deprecated??)
func (*Store) Blob ¶
Blob returns an io.ReadCloser containing the given blob's contents and the blob's size. It will block until the item and blob are loaded from the backing store.
TODO: perhaps this should be moved to be a method on *Item.
func (*Store) BlobInfo ¶
BlobInfo returns a pointer to a Blob structure containing information on the given blob. It is like Blob() but doesn't recall the content from tape. Unlike Blob(), though, it will not return an error if the blob is deleted.
func (*Store) Item ¶
Item loads and returns an item's metadata info. This will block until the item is loaded.
func (*Store) List ¶
List returns a channel which will contain all of the item ids in the current store.
func (*Store) Open ¶
Open opens the item id for writing. This will add a single new version to the item. New blobs can be written. Blobs can also be deleted (but that is not a quick operation).
The creator is the name of the agent performing these updates.
It is an error for more than one goroutine to open the same item at a time. This does not perform any locking itself.
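Since Open does no locking, callers that may update the same item from several goroutines need to serialize those updates themselves. One possible approach is a per-item mutex, sketched below. Everything here (itemLocks, lock, maxConcurrentHolders) is hypothetical and independent of this package; a real caller would call Store.Open and close the Writer inside the locked region.

```go
package main

import (
	"fmt"
	"sync"
)

// itemLocks hands out one mutex per item id.
type itemLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newItemLocks() *itemLocks {
	return &itemLocks{locks: make(map[string]*sync.Mutex)}
}

// lock blocks until the caller holds the lock for id and returns the
// function that releases it.
func (l *itemLocks) lock(id string) (unlock func()) {
	l.mu.Lock()
	m, ok := l.locks[id]
	if !ok {
		m = new(sync.Mutex)
		l.locks[id] = m
	}
	l.mu.Unlock()
	m.Lock()
	return m.Unlock
}

// maxConcurrentHolders runs n goroutines that all take the lock for the
// same item and reports the most holders ever seen at one time.
func maxConcurrentHolders(n int) int {
	locks := newItemLocks()
	var wg sync.WaitGroup
	active, most := 0, 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			unlock := locks.lock("item-1")
			defer unlock()
			// Open, write, and Close would happen here.
			active++
			if active > most {
				most = active
			}
			active--
		}()
	}
	wg.Wait()
	return most
}

func main() {
	fmt.Println(maxConcurrentHolders(20))
}
```

This sketch never removes entries from the map, which is acceptable for a bounded set of item ids but would need cleanup in a long-running service.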
func (*Store) SetCache ¶
SetCache will set the metadata cache used. It is intended to be used during initialization. It will cause a race condition if used while others are accessing this item store.
func (*Store) SetUseStore ¶
SetUseStore enables or disables access to the underlying store: true turns access on, false turns it off.
func (*Store) Validate ¶
Validate checks the given item. It returns the total number of bytes checksummed, a list of problems (empty if everything is fine), and an error if one occurred during validation. Note that err does not report validation failures; it is set only when a system error happened while validating.
Things checked (not all are implemented yet):
- Each blob has the correct checksum
- Each blob appears in exactly one bundle
- Every blob is assigned to at least one slot in at least one version
- Each slot points to an existing (possibly deleted) blob
- Each bundle is readable and in the correct format
- There are no extra files in a bundle
- All required metadata fields are present for each blob
- All required metadata fields are present for each version
This is a method on the Store instead of an Item since it needs access to the underlying bundle files.
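As an illustration of what one of these checks involves, the sketch below tests that every slot points to a blob present in the item's blob list, and reports findings as strings in the same spirit as the problems list. The types are trimmed local stand-ins, BlobID and VersionID are assumed to be integers, and checkSlots is a hypothetical function; it is not how Validate is implemented.

```go
package main

import (
	"fmt"
	"sort"
)

// Trimmed stand-ins for the documented types (integer ids are assumed).
type BlobID int
type VersionID int

type Blob struct {
	ID BlobID
}

type Version struct {
	ID    VersionID
	Slots map[string]BlobID
}

type Item struct {
	Blobs    []*Blob
	Versions []*Version
}

// checkSlots returns one problem string for each slot whose blob id is not
// in the item's blob list. Deleted blobs still have a Blob record, so they
// count as existing.
func checkSlots(item *Item) []string {
	known := make(map[BlobID]bool, len(item.Blobs))
	for _, b := range item.Blobs {
		known[b.ID] = true
	}
	var problems []string
	for _, v := range item.Versions {
		// Sort slot names so the report order is deterministic.
		names := make([]string, 0, len(v.Slots))
		for name := range v.Slots {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			if id := v.Slots[name]; !known[id] {
				problems = append(problems,
					fmt.Sprintf("version %d: slot %q points to missing blob %d", v.ID, name, id))
			}
		}
	}
	return problems
}

func main() {
	item := &Item{
		Blobs:    []*Blob{{ID: 1}, {ID: 2}},
		Versions: []*Version{{ID: 1, Slots: map[string]BlobID{"a": 1, "b": 3}}},
	}
	for _, p := range checkSlots(item) {
		fmt.Println(p)
	}
}
```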
type Version ¶
type Version struct {
	ID       VersionID
	SaveDate time.Time
	Creator  string
	Note     string
	Slots    map[string]BlobID
}
Version contains the metadata on a single item version.
type Writer ¶
type Writer struct {
// contains filtered or unexported fields
}
A Writer implements an io.Writer with extra methods to save a new version of an Item.
func (*Writer) ClearSlots ¶
func (wr *Writer) ClearSlots()
ClearSlots will remove all the slot information for the current version. Any slot entries made before calling this will be lost (but the blobs will still be around!).
func (*Writer) Close ¶
Close closes the given Writer. The final metadata is written out, and any blobs marked for deletion are extracted and removed.
func (*Writer) DeleteBlob ¶
DeleteBlob marks the given blob for removal from the underlying storage. Blobs will be removed when Close() is called. Removal may take a while since every other blob in the bundle the blob is stored in will be copied into a new bundle.
This function should be used infrequently. What is probably desired is to make a new version with the given slot removed by calling SetSlot with a 0 as a blob id.
func (*Writer) SetCreator ¶
SetCreator sets the creator metadata field. (Remove?)
func (*Writer) SetMimeType ¶
SetMimeType sets the mime type for the given blob. Nothing is changed if no blob has the given id or if the blob has been deleted.
func (*Writer) SetSlot ¶
SetSlot adds a slot mapping for this version. To explicitly remove a slot, set it to 0. The slot mapping is initialized to that of the previous version.
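The slot semantics can be pictured as a copy of the previous version's map that is then edited. The sketch below is an assumption-laden illustration: BlobID is modelled as an int, newSlots and setSlot are hypothetical helpers, and whether the real Writer deletes the map key or stores a 0 for a removed slot is not stated in this documentation.

```go
package main

import "fmt"

// BlobID stands in for the package's blob identifier type (assumed integer).
type BlobID int

// newSlots starts a new version's slot map as a copy of the previous one,
// so edits do not change the earlier version.
func newSlots(prev map[string]BlobID) map[string]BlobID {
	next := make(map[string]BlobID, len(prev))
	for name, id := range prev {
		next[name] = id
	}
	return next
}

// setSlot maps name to id; an id of 0 removes the slot (in this sketch by
// deleting the key).
func setSlot(slots map[string]BlobID, name string, id BlobID) {
	if id == 0 {
		delete(slots, name)
		return
	}
	slots[name] = id
}

func main() {
	v1 := map[string]BlobID{"readme": 1, "data": 2}
	v2 := newSlots(v1)
	setSlot(v2, "data", 0)  // drop a slot without deleting the blob
	setSlot(v2, "report", 3) // add a new slot
	fmt.Println(len(v1), len(v2), v2["report"])
}
```

Removing a slot this way leaves the blob itself in place, which is why it is usually preferable to DeleteBlob.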
func (*Writer) WriteBlob ¶
WriteBlob signifies the intent to copy the given io.Reader into this item. If size and the hashes are provided, the item is checked to see if there is already a blob with them in this item. If there is, that blob id is returned and r is not read at all.
If such a blob is not already in the item, WriteBlob will copy the io.Reader into the item as a new blob. The hashes and size are compared with the data read from r and an error is triggered if there is a difference.
The hashes and size may be nil and 0 if unknown, in which case they will be calculated and stored as needed, and no mismatch error will be triggered.
If there is an error writing the blob, the blob is not added to the item's blob list, and the id of 0 is returned. There may be a remnant "blob/{id}" entry in the zip file, so it is best to close this Writer and reopen before retrying writing the blob.
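The lookup that lets WriteBlob skip reading r might resemble the sketch below. It is a guess at the shape of that check, not the actual code: the Blob type is a trimmed stand-in with an assumed integer BlobID, findExisting is hypothetical, and the decisions to require both hashes and to ignore deleted blobs (Bundle of 0) are assumptions of this sketch.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"crypto/sha256"
	"fmt"
)

// Trimmed stand-ins for the documented types.
type BlobID int

type Blob struct {
	ID     BlobID
	Size   int64
	Bundle int // 0 if deleted
	MD5    []byte
	SHA256 []byte
}

// findExisting returns the id of a live blob with the same size and hashes,
// or 0 if there is none or if the size or hashes are unknown.
func findExisting(blobs []*Blob, size int64, md5sum, sha256sum []byte) BlobID {
	if size == 0 || len(md5sum) == 0 || len(sha256sum) == 0 {
		return 0 // not enough information to match without reading the data
	}
	for _, b := range blobs {
		if b.Bundle == 0 {
			continue // deleted blobs have no usable content
		}
		if b.Size == size && bytes.Equal(b.MD5, md5sum) && bytes.Equal(b.SHA256, sha256sum) {
			return b.ID
		}
	}
	return 0
}

func main() {
	data := []byte("hello")
	m := md5.Sum(data)
	s := sha256.Sum256(data)
	blobs := []*Blob{{ID: 1, Size: int64(len(data)), Bundle: 1, MD5: m[:], SHA256: s[:]}}

	fmt.Println(findExisting(blobs, int64(len(data)), m[:], s[:])) // already stored
	fmt.Println(findExisting(blobs, 0, nil, nil))                  // unknown, must be written
}
```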
type Zipwriter ¶
type Zipwriter struct {
	*bagit.Writer // the zip interface over the bundle file
	// contains filtered or unexported fields
}
A Zipwriter wraps the zip.Writer object to track the underlying file stream holding the zip file's complete contents. Some utility methods are added to make our life easier.
func OpenZipWriter ¶
OpenZipWriter creates a new bundle in the given store using the given id and bundle number. It returns a zip writer which is then saved into the store.
func (*Zipwriter) Close ¶
Close writes out the zip directory information and then closes the underlying file descriptor for this bundle file.
func (*Zipwriter) MakeStream ¶
MakeStream returns a writer which saves a file with the given name inside this zip file. The writer does not need to be closed when finished. Only one stream can be active at a time; call MakeStream again to start the next stream.