blockstore

package
v2.14.2
Published: Sep 25, 2024 License: Apache-2.0, MIT Imports: 17 Imported by: 88

Documentation

Overview

Package blockstore implements the IPFS blockstore interface backed by a CAR file. This package provides two flavours of blockstore: ReadOnly and ReadWrite.

The ReadOnly blockstore provides read-only random access to a given data payload in either unindexed CARv1 format or indexed/unindexed CARv2 format:

  • NewReadOnly can be used to instantiate a new read-only blockstore for a given CARv1 or CARv2 data payload with an optional index override.
  • OpenReadOnly can be used to instantiate a new read-only blockstore for a given CARv1 or CARv2 file with automatic index generation if the index is not present.

The ReadWrite blockstore allows blocks to be written and read concurrently. The user of this blockstore is responsible for calling ReadWrite.Finalize when finished writing blocks. Upon finalization, the instance can no longer be used for reading or writing blocks and will error if used. To continue reading the blocks, users are encouraged to use a ReadOnly blockstore instantiated from the same file path using OpenReadOnly. A user may resume reading/writing from files produced by an instance of the ReadWrite blockstore. Resumption is attempted automatically if the path passed to OpenReadWrite exists.

Note that the blockstore implementations in this package behave similarly to the IPFS IdStore wrapper when given CIDs with the multihash.IDENTITY code. More specifically, for CIDs with the multihash.IDENTITY code:

  • blockstore.Has will always return true.
  • blockstore.Get will always succeed, returning the multihash digest of the given CID.
  • blockstore.GetSize will always succeed, returning the multihash digest length of the given CID.
  • blockstore.Put and blockstore.PutMany will always succeed without performing any operation unless car.StoreIdentityCIDs is enabled.

See: https://pkg.go.dev/github.com/ipfs/boxo/blockstore#NewIdStore

Constants

This section is empty.

Variables

var AllowDuplicatePuts = carv2.AllowDuplicatePuts
var UseWholeCIDs = carv2.UseWholeCIDs
var WriteAsCarV1 = carv2.WriteAsCarV1

Functions

func WithAsyncErrorHandler

func WithAsyncErrorHandler(ctx context.Context, errHandler func(error)) context.Context

WithAsyncErrorHandler returns a context with async error handling set to the given errHandler. Any errors that occur during asynchronous operations of AllKeysChan will be passed to the given handler.

Types

type Blockstore added in v2.10.1

type Blockstore interface {
	DeleteBlock(context.Context, cid.Cid) error
	Has(context.Context, cid.Cid) (bool, error)
	Get(context.Context, cid.Cid) (blocks.Block, error)
	GetSize(context.Context, cid.Cid) (int, error)
	Put(context.Context, blocks.Block) error
	PutMany(context.Context, []blocks.Block) error
	AllKeysChan(ctx context.Context) (<-chan cid.Cid, error)
	HashOnRead(enabled bool)
}

Blockstore is compatible with github.com/ipfs/go-ipfs-blockstore.Blockstore and github.com/ipfs/boxo/blockstore.Blockstore.

type ReadOnly

type ReadOnly struct {
	// contains filtered or unexported fields
}

ReadOnly provides a read-only CAR Block Store.

func NewReadOnly

func NewReadOnly(backing io.ReaderAt, idx index.Index, opts ...carv2.Option) (*ReadOnly, error)

NewReadOnly creates a new ReadOnly blockstore from the backing with an optional index as idx. This function accepts both CARv1 and CARv2 backing. The blockstore is instantiated with the given index if it is not nil.

Otherwise:

  • For a CARv1 backing, an index is generated.
  • For a CARv2 backing, an index is only generated if Header.HasIndex returns false.

There is no need to call ReadOnly.Close on instances returned by this function.
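
The index selection rules for NewReadOnly can be written down as a small decision function. This is a sketch of the documented rules only, not the package's code; `chooseIndex` and `indexSource` are hypothetical names, and the boolean parameters stand in for "idx is not nil" and "Header.HasIndex returns true".

```go
package main

import "fmt"

// indexSource names where the index used by the blockstore comes from.
type indexSource string

const (
	fromCaller indexSource = "caller-supplied"
	fromFile   indexSource = "embedded in CARv2"
	generated  indexSource = "generated"
)

// chooseIndex mirrors the documented rules:
//   - a non-nil idx always wins,
//   - a CARv2 backing with an index uses that index,
//   - everything else (CARv1, or CARv2 without an index) gets a generated one.
func chooseIndex(callerIdx bool, version int, hasIndex bool) indexSource {
	switch {
	case callerIdx:
		return fromCaller
	case version == 2 && hasIndex:
		return fromFile
	default:
		return generated
	}
}

func main() {
	fmt.Println(chooseIndex(true, 2, true))   // caller-supplied
	fmt.Println(chooseIndex(false, 1, false)) // generated
	fmt.Println(chooseIndex(false, 2, true))  // embedded in CARv2
	fmt.Println(chooseIndex(false, 2, false)) // generated
}
```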

func OpenReadOnly

func OpenReadOnly(path string, opts ...carv2.Option) (*ReadOnly, error)

OpenReadOnly opens a read-only blockstore from a CAR file (either v1 or v2), generating an index if one does not exist. Note that a generated index is ephemeral and only held in memory. See car.GenerateIndex and Index.Attach for persisting an index onto a CAR file.

Example

ExampleOpenReadOnly opens a read-only blockstore from a CARv1 file, then prints its root CIDs followed by the CID and raw data size of the first five blocks in the CAR file.

package main

import (
	"context"
	"fmt"

	carv2 "github.com/ipld/go-car/v2"
	"github.com/ipld/go-car/v2/blockstore"
)

const cidPrintCount = 5

func main() {
	// Open a new ReadOnly blockstore from a CARv1 file.
	// Note, `OpenReadOnly` accepts both CARv1 and CARv2 formats and transparently generates an
	// index in the background if necessary.
	// This instance sets the ZeroLengthSectionAsEOF option to treat zero-sized sections in the file as EOF.
	robs, err := blockstore.OpenReadOnly("../testdata/sample-v1.car",
		blockstore.UseWholeCIDs(true),
		carv2.ZeroLengthSectionAsEOF(true),
	)
	if err != nil {
		panic(err)
	}
	defer robs.Close()

	// Print root CIDs.
	roots, err := robs.Roots()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Contains %v root CID(s):\n", len(roots))
	for _, r := range roots {
		fmt.Printf("\t%v\n", r)
	}

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Print the raw data size for the first 5 CIDs in the CAR file.
	keysChan, err := robs.AllKeysChan(ctx)
	if err != nil {
		panic(err)
	}
	fmt.Printf("List of first %v CIDs and their raw data size:\n", cidPrintCount)
	i := 1
	for k := range keysChan {
		if i > cidPrintCount {
			cancel()
			break
		}
		size, err := robs.GetSize(context.TODO(), k)
		if err != nil {
			panic(err)
		}
		fmt.Printf("\t%v -> %v bytes\n", k, size)
		i++
	}

}
Output:

Contains 1 root CID(s):
	bafy2bzaced4ueelaegfs5fqu4tzsh6ywbbpfk3cxppupmxfdhbpbhzawfw5oy
List of first 5 CIDs and their raw data size:
	bafy2bzaced4ueelaegfs5fqu4tzsh6ywbbpfk3cxppupmxfdhbpbhzawfw5oy -> 821 bytes
	bafy2bzaceaycv7jhaegckatnncu5yugzkrnzeqsppzegufr35lroxxnsnpspu -> 1053 bytes
	bafy2bzaceb62wdepofqu34afqhbcn4a7jziwblt2ih5hhqqm6zitd3qpzhdp4 -> 1094 bytes
	bafy2bzaceb3utcspm5jqcdqpih3ztbaztv7yunzkiyfq7up7xmokpxemwgu5u -> 1051 bytes
	bafy2bzacedjwekyjresrwjqj4n2r5bnuuu3klncgjo2r3slsp6wgqb37sz4ck -> 821 bytes

func (*ReadOnly) AllKeysChan

func (b *ReadOnly) AllKeysChan(ctx context.Context) (<-chan cid.Cid, error)

AllKeysChan returns a channel that yields the keys in the CAR data payload. If ctx is constructed using WithAsyncErrorHandler, any errors that occur during asynchronous retrieval of CIDs are passed to the error handler function set in the context. Otherwise, errors terminate the asynchronous operation silently.

See WithAsyncErrorHandler

func (*ReadOnly) Close

func (b *ReadOnly) Close() error

Close closes the underlying reader if it was opened by OpenReadOnly. After this call, the blockstore can no longer be used.

Note that this call may block if any blockstore operations are currently in progress, including an AllKeysChan that hasn't been fully consumed or cancelled.

func (*ReadOnly) DeleteBlock

func (b *ReadOnly) DeleteBlock(_ context.Context, _ cid.Cid) error

DeleteBlock is unsupported and always errors.

func (*ReadOnly) Get

func (b *ReadOnly) Get(ctx context.Context, key cid.Cid) (blocks.Block, error)

Get gets a block corresponding to the given key. For any key with the multihash.IDENTITY code, this function always returns the block unless the StoreIdentityCIDs option is on. In that case it defers to the index to check for the existence of the block; the index may or may not contain identity CIDs included in this CAR, depending on whether StoreIdentityCIDs was on when the index was created. If the CAR is a CARv1 and StoreIdentityCIDs is on, then the index will contain identity CIDs, so Get will find the identity CIDs present in the CAR.

func (*ReadOnly) GetSize

func (b *ReadOnly) GetSize(ctx context.Context, key cid.Cid) (int, error)

GetSize gets the size of an item corresponding to the given key.

func (*ReadOnly) Has

func (b *ReadOnly) Has(ctx context.Context, key cid.Cid) (bool, error)

Has indicates if the store contains a block that corresponds to the given key. For any key with the multihash.IDENTITY code, this function always returns true unless the StoreIdentityCIDs option is on. In that case it defers to the index to check for the existence of the block; the index may or may not contain identity CIDs included in this CAR, depending on whether StoreIdentityCIDs was on when the index was created. If the CAR is a CARv1 and StoreIdentityCIDs is on, then the index will contain identity CIDs, so Has returns true for the identity CIDs present in the CAR.

func (*ReadOnly) HashOnRead

func (b *ReadOnly) HashOnRead(bool)

HashOnRead is currently unimplemented; hashing on reads never happens.

func (*ReadOnly) Index added in v2.9.0

func (b *ReadOnly) Index() index.Index

Index gives direct access to the index. You should never add records on your own there.

func (*ReadOnly) Put

func (b *ReadOnly) Put(context.Context, blocks.Block) error

Put is not supported and always returns an error.

func (*ReadOnly) PutMany

func (b *ReadOnly) PutMany(context.Context, []blocks.Block) error

PutMany is not supported and always returns an error.

func (*ReadOnly) Roots

func (b *ReadOnly) Roots() ([]cid.Cid, error)

Roots returns the root CIDs of the backing CAR.

type ReadWrite

type ReadWrite struct {
	// contains filtered or unexported fields
}

ReadWrite implements a blockstore that stores blocks in CARv2 format. Blocks put into the blockstore can be read back once they are successfully written. This implementation is preferable for a write-heavy workload. The blocks are written immediately on Put and PutMany calls, while the index is stored in memory and updated incrementally.

The Finalize function must be called once all blocks have been put. Upon calling Finalize, the header is finalized and the index is written out. Once finalized, all read and write calls to this blockstore will result in errors.

func OpenReadWrite

func OpenReadWrite(path string, roots []cid.Cid, opts ...carv2.Option) (*ReadWrite, error)

OpenReadWrite creates a new ReadWrite at the given path with a provided set of root CIDs and options.

ReadWrite.Finalize must be called once putting and reading blocks are no longer needed. Upon calling ReadWrite.Finalize the CARv2 header and index are written out onto the file and the backing file is closed. Once finalized, all read and write calls to this blockstore will result in errors. Note, ReadWrite.Finalize must be called on an open instance regardless of whether any blocks were put or not.

If a file at given path does not exist, the instantiation will write car.Pragma and data payload header (i.e. the inner CARv1 header) onto the file before returning.

When the given path already exists, the blockstore will attempt to resume from it. On resumption the existing data sections in file are re-indexed, allowing the caller to continue putting any remaining blocks without having to re-ingest blocks for which previous ReadWrite.Put returned successfully.

Resumption only works on files that were created by a previous instance of a ReadWrite blockstore. This means a file created as a result of a successful call to OpenReadWrite can be resumed from as long as write operations such as ReadWrite.Put and ReadWrite.PutMany returned successfully. On resumption the roots argument and the UseDataPadding option must match the previous instantiation of the ReadWrite blockstore that created the file. More explicitly, the file being resumed from must:

  1. start with a complete CARv2 car.Pragma.
  2. contain a complete CARv1 data header with root CIDs matching the CIDs passed to the constructor, starting at an offset optionally padded by UseDataPadding, followed by zero or more complete data sections. If any corrupt data sections are present, the resumption will fail. Note, if set previously, the blockstore must use the same UseDataPadding option as before, since this option is used to locate the CARv1 data payload.

Note, resumption should be used with block deduplication enabled (that is, without AllowDuplicatePuts), so that blocks already written to the file are not re-written, unless the user explicitly wants duplicate blocks.

Resuming from finalized files is allowed. However, resumption regenerates the index regardless, by scanning every existing block in the file.

Example

ExampleOpenReadWrite creates a read-write blockstore, puts blocks into it, finalizes it, and then resumes from the same file to put one more block.

package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"time"

	blocks "github.com/ipfs/go-block-format"
	"github.com/ipfs/go-cid"
	carv2 "github.com/ipld/go-car/v2"
	"github.com/ipld/go-car/v2/blockstore"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	thisBlock := blocks.NewBlock([]byte("fish"))
	thatBlock := blocks.NewBlock([]byte("lobster"))
	andTheOtherBlock := blocks.NewBlock([]byte("barreleye"))

	tdir, err := os.MkdirTemp(os.TempDir(), "example-*")
	if err != nil {
		panic(err)
	}
	// Remove the temporary directory once the example is done.
	defer os.RemoveAll(tdir)
	dst := filepath.Join(tdir, "sample-rw-bs-v2.car")
	roots := []cid.Cid{thisBlock.Cid(), thatBlock.Cid(), andTheOtherBlock.Cid()}

	rwbs, err := blockstore.OpenReadWrite(dst, roots, carv2.UseDataPadding(1413), carv2.UseIndexPadding(42))
	if err != nil {
		panic(err)
	}

	// Put the first two blocks onto the blockstore.
	// The slice is named blks so that it does not shadow the blocks package.
	blks := []blocks.Block{thisBlock, thatBlock}
	if err := rwbs.PutMany(ctx, blks); err != nil {
		panic(err)
	}
	fmt.Printf("Successfully wrote %v blocks into the blockstore.\n", len(blks))

	// Any blocks put can be read back using the same blockstore instance.
	block, err := rwbs.Get(ctx, thatBlock.Cid())
	if err != nil {
		panic(err)
	}
	fmt.Printf("Read back block just put with raw value of `%v`.\n", string(block.RawData()))

	// Finalize the blockstore to flush out the index and make a complete CARv2.
	if err := rwbs.Finalize(); err != nil {
		panic(err)
	}

	// Resume from the same file to add more blocks.
	// Note the UseDataPadding and roots must match the values passed to the blockstore instance
	// that created the original file. Otherwise, we cannot resume from the same file.
	resumedRwbs, err := blockstore.OpenReadWrite(dst, roots, carv2.UseDataPadding(1413))
	if err != nil {
		panic(err)
	}

	// Put another block, appending it to the set of blocks that were written previously.
	if err := resumedRwbs.Put(ctx, andTheOtherBlock); err != nil {
		panic(err)
	}

	// Read back a block put before resumption.
	// Blocks previously put are present.
	block, err = resumedRwbs.Get(ctx, thatBlock.Cid())
	if err != nil {
		panic(err)
	}
	fmt.Printf("Resumed blockstore contains blocks put previously with raw value of `%v`.\n", string(block.RawData()))

	// Read back the block put after resumption.
	// Blocks put after resumption are also present.
	block, err = resumedRwbs.Get(ctx, andTheOtherBlock.Cid())
	if err != nil {
		panic(err)
	}
	fmt.Printf("It also contains the block put after resumption with raw value of `%v`.\n", string(block.RawData()))

	// Finalize the blockstore to flush out the index and make a complete CARv2.
	// Note, Finalize must be called on an open ReadWrite blockstore to flush out a complete CARv2.
	if err := resumedRwbs.Finalize(); err != nil {
		panic(err)
	}
}
Output:

Successfully wrote 2 blocks into the blockstore.
Read back block just put with raw value of `lobster`.
Resumed blockstore contains blocks put previously with raw value of `lobster`.
It also contains the block put after resumption with raw value of `barreleye`.

func OpenReadWriteFile added in v2.5.0

func OpenReadWriteFile(f *os.File, roots []cid.Cid, opts ...carv2.Option) (*ReadWrite, error)

OpenReadWriteFile is similar to OpenReadWrite but lets you control the file lifecycle. You are responsible for closing the given file.

func (*ReadWrite) AllKeysChan

func (b *ReadWrite) AllKeysChan(ctx context.Context) (<-chan cid.Cid, error)

func (*ReadWrite) Close added in v2.8.0

func (b *ReadWrite) Close() error

Close closes the blockstore. After this call, the blockstore can no longer be used.

func (*ReadWrite) DeleteBlock

func (b *ReadWrite) DeleteBlock(_ context.Context, _ cid.Cid) error

func (*ReadWrite) Discard

func (b *ReadWrite) Discard()

Discard closes this blockstore without finalizing its header and index. After this call, the blockstore can no longer be used.

Note that this call may block if any blockstore operations are currently in progress, including an AllKeysChan that hasn't been fully consumed or cancelled.

func (*ReadWrite) Finalize

func (b *ReadWrite) Finalize() error

Finalize finalizes this blockstore by writing the CARv2 header, along with a flattened index for more efficient subsequent reads. This is equivalent to calling FinalizeReadOnly followed by Close. After this call, the blockstore can no longer be used.

func (*ReadWrite) FinalizeReadOnly added in v2.8.0

func (b *ReadWrite) FinalizeReadOnly() error

FinalizeReadOnly finalizes this blockstore by writing the CARv2 header, along with a flattened index for more efficient subsequent reads, but keeps it open read-only. This call should be complemented later by a call to Close.

func (*ReadWrite) Get

func (b *ReadWrite) Get(ctx context.Context, key cid.Cid) (blocks.Block, error)

func (*ReadWrite) GetSize

func (b *ReadWrite) GetSize(ctx context.Context, key cid.Cid) (int, error)

func (*ReadWrite) Has

func (b *ReadWrite) Has(ctx context.Context, key cid.Cid) (bool, error)

func (*ReadWrite) HashOnRead

func (b *ReadWrite) HashOnRead(enable bool)

func (*ReadWrite) Index added in v2.9.0

func (b *ReadWrite) Index() index.Index

Index gives direct access to the index. You should never add records on your own there.

func (*ReadWrite) Put

func (b *ReadWrite) Put(ctx context.Context, blk blocks.Block) error

Put writes the given block to the underlying datastore.

func (*ReadWrite) PutMany

func (b *ReadWrite) PutMany(ctx context.Context, blks []blocks.Block) error

PutMany puts a slice of blocks at the same time using batching capabilities of the underlying datastore whenever possible.

func (*ReadWrite) Roots

func (b *ReadWrite) Roots() ([]cid.Cid, error)
