carbites

v0.3.0
Published: Jul 4, 2021 License: Apache-2.0, MIT Imports: 15 Imported by: 16

README

carbites

Chunking for CAR files. Split a single CAR into multiple CARs.

Install

go get github.com/alanshaw/go-carbites

Usage

Carbites supports two strategies:

  1. Simple (default) - fast but naive. Only the first CAR output has roots in its header; subsequent CARs have the placeholder "empty" root CID bafkqaaa, as recommended.
  2. Treewalk - walks the DAG to pack sub-graphs into each CAR file that is output. Every CAR has the same root CID but contains a different portion of the DAG. The DAG is traversed from the root node, and each block is decoded and its links extracted to determine which sub-graph to include in each CAR.

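package main

import (
	"context"
	"fmt"
	"io"
	"log"
	"os"

	"github.com/alanshaw/go-carbites"
)

func main() {
	// Open the CAR file to split ("input.car" is a placeholder path).
	in, err := os.Open("input.car")
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()

	out := make(chan io.Reader)
	errc := make(chan error, 1)
	targetSize := 1000          // 1 kB chunks
	strategy := carbites.Simple // also carbites.Treewalk

	// Split in the background; each chunk arrives on out as an io.Reader.
	go func() {
		errc <- carbites.Split(context.Background(), in, targetSize, strategy, out)
	}()

	for i := 0; ; {
		select {
		case r, ok := <-out:
			if !ok {
				out = nil // closed channel: stop selecting on it
				continue
			}
			b, err := io.ReadAll(r)
			if err != nil {
				log.Fatal(err)
			}
			if err := os.WriteFile(fmt.Sprintf("chunk-%d.car", i), b, 0644); err != nil {
				log.Fatal(err)
			}
			i++
		case err := <-errc:
			if err != nil {
				log.Fatal(err)
			}
			return
		}
	}
}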

API

pkg.go.dev Reference

Contribute

Feel free to dive in! Open an issue or submit PRs.

License

Dual-licensed under MIT + Apache 2.0

Documentation

Functions

func Join added in v0.1.1

func Join(in []io.Reader, s Strategy) (io.Reader, error)

Join together multiple CAR files into a single CAR file.

func JoinSimple added in v0.1.1

func JoinSimple(in []io.Reader) (io.Reader, error)

Join together multiple CAR files that were split using the "simple" strategy into a single CAR file.

func JoinTreewalk added in v0.3.0

func JoinTreewalk(in []io.Reader) (io.Reader, error)

Join together multiple CAR files into a single CAR file using the "treewalk" strategy. Note that binary equality between the original CAR and the joined CAR is not guaranteed.

func NewCarMerger added in v0.3.0

func NewCarMerger(in []io.Reader) (io.Reader, error)

NewCarMerger creates a new CAR file (an io.Reader) that is a result of merging the passed CAR files. The resultant CAR has the combined roots of the passed CAR files and any duplicate blocks are removed.
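
As a rough illustration of what "combined roots" and "duplicate blocks are removed" could mean, the sketch below merges toy CAR values. Everything in it is hypothetical: toyCar, toyBlock and mergeToyCars are stand-ins invented for this example, CIDs are plain strings, and collapsing repeated roots is an assumption of the sketch rather than documented behaviour of NewCarMerger.

```go
package main

import "fmt"

// toyBlock is a stand-in for a CAR block; a real block is keyed by a cid.Cid.
type toyBlock struct {
	CID  string
	Data []byte
}

// toyCar is a stand-in for a CAR file: header roots plus ordered blocks.
type toyCar struct {
	Roots  []string
	Blocks []toyBlock
}

// mergeToyCars combines the roots and blocks of all inputs, keeping the first
// occurrence of each root and each block and preserving input order.
// Deduplicating roots is an assumption made for this illustration.
func mergeToyCars(in []toyCar) toyCar {
	var merged toyCar
	seenRoots := map[string]bool{}
	seenBlocks := map[string]bool{}
	for _, c := range in {
		for _, r := range c.Roots {
			if !seenRoots[r] {
				seenRoots[r] = true
				merged.Roots = append(merged.Roots, r)
			}
		}
		for _, b := range c.Blocks {
			if !seenBlocks[b.CID] {
				seenBlocks[b.CID] = true
				merged.Blocks = append(merged.Blocks, b)
			}
		}
	}
	return merged
}

func main() {
	a := toyCar{Roots: []string{"root"}, Blocks: []toyBlock{{CID: "root"}, {CID: "x"}}}
	b := toyCar{Roots: []string{"root"}, Blocks: []toyBlock{{CID: "root"}, {CID: "y"}}}
	m := mergeToyCars([]toyCar{a, b})
	fmt.Println(len(m.Roots), len(m.Blocks)) // prints "1 3"
}
```

The shared "root" block appears in both inputs but only once in the result, which is the property the description above calls out.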

func Split

func Split(ctx context.Context, in io.Reader, targetSize int, s Strategy, out chan io.Reader) error

Split a CAR file and create multiple smaller CAR files.

func SplitSimple added in v0.1.0

func SplitSimple(ctx context.Context, in io.Reader, targetSize int, out chan io.Reader) error

Split a CAR file and create multiple smaller CAR files using the "simple" strategy.
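
To make the "fast but naive" description of the simple strategy more concrete, here is a minimal sketch of greedy, in-order packing against a target size. It is an illustration under stated assumptions, not the library's implementation: packSimple is a hypothetical helper, blocks are represented only by their byte sizes, and CAR header overhead is ignored.

```go
package main

import "fmt"

// packSimple groups block sizes, in order, into chunks. A new chunk is started
// whenever adding the next block would push the current chunk past targetSize.
// A block larger than targetSize still gets a chunk of its own.
// Hypothetical helper for illustration; not part of go-carbites.
func packSimple(blockSizes []int, targetSize int) [][]int {
	var chunks [][]int
	var current []int
	currentSize := 0
	for _, size := range blockSizes {
		if len(current) > 0 && currentSize+size > targetSize {
			chunks = append(chunks, current)
			current = nil
			currentSize = 0
		}
		current = append(current, size)
		currentSize += size
	}
	if len(current) > 0 {
		chunks = append(chunks, current)
	}
	return chunks
}

func main() {
	chunks := packSimple([]int{400, 400, 400, 900, 100}, 1000)
	for i, c := range chunks {
		fmt.Printf("chunk-%d: %v\n", i, c)
	}
}
```

In this sketch the five blocks land in three chunks: [400 400], [400] and [900 100]. No DAG structure is consulted, which is why only a sequential pass over the blocks is needed.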

func SplitTreewalk added in v0.1.0

func SplitTreewalk(ctx context.Context, r io.Reader, targetSize int, out chan io.Reader) error

Split a CAR file and create multiple smaller CAR files using the "treewalk" strategy. Note: the entire CAR is cached in memory. To split without holding the whole CAR in memory, use SplitTreewalkFromPath or SplitTreewalkFromBlockReader.

func SplitTreewalkFromBlockReader added in v0.2.0

func SplitTreewalkFromBlockReader(ctx context.Context, root cid.Cid, br CarBlockReader, targetSize int, out chan io.Reader) error

Split a CAR file (passed as a root CID and a block reader populated with the blocks from the CAR) and create multiple smaller CAR files using the "treewalk" strategy.

func SplitTreewalkFromPath added in v0.2.0

func SplitTreewalkFromPath(ctx context.Context, path string, targetSize int, out chan io.Reader) error

Split a CAR file found on disk at the given path and create multiple smaller CAR files using the "treewalk" strategy.
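
The treewalk idea (traverse from the root, follow links, and give every output the same root) can be sketched on a toy DAG. This is one plausible reading of the description, not the library's code: toyNode and splitTreewalkToy are hypothetical names, block IDs are plain strings, and repeating a block's ancestors at the start of each new chunk is an assumption made so that every chunk stays connected to the root.

```go
package main

import "fmt"

// toyNode stands in for a decoded block: its size in bytes and the IDs it links to.
type toyNode struct {
	Size  int
	Links []string
}

// splitTreewalkToy walks the DAG depth-first from root and packs block IDs into
// chunks of roughly targetSize bytes. When a new chunk starts, the ancestors of
// the current block are repeated so every chunk begins at the root. Because of
// the repeated ancestors, a chunk may exceed targetSize.
func splitTreewalkToy(dag map[string]toyNode, root string, targetSize int) [][]string {
	var chunks [][]string
	var current []string
	currentSize := 0
	inCurrent := map[string]bool{} // IDs already in the current chunk
	visited := map[string]bool{}   // IDs already walked, so shared nodes are packed once

	add := func(id string) {
		if !inCurrent[id] {
			inCurrent[id] = true
			current = append(current, id)
			currentSize += dag[id].Size
		}
	}

	var walk func(id string, ancestors []string)
	walk = func(id string, ancestors []string) {
		if visited[id] {
			return
		}
		visited[id] = true
		// Close the current chunk if this block would push it past the target.
		if len(current) > 0 && currentSize+dag[id].Size > targetSize {
			chunks = append(chunks, current)
			current, currentSize = nil, 0
			inCurrent = map[string]bool{}
		}
		for _, a := range ancestors {
			add(a)
		}
		add(id)
		// Copy the path so sibling subtrees never share a backing array.
		path := append(append([]string{}, ancestors...), id)
		for _, link := range dag[id].Links {
			walk(link, path)
		}
	}
	walk(root, nil)
	if len(current) > 0 {
		chunks = append(chunks, current)
	}
	return chunks
}

func main() {
	dag := map[string]toyNode{
		"root": {Size: 100, Links: []string{"a", "b"}},
		"a":    {Size: 300, Links: []string{"a1", "a2"}},
		"b":    {Size: 300, Links: []string{"b1"}},
		"a1":   {Size: 400},
		"a2":   {Size: 400},
		"b1":   {Size: 400},
	}
	for i, c := range splitTreewalkToy(dag, "root", 1000) {
		fmt.Printf("chunk-%d: %v\n", i, c)
	}
}
```

With a target of 1000 the toy DAG yields three chunks, [root a a1], [root a a2] and [root b b1]: each starts at the same root but carries a different part of the graph.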

Types

type CarBlockReader added in v0.1.0

type CarBlockReader interface {
	Get(cid.Cid) (blocks.Block, error)
}
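
CarBlockReader is documented only by its signature. As a rough illustration of the single-method lookup shape it implies, the sketch below uses stand-in types so that it stays self-contained: toyBlockReader and memBlocks are hypothetical, with string keys in place of cid.Cid and raw bytes in place of blocks.Block.

```go
package main

import "fmt"

// toyBlockReader mirrors the shape of CarBlockReader with stand-in types:
// string keys instead of cid.Cid and raw bytes instead of blocks.Block.
type toyBlockReader interface {
	Get(id string) ([]byte, error)
}

// memBlocks is a map-backed toyBlockReader, handy for tests.
type memBlocks map[string][]byte

// Get returns the block stored under id, or an error if it is absent.
func (m memBlocks) Get(id string) ([]byte, error) {
	data, ok := m[id]
	if !ok {
		return nil, fmt.Errorf("block %q not found", id)
	}
	return data, nil
}

func main() {
	var br toyBlockReader = memBlocks{"root": []byte("hello")}
	data, err := br.Get("root")
	fmt.Println(string(data), err)
	_, err = br.Get("missing")
	fmt.Println(err)
}
```

A map keeps every block in memory, so it suits tests and small inputs; a store that reads blocks from disk on demand would be the more natural fit when the goal is to avoid caching the whole CAR.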

type Strategy

type Strategy int

Strategy describes how CAR files should be split.

const (
	// Simple is fast but naive, only the first CAR output has a root CID,
	// subsequent CARs have a placeholder "empty" CID.
	Simple Strategy = iota
	// Treewalk walks the DAG to pack sub-graphs into each CAR file that is
	// output. Every CAR has the same root CID, but contains a different portion
	// of the DAG.
	Treewalk
)

Directories

Path Synopsis
cmd module
