chunk


README

go-dms3-chunker


go-ipfs-chunker implements data Splitters for go-ipfs.

go-ipfs-chunker provides the Splitter interface. IPFS splitters read data from a reader and create "chunks". These chunks are used to build the IPFS DAGs (Merkle trees) and are the base unit used to obtain the checksums that IPFS uses to address content.

The package provides a SizeSplitter, which creates chunks of equal size and is used by default in most cases, and a Rabin fingerprint chunker. The Rabin chunker attempts to split data so that the resulting blocks are identical when the data contains repeated patterns, thus optimizing the resulting DAGs.

Lead Maintainer

Steven Allen

Table of Contents

Install

go-ipfs-chunker works like a regular Go module:

> go get github.com/ipfs/go-ipfs-chunker

Usage

import "github.com/ipfs/go-ipfs-chunker"

Check the GoDoc documentation
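
As a minimal sketch of how a splitter is typically driven (assuming, as is conventional for this interface, that NextBytes returns io.EOF once the input is exhausted; the file name is just a placeholder):

package main

import (
	"fmt"
	"io"
	"os"

	chunk "github.com/ipfs/go-ipfs-chunker"
)

func main() {
	f, err := os.Open("data.bin") // placeholder input; any io.Reader works
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// DefaultSplitter produces fixed-size chunks of DefaultBlockSize (256 KiB).
	s := chunk.DefaultSplitter(f)
	for {
		b, err := s.NextBytes()
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err)
		}
		fmt.Printf("chunk of %d bytes\n", len(b))
	}
}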

Contribute

PRs accepted.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT © Protocol Labs, Inc.

Documentation

Overview

Package chunk implements streaming block splitters. Splitters read data from a reader and provide byte slices (chunks). The size and contents of these slices depend on the splitting method used.

Index

Constants

const (
	// DefaultBlockSize is the chunk size that splitters produce (or aim to).
	DefaultBlockSize int64 = 1024 * 256

	// No leaf block should contain more than 1MiB of payload data (wrapping overhead aside).
	// This effectively mandates the maximum chunk size.
	// See the discussion at https://gitlab.dms3.io/dms3/go-dms3-chunker/pull/21#discussion_r369124879 for background.
	ChunkSizeLimit int = 1048576
)

Variables

var (
	ErrRabinMin = errors.New("rabin min must be greater than 16")
	ErrSize     = errors.New("chunker size must be greater than 0")
	ErrSizeMax  = fmt.Errorf("chunker parameters may not exceed the maximum chunk size of %d", ChunkSizeLimit)
)
var Dms3RabinPoly = chunker.Pol(17437180132763653)

Dms3RabinPoly is the irreducible polynomial of degree 53 used for Rabin fingerprinting.

Functions

func Chan

func Chan(s Splitter) (<-chan []byte, <-chan error)

Chan returns a channel that receives each of the chunks produced by a splitter, along with a second channel for errors.
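
A sketch of consuming a splitter through Chan, assuming r is an io.Reader already in scope, that both channels are closed once the splitter is exhausted, and that the terminal io.EOF (if any) is delivered on the error channel:

s := chunk.DefaultSplitter(r)
chunks, errs := chunk.Chan(s)
for b := range chunks {
	fmt.Printf("received a %d-byte chunk\n", len(b))
}
// Drain the error channel; a final io.EOF (if delivered there) is not a failure.
if err := <-errs; err != nil && err != io.EOF {
	log.Fatal(err)
}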

Types

type Buzhash

type Buzhash struct {
	// contains filtered or unexported fields
}

func NewBuzhash

func NewBuzhash(r io.Reader) *Buzhash

func (*Buzhash) NextBytes

func (b *Buzhash) NextBytes() ([]byte, error)

func (*Buzhash) Reader

func (b *Buzhash) Reader() io.Reader
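
Buzhash takes no tuning parameters; like the other content-defined splitters it is driven through NextBytes, which is assumed here to signal the end of the stream with io.EOF. A brief sketch, where data is a placeholder byte slice:

// Chunk boundaries come from a rolling hash over the content, so identical
// regions of input tend to produce identical chunks.
b := chunk.NewBuzhash(bytes.NewReader(data))
piece, err := b.NextBytes() // next content-defined chunk, or io.EOF when done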

type Rabin

type Rabin struct {
	// contains filtered or unexported fields
}

Rabin implements the Splitter interface and splits content with Rabin fingerprints.

func NewRabin

func NewRabin(r io.Reader, avgBlkSize uint64) *Rabin

NewRabin creates a new Rabin splitter with the given average block size.

func NewRabinMinMax

func NewRabinMinMax(r io.Reader, min, avg, max uint64) *Rabin

NewRabinMinMax returns a new Rabin splitter which uses the given min, average and max block sizes.
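
For illustration, the two constructors side by side (the sizes below are arbitrary examples, not recommendations; no parameter may exceed ChunkSizeLimit):

// Average block size only; min and max are derived internally.
r1 := chunk.NewRabin(reader, 256<<10) // ~256 KiB average

// Explicit bounds: min 128 KiB, average 256 KiB, max 512 KiB.
r2 := chunk.NewRabinMinMax(reader, 128<<10, 256<<10, 512<<10)

Both values satisfy the Splitter interface, so they are driven with the same NextBytes loop shown earlier.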

func (*Rabin) NextBytes

func (r *Rabin) NextBytes() ([]byte, error)

NextBytes reads the next bytes from the reader and returns a slice.

func (*Rabin) Reader

func (r *Rabin) Reader() io.Reader

Reader returns the io.Reader associated with this Splitter.

type Splitter

type Splitter interface {
	Reader() io.Reader
	NextBytes() ([]byte, error)
}

A Splitter reads bytes from a Reader and creates "chunks" (byte slices) that can be used to build DAG nodes.

func DefaultSplitter

func DefaultSplitter(r io.Reader) Splitter

DefaultSplitter returns a SizeSplitter with the DefaultBlockSize.

func FromString

func FromString(r io.Reader, chunker string) (Splitter, error)

FromString returns a Splitter depending on the given string: it supports "default" (""), "size-{size}", "rabin", "rabin-{blocksize}", "rabin-{min}-{avg}-{max}" and "buzhash".
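
A sketch of selecting a splitter from a configuration string (the spec string and error handling are illustrative; r is assumed to be an io.Reader in scope):

s, err := chunk.FromString(r, "size-1024") // fixed 1 KiB chunks
if err != nil {
	// Possible failures include ErrSize for a non-positive size and
	// ErrSizeMax when a parameter exceeds ChunkSizeLimit.
	log.Fatal(err)
}
// s is then driven with NextBytes like any other Splitter.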

func NewSizeSplitter

func NewSizeSplitter(r io.Reader, size int64) Splitter

NewSizeSplitter returns a new size-based Splitter with the given block size.

type SplitterGen

type SplitterGen func(r io.Reader) Splitter

SplitterGen is a splitter generator, given a reader.

func SizeSplitterGen

func SizeSplitterGen(size int64) SplitterGen

SizeSplitterGen returns a SplitterGen function which will create a splitter with the given size when called.
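
This is handy when many readers need to be chunked with the same settings. A sketch, where dataA and dataB are placeholder byte slices:

// Build one generator and reuse it; each call returns an independent
// Splitter bound to its own reader.
gen := chunk.SizeSplitterGen(64 << 10) // 64 KiB chunks; the size is illustrative
s1 := gen(bytes.NewReader(dataA))
s2 := gen(bytes.NewReader(dataB))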

Directories

Path Synopsis
This file generates bytehash LUT
