stitch

package module
v0.0.1-demo
Published: Aug 20, 2022 License: MPL-2.0 Imports: 17 Imported by: 0

README

stitch 🩹

Compress, encrypt, and split files into pieces. Then stitch them together again.

# Building
make

# Testing
make test

# View documentation
make doc

Once you've executed make doc, you can view the documentation in your browser at http://localhost:6060/pkg/github.com/OhanaFS/stitch/

How it works

[Diagram: the stitch encoding pipeline]

Use the command-line interface

Currently there is a basic CLI for the pipeline encoder:

go run ./cmd/stitch pipeline --help

To encode files, use the -input flag:

go run ./cmd/stitch pipeline -input file.bin

The command will create file.bin.shardX files in the same directory. To decode, use the -output flag:

go run ./cmd/stitch pipeline -output file.bin

The command will look for file.bin.shardX files and use them to reconstruct file.bin.

Documentation

Overview

Stitch is a tool to compress, encrypt, and split any data into a set of shards.

Example

A simple example to demonstrate how to use the Encoder and ReadSeeker.

package main

import (
	"fmt"
	"io"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	// Create a new encoder.
	encoder := stitch.NewEncoder(&stitch.EncoderOptions{
		DataShards:   2,
		ParityShards: 1,
		KeyThreshold: 2,
	})

	// Open the input file.
	input, _ := os.Open("input.txt")
	defer input.Close()

	// Open the output files.
	out1, _ := os.Create("output.shard1")
	defer out1.Close()
	out2, _ := os.Create("output.shard2")
	defer out2.Close()

	// Use a dummy key and IV.
	key := []byte("00000000000000000000000000000000")
	iv := []byte("000000000000")

	// Encode the data.
	result, _ := encoder.Encode(input, []io.Writer{out1, out2}, key, iv)
	fmt.Printf("File size: %d\n", result.FileSize)
	fmt.Printf("File hash: %x\n", result.FileHash)

	// Finalize the shard headers. The shards are not usable until
	// FinalizeHeader is called on each one.
	_ = encoder.FinalizeHeader(out1)
	_ = encoder.FinalizeHeader(out2)

	// Decode the data.
	reader, _ := encoder.NewReadSeeker([]io.ReadSeeker{out1, out2}, key, iv)
	io.Copy(os.Stdout, reader)
}


Constants

This section is empty.

Variables

var (
	ErrShardCountMismatch = errors.New("shard count mismatch")
	ErrNonSeekableWriter  = errors.New("shards must support seeking")
	ErrNotEnoughKeyShards = errors.New("not enough shards to reconstruct the file key")
	ErrNotEnoughShards    = errors.New("not enough shards to reconstruct the file")
	ErrNoCompleteHeader   = errors.New("no complete header found")
)
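
These are sentinel values, so callers can distinguish them with errors.Is. A brief sketch, assuming the standard errors, fmt, and io imports (the openReader helper is hypothetical, not part of the package):

func openReader(encoder *stitch.Encoder, shards []io.ReadSeeker, key, iv []byte) (io.ReadSeeker, error) {
	reader, err := encoder.NewReadSeeker(shards, key, iv)
	if errors.Is(err, stitch.ErrNotEnoughShards) {
		// Too many shards are missing or damaged to reconstruct the file.
		return nil, fmt.Errorf("unrecoverable: %w", err)
	}
	return reader, err
}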

Functions

This section is empty.

Types

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder takes in a stream of data and shards it into a specified number of data and parity shards. It includes compression using zstd, encryption using AES-GCM, and splitting the data into equal-sized shards using Reed-Solomon.

It follows this process to encode the data into multiple shards:

  1. Generate a random key Kr
  2. Generate N output streams So_n
  3. Generate a file header
  4. Encrypt Kr with user-supplied key Ku, and embed it into the file header
  5. Write the header to So_n
  6. Take a byte stream of user-supplied data Sd and pipe it to the compressor C
  7. Pipe the output of C into a streaming symmetric encryption method E, which uses Kr to encrypt
  8. Pipe the output of E into Reed-Solomon encoder to get N output streams RS_n
  9. Pipe the output of RS_n to So_n
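
To make steps 1 and 4 concrete, here is a minimal sketch of the key-wrapping part of the pipeline. It is not the package's internal code: it assumes AES-GCM for wrapping Kr with Ku, and uses github.com/hashicorp/vault/shamir to split the wrapped key, mirroring the Shamir's Secret Sharing scheme mentioned in the RotateKeys documentation below. The 3-share/2-threshold parameters are illustrative.

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"

	"github.com/hashicorp/vault/shamir"
)

func main() {
	// Step 1: generate a random file key Kr.
	kr := make([]byte, 32)
	if _, err := rand.Read(kr); err != nil {
		panic(err)
	}

	// Step 4: encrypt Kr with the user-supplied key Ku using AES-GCM.
	ku := []byte("00000000000000000000000000000000") // dummy 32-byte key
	block, err := aes.NewCipher(ku)
	if err != nil {
		panic(err)
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		panic(err)
	}
	iv := []byte("000000000000") // dummy 12-byte nonce
	wrapped := gcm.Seal(nil, iv, kr, nil)

	// Split the wrapped key into 3 shares, any 2 of which suffice to
	// reconstruct it (cf. KeyThreshold in EncoderOptions).
	shares, err := shamir.Split(wrapped, 3, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("wrapped key: %x\n", wrapped)
	fmt.Printf("key shares: %d\n", len(shares))
}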

func NewEncoder

func NewEncoder(opts *EncoderOptions) *Encoder

func (*Encoder) Encode

func (e *Encoder) Encode(data io.Reader, shards []io.Writer, key []byte, iv []byte) (*EncodingResult, error)

Encode takes in a reader, performs the transformations and then splits the data into multiple shards, writing them to the output writers. The output writers are not closed after the data is written.

After the data has finished encoding, a header will be written to the end of each shard. At this point the shards are not yet usable; the header must first be finalized using the FinalizeHeader() function.

func (*Encoder) FinalizeHeader

func (e *Encoder) FinalizeHeader(shard io.ReadWriteSeeker) error

FinalizeHeader rewrites the shard header with the one located at the end of the shard. If the provided shard is an *os.File, the header at the end of the file will be truncated.

func (*Encoder) NewReadSeeker

func (e *Encoder) NewReadSeeker(shards []io.ReadSeeker, key []byte, iv []byte) (
	io.ReadSeeker, error,
)

NewReadSeeker returns a new ReadSeeker that can be used to access the data contained within the shards.

func (*Encoder) RotateKeys

func (e *Encoder) RotateKeys(shards []io.ReadSeeker,
	previousKey, previousIv, newKey, newIv []byte) ([][]byte, error)

RotateKeys reads the header from the supplied shards, reconstructs the file key, and decrypts it with the supplied key and IV. It then re-encrypts the file key with the new key and IV, and splits it with Shamir's Secret Sharing Scheme. The resulting key splits are returned.

The caller must then use the UpdateShardKey() function to update each shard's header to use the new key splits.
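
A sketch of the full rotation flow, assuming the shards are open *os.File handles (which satisfy io.ReadSeeker for RotateKeys and io.ReadWriteSeeker for UpdateShardKey) and that RotateKeys returns one key split per shard, in shard order. File names and keys are illustrative.

package main

import (
	"io"
	"log"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	encoder := stitch.NewEncoder(&stitch.EncoderOptions{
		DataShards:   2,
		ParityShards: 1,
		KeyThreshold: 2,
	})

	// Open the shard files read-write so their headers can be updated.
	names := []string{"output.shard1", "output.shard2"}
	files := make([]*os.File, len(names))
	readers := make([]io.ReadSeeker, len(names))
	for i, name := range names {
		f, err := os.OpenFile(name, os.O_RDWR, 0)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()
		files[i], readers[i] = f, f
	}

	oldKey := []byte("00000000000000000000000000000000")
	oldIv := []byte("000000000000")
	newKey := []byte("11111111111111111111111111111111")
	newIv := []byte("111111111111")

	// Re-encrypt the file key and obtain the new key splits.
	splits, err := encoder.RotateKeys(readers, oldKey, oldIv, newKey, newIv)
	if err != nil {
		log.Fatal(err)
	}

	// Write each new key split back into its shard's header.
	for i, f := range files {
		if err := encoder.UpdateShardKey(f, splits[i]); err != nil {
			log.Fatal(err)
		}
	}
}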

func (*Encoder) UpdateShardKey

func (*Encoder) UpdateShardKey(shard io.ReadWriteSeeker, newKeySplit []byte) error

UpdateShardKey updates the header of the supplied shard with the new key split. The header is then written to the shard. To obtain a new key split, use the RotateKeys() function.

func (*Encoder) VerifyIntegrity

func (e *Encoder) VerifyIntegrity(shards []io.ReadSeeker) (*VerificationResult, error)

VerifyIntegrity tries to read and verify the integrity of all the provided shards. An error is returned if it is not possible to recover the original file.
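
For example, a periodic health check might look like the following sketch (shard file names are illustrative; the VerificationResult fields are documented below):

package main

import (
	"fmt"
	"io"
	"log"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	encoder := stitch.NewEncoder(&stitch.EncoderOptions{
		DataShards:   2,
		ParityShards: 1,
		KeyThreshold: 2,
	})

	names := []string{"output.shard1", "output.shard2"}
	shards := make([]io.ReadSeeker, len(names))
	for i, name := range names {
		f, err := os.Open(name)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()
		shards[i] = f
	}

	result, err := encoder.VerifyIntegrity(shards)
	if err != nil {
		log.Fatal(err) // the original file cannot be recovered
	}
	switch {
	case result.AllGood:
		fmt.Println("all shards are healthy")
	case result.FullyReadable:
		fmt.Println("some damage, but the file is still recoverable")
	default:
		fmt.Printf("irrecoverable blocks: %v\n", result.IrrecoverableBlocks)
	}
}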

type EncoderOptions

type EncoderOptions struct {
	// DataShards is the total number of shards to split data into.
	DataShards uint8
	// ParityShards is the total number of parity shards to create. This also
	// determines the maximum number of shards that can be lost before the data
	// cannot be recovered.
	ParityShards uint8
	// KeyThreshold is the minimum number of shards required to reconstruct the
	// key used to encrypt the data.
	KeyThreshold uint8
}

EncoderOptions specifies options for the Encoder.

type EncodingResult

type EncodingResult struct {
	// FileSize is the size of the input file in bytes.
	FileSize uint64
	// FileHash is the SHA256 hash of the input file.
	FileHash []byte
}

type ShardVerificationResult

type ShardVerificationResult struct {
	// IsAvailable specifies whether the shard is readable at all.
	IsAvailable bool
	// IsHeaderComplete specifies whether the header in the shard is marked as
	// complete. An incomplete header indicates either a corrupt header, or a
	// shard that hasn't been finalized.
	IsHeaderComplete bool
	// ShardIndex is the index of the shard as specified in the header.
	ShardIndex int
	// BlocksCount is the number of blocks that are supposed to be in the file,
	// as calculated from the shard's header.
	BlocksCount int
	// BlocksFound is the total number of blocks actually found in the shard.
	BlocksFound int
	// BrokenBlocks is a slice of block indices that are corrupted, starting from
	// zero.
	BrokenBlocks []int
}

func VerifyShardIntegrity

func VerifyShardIntegrity(shard io.Reader) (*ShardVerificationResult, error)

VerifyShardIntegrity tries to read through an entire shard, and report back any issues. If the shard is unreadable, an error will be returned.
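
A sketch of checking a single shard on disk (the file name is illustrative):

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	f, err := os.Open("output.shard1")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	res, err := stitch.VerifyShardIntegrity(f)
	if err != nil {
		log.Fatal(err) // the shard is unreadable
	}
	fmt.Printf("shard %d: found %d of %d blocks, %d broken\n",
		res.ShardIndex, res.BlocksFound, res.BlocksCount, len(res.BrokenBlocks))
}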

type VerificationResult

type VerificationResult struct {
	// TotalShards is the total number of shards.
	TotalShards int
	// AllGood specifies whether all chunks of all shards are readable and
	// have no issues.
	AllGood bool
	// FullyReadable specifies whether it is possible to fully read and/or recover
	// the file.
	FullyReadable bool
	// ByShard contains a breakdown of issues per shard.
	ByShard []ShardVerificationResult
	// IrrecoverableBlocks is a slice of block indices that have fewer healthy
	// shards than is required to recover.
	IrrecoverableBlocks []int
}

Directories

Path Synopsis
cmd
