stitch

package module
v0.0.1-demo
Published: Aug 20, 2022 License: MPL-2.0 Imports: 17 Imported by: 0

README

stitch 🩹

Compress, encrypt, and split files into pieces. Then stitch them together again.

# Building
make

# Testing
make test

# View documentation
make doc

Once you've executed make doc, you can view the documentation in your browser at http://localhost:6060/pkg/github.com/OhanaFS/stitch/

How it works

[Diagram: the stitch encoding pipeline]

Use the command-line interface

Currently there is a basic CLI for the pipeline encoder:

go run ./cmd/stitch pipeline --help

To encode files, use the -input flag:

go run ./cmd/stitch pipeline -input file.bin

The command will create file.bin.shardX files in the same directory. To decode, use the -output flag:

go run ./cmd/stitch pipeline -output file.bin

The command will look for file.bin.shardX files and use them to reconstruct file.bin.

Documentation

Overview

Stitch is a tool to compress, encrypt, and split any data into a set of shards.

Example

A simple example to demonstrate how to use the Encoder and ReadSeeker.

package main

import (
	"fmt"
	"io"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	// Create a new encoder.
	encoder := stitch.NewEncoder(&stitch.EncoderOptions{
		DataShards:   2,
		ParityShards: 1,
		KeyThreshold: 2,
	})

	// Open the input file.
	input, _ := os.Open("input.txt")
	defer input.Close()

	// Open the output files.
	out1, _ := os.Create("output.shard1")
	defer out1.Close()
	out2, _ := os.Create("output.shard2")
	defer out2.Close()

	// Use a dummy key and IV.
	key := []byte("00000000000000000000000000000000")
	iv := []byte("000000000000")

	// Encode the data.
	result, _ := encoder.Encode(input, []io.Writer{out1, out2}, key, iv)
	fmt.Printf("File size: %d\n", result.FileSize)
	fmt.Printf("File hash: %x\n", result.FileHash)

	// Finalize the shard headers. The shards are not usable until
	// FinalizeHeader is called on each one.
	_ = encoder.FinalizeHeader(out1)
	_ = encoder.FinalizeHeader(out2)

	// Decode the data.
	reader, _ := encoder.NewReadSeeker([]io.ReadSeeker{out1, out2}, key, iv)
	io.Copy(os.Stdout, reader)
}


Constants

This section is empty.

Variables

var (
	ErrShardCountMismatch = errors.New("shard count mismatch")
	ErrNonSeekableWriter  = errors.New("shards must support seeking")
	ErrNotEnoughKeyShards = errors.New("not enough shards to reconstruct the file key")
	ErrNotEnoughShards    = errors.New("not enough shards to reconstruct the file")
	ErrNoCompleteHeader   = errors.New("no complete header found")
)
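
These are sentinel values, so callers can distinguish them with errors.Is. A brief sketch, assuming the standard errors, fmt, and io imports (the openReader helper is hypothetical, not part of the package):

func openReader(encoder *stitch.Encoder, shards []io.ReadSeeker, key, iv []byte) (io.ReadSeeker, error) {
	reader, err := encoder.NewReadSeeker(shards, key, iv)
	if errors.Is(err, stitch.ErrNotEnoughShards) {
		// Too many shards are missing or damaged to reconstruct the file.
		return nil, fmt.Errorf("unrecoverable: %w", err)
	}
	return reader, err
}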

Functions

This section is empty.

Types

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder takes in a stream of data and shards it into a specified number of data and parity shards. It includes compression using zstd, encryption using AES-GCM, and splitting the data into equal-sized shards using Reed-Solomon.

It follows this process to encode the data into multiple shards:

  1. Generate a random key Kr
  2. Generate N output streams So_n
  3. Generate a file header
  4. Encrypt Kr with user-supplied key Ku, and embed it into the file header
  5. Write the header to So_n
  6. Take a byte stream of user-supplied data Sd and pipe it to the compressor C
  7. Pipe the output of C into a streaming symmetric encryption method E, which uses Kr to encrypt
  8. Pipe the output of E into Reed-Solomon encoder to get N output streams RS_n
  9. Pipe the output of RS_n to So_n
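
To make steps 1 and 4 concrete, here is a minimal sketch of the key-wrapping part of the pipeline. It is not the package's internal code: it assumes AES-GCM for wrapping Kr with Ku, and uses github.com/hashicorp/vault/shamir to split the wrapped key, mirroring the Shamir's Secret Sharing scheme mentioned in the RotateKeys documentation below. The 3-share/2-threshold parameters are illustrative.

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"

	"github.com/hashicorp/vault/shamir"
)

func main() {
	// Step 1: generate a random file key Kr.
	kr := make([]byte, 32)
	if _, err := rand.Read(kr); err != nil {
		panic(err)
	}

	// Step 4: encrypt Kr with the user-supplied key Ku using AES-GCM.
	ku := []byte("00000000000000000000000000000000") // dummy 32-byte key
	block, err := aes.NewCipher(ku)
	if err != nil {
		panic(err)
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		panic(err)
	}
	iv := []byte("000000000000") // dummy 12-byte nonce
	wrapped := gcm.Seal(nil, iv, kr, nil)

	// Split the wrapped key into 3 shares, any 2 of which suffice to
	// reconstruct it (cf. KeyThreshold in EncoderOptions).
	shares, err := shamir.Split(wrapped, 3, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("wrapped key: %x\n", wrapped)
	fmt.Printf("key shares: %d\n", len(shares))
}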

func NewEncoder

func NewEncoder(opts *EncoderOptions) *Encoder

func (*Encoder) Encode

func (e *Encoder) Encode(data io.Reader, shards []io.Writer, key []byte, iv []byte) (*EncodingResult, error)

Encode takes in a reader, performs the transformations and then splits the data into multiple shards, writing them to the output writers. The output writers are not closed after the data is written.

After the data has finished encoding, a header will be written to the end of each shard. At this point the shards are not yet usable; the header must first be finalized using the FinalizeHeader() function.

func (*Encoder) FinalizeHeader

func (e *Encoder) FinalizeHeader(shard io.ReadWriteSeeker) error

FinalizeHeader rewrites the shard header with the one located at the end of the shard. If the provided shard is an *os.File, the header at the end of the file will be truncated.

func (*Encoder) NewReadSeeker

func (e *Encoder) NewReadSeeker(shards []io.ReadSeeker, key []byte, iv []byte) (
	io.ReadSeeker, error,
)

NewReadSeeker returns a new ReadSeeker that can be used to access the data contained within the shards.

func (*Encoder) RotateKeys

func (e *Encoder) RotateKeys(shards []io.ReadSeeker,
	previousKey, previousIv, newKey, newIv []byte) ([][]byte, error)

RotateKeys reads the header from the supplied shards, reconstructs the file key, and decrypts it with the supplied key and IV. It then re-encrypts the file key with the new key and IV, and splits it with Shamir's Secret Sharing Scheme. The resulting key splits are returned.

The caller must then use the UpdateShardKey() function to update each shard's header to use the new key splits.
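
A sketch of the full rotation flow, assuming the shards are open *os.File handles (which satisfy io.ReadSeeker for RotateKeys and io.ReadWriteSeeker for UpdateShardKey) and that RotateKeys returns one key split per shard, in shard order. File names and keys are illustrative.

package main

import (
	"io"
	"log"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	encoder := stitch.NewEncoder(&stitch.EncoderOptions{
		DataShards:   2,
		ParityShards: 1,
		KeyThreshold: 2,
	})

	// Open the shard files read-write so their headers can be updated.
	names := []string{"output.shard1", "output.shard2"}
	files := make([]*os.File, len(names))
	readers := make([]io.ReadSeeker, len(names))
	for i, name := range names {
		f, err := os.OpenFile(name, os.O_RDWR, 0)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()
		files[i], readers[i] = f, f
	}

	oldKey := []byte("00000000000000000000000000000000")
	oldIv := []byte("000000000000")
	newKey := []byte("11111111111111111111111111111111")
	newIv := []byte("111111111111")

	// Re-encrypt the file key and obtain the new key splits.
	splits, err := encoder.RotateKeys(readers, oldKey, oldIv, newKey, newIv)
	if err != nil {
		log.Fatal(err)
	}

	// Write each new key split back into its shard's header.
	for i, f := range files {
		if err := encoder.UpdateShardKey(f, splits[i]); err != nil {
			log.Fatal(err)
		}
	}
}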

func (*Encoder) UpdateShardKey

func (*Encoder) UpdateShardKey(shard io.ReadWriteSeeker, newKeySplit []byte) error

UpdateShardKey updates the header of the supplied shard with the new key split. The header is then written to the shard. To obtain a new key split, use the RotateKeys() function.

func (*Encoder) VerifyIntegrity

func (e *Encoder) VerifyIntegrity(shards []io.ReadSeeker) (*VerificationResult, error)

VerifyIntegrity tries to read and verify the integrity of all the provided shards. An error is returned if it is not possible to recover the original file.
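
For example, a periodic health check might look like the following sketch (shard file names are illustrative; the VerificationResult fields are documented below):

package main

import (
	"fmt"
	"io"
	"log"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	encoder := stitch.NewEncoder(&stitch.EncoderOptions{
		DataShards:   2,
		ParityShards: 1,
		KeyThreshold: 2,
	})

	names := []string{"output.shard1", "output.shard2"}
	shards := make([]io.ReadSeeker, len(names))
	for i, name := range names {
		f, err := os.Open(name)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()
		shards[i] = f
	}

	result, err := encoder.VerifyIntegrity(shards)
	if err != nil {
		log.Fatal(err) // the original file cannot be recovered
	}
	switch {
	case result.AllGood:
		fmt.Println("all shards are healthy")
	case result.FullyReadable:
		fmt.Println("some damage, but the file is still recoverable")
	default:
		fmt.Printf("irrecoverable blocks: %v\n", result.IrrecoverableBlocks)
	}
}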

type EncoderOptions

type EncoderOptions struct {
	// DataShards is the total number of shards to split data into.
	DataShards uint8
	// ParityShards is the total number of parity shards to create. This also
	// determines the maximum number of shards that can be lost before the data
	// cannot be recovered.
	ParityShards uint8
	// KeyThreshold is the minimum number of shards required to reconstruct the
	// key used to encrypt the data.
	KeyThreshold uint8
}

EncoderOptions specifies options for the Encoder.

type EncodingResult

type EncodingResult struct {
	// FileSize is the size of the input file in bytes.
	FileSize uint64
	// FileHash is the SHA256 hash of the input file.
	FileHash []byte
}

type ShardVerificationResult

type ShardVerificationResult struct {
	// IsAvailable specifies whether the shard is readable at all.
	IsAvailable bool
	// IsHeaderComplete specifies whether the header in the shard is marked as
	// complete. An incomplete header indicates either a corrupt header, or a
	// shard that hasn't been finalized.
	IsHeaderComplete bool
	// ShardIndex is the index of the shard as specified in the header.
	ShardIndex int
	// BlocksCount is the number of blocks that are supposed to be in the file,
	// as calculated from the shard's header.
	BlocksCount int
	// BlocksFound is the total number of blocks actually found in the shard.
	BlocksFound int
	// BrokenBlocks is a slice of block indices that are corrupted, starting from
	// zero.
	BrokenBlocks []int
}

func VerifyShardIntegrity

func VerifyShardIntegrity(shard io.Reader) (*ShardVerificationResult, error)

VerifyShardIntegrity tries to read through an entire shard, and report back any issues. If the shard is unreadable, an error will be returned.
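
A sketch of checking a single shard on disk (the file name is illustrative):

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/OhanaFS/stitch"
)

func main() {
	f, err := os.Open("output.shard1")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	res, err := stitch.VerifyShardIntegrity(f)
	if err != nil {
		log.Fatal(err) // the shard is unreadable
	}
	fmt.Printf("shard %d: found %d of %d blocks, %d broken\n",
		res.ShardIndex, res.BlocksFound, res.BlocksCount, len(res.BrokenBlocks))
}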

type VerificationResult

type VerificationResult struct {
	// TotalShards is the total number of shards.
	TotalShards int
	// AllGood specifies whether all chunks of all shards are readable and
	// have no issues.
	AllGood bool
	// FullyReadable specifies whether it is possible to fully read and/or recover
	// the file.
	FullyReadable bool
	// ByShard contains a breakdown of issues per shard.
	ByShard []ShardVerificationResult
	// IrrecoverableBlocks is a slice of block indices that have fewer healthy
	// shards than is required to recover.
	IrrecoverableBlocks []int
}

Directories

Path Synopsis
cmd
