bsdiff

package
v0.0.0-...-189a019
Warning

This package is not in the latest version of its module.

Published: Jan 24, 2022 License: MIT, MIT Imports: 14 Imported by: 5

README

bsdiff

Benchmarking

To run the benchmarks, you'll need to grab the sample data files from this page: https://www.cs.princeton.edu/~rs/strings/

You can run ./grab_testdata.sh to download them on a system that has sh and curl.

Documentation

Overview

Package bsdiff is a generated protocol buffer package.

It is generated from these files:

bsdiff/bsdiff.proto

It has these top-level messages:

Control

Constants

const Cutoff = 16

const MaxFileSize = int64(math.MaxInt32 - 1)

MaxFileSize is the largest size bsdiff will diff (for both the old and the new file): math.MaxInt32 - 1 bytes, just under 2 GiB. A different codepath could be used for larger files, at the cost of unreasonable memory usage (even in 2016). If a big corporate user is willing to sponsor that part of the code, get in touch! Fair warning though: it won't be covered, our CI workers don't have that much RAM :)
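
A caller may want to reject oversized inputs before starting a diff. The sketch below is one way to do that, not part of the package: `maxFileSize` is redeclared locally so the example compiles on its own, `checkDiffable` is a hypothetical helper, and treating the limit as inclusive is an assumption based on the wording "largest size".

```go
package main

import (
	"fmt"
	"math"
)

// maxFileSize mirrors the documented MaxFileSize constant. It is redeclared
// here only so that this sketch is self-contained.
const maxFileSize = int64(math.MaxInt32 - 1)

// checkDiffable is a hypothetical helper (not part of the package). It returns
// an error if either input is larger than the documented limit.
func checkDiffable(oldSize, newSize int64) error {
	if oldSize > maxFileSize {
		return fmt.Errorf("old file is %d bytes, limit is %d", oldSize, maxFileSize)
	}
	if newSize > maxFileSize {
		return fmt.Errorf("new file is %d bytes, limit is %d", newSize, maxFileSize)
	}
	return nil
}

func main() {
	// Two 1 MiB inputs are well within the limit.
	fmt.Println(checkDiffable(1<<20, 1<<20) == nil)
	// A 2 GiB old file is over the limit.
	fmt.Println(checkDiffable(1<<31, 1<<20) != nil)
}
```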

const MaxMessageSize int64 = 16 * 1024 * 1024

MaxMessageSize is the maximum number of bytes that will be stored in a protobuf message generated by bsdiff. This enables friendlier streaming apply at a small storage cost. TODO: actually use.
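
Since the TODO says the limit is not yet applied, the following is only an illustration of what capping message payloads could look like. `splitChunks` is a hypothetical helper, and `maxMessageSize` is a local copy of the documented value.

```go
package main

import "fmt"

// maxMessageSize mirrors the documented MaxMessageSize constant (16 MiB).
const maxMessageSize = 16 * 1024 * 1024

// splitChunks is a hypothetical helper. It cuts data into consecutive
// sub-slices of at most limit bytes each. The chunks alias data; nothing
// is copied.
func splitChunks(data []byte, limit int) [][]byte {
	if limit <= 0 {
		return nil
	}
	var chunks [][]byte
	for len(data) > limit {
		chunks = append(chunks, data[:limit])
		data = data[limit:]
	}
	if len(data) > 0 {
		chunks = append(chunks, data)
	}
	return chunks
}

func main() {
	payload := make([]byte, 40*1024*1024) // 40 MiB of zeroes
	for i, c := range splitChunks(payload, maxMessageSize) {
		fmt.Printf("chunk %d: %d bytes\n", i, len(c))
	}
}
```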

const SelectionSortThreshold = 16

Variables

var ErrCorrupt = errors.New("corrupt patch")

ErrCorrupt indicates that a patch is corrupted, most often because it would produce a longer file than specified.
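
A sentinel error like this is usually checked with `errors.Is`, which also matches when the error has been wrapped. The sketch below illustrates the "longer file than specified" case under stated assumptions: `errCorrupt` is a local stand-in for ErrCorrupt, and `boundedWriter` and `isCorrupt` are hypothetical names, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// errCorrupt stands in for the package's ErrCorrupt so the sketch compiles
// on its own.
var errCorrupt = errors.New("corrupt patch")

// boundedWriter is a hypothetical writer that counts bytes and fails as soon
// as a write would push the total past limit.
type boundedWriter struct {
	written, limit int64
}

func (w *boundedWriter) Write(p []byte) (int, error) {
	if w.written+int64(len(p)) > w.limit {
		// Wrap the sentinel with %w so callers can still match it.
		return 0, fmt.Errorf("output would exceed %d bytes: %w", w.limit, errCorrupt)
	}
	w.written += int64(len(p))
	return len(p), nil
}

// isCorrupt reports whether err is, or wraps, the sentinel.
func isCorrupt(err error) bool {
	return errors.Is(err, errCorrupt)
}

func main() {
	w := &boundedWriter{limit: 4}
	_, err := w.Write([]byte("hello")) // 5 bytes, limit is 4
	if isCorrupt(err) {
		fmt.Println("patch rejected:", err)
	}
}
```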

Functions

This section is empty.

Types

type AdderReader

type AdderReader struct {
	Buffer []byte
	Reader io.Reader
	// contains filtered or unexported fields
}

func (*AdderReader) Read

func (ar *AdderReader) Read(p []byte) (int, error)

type BucketGroup

type BucketGroup struct {
	// contains filtered or unexported fields
}

type Control

type Control struct {
	Add  []byte `protobuf:"bytes,1,opt,name=add,proto3" json:"add,omitempty"`
	Copy []byte `protobuf:"bytes,2,opt,name=copy,proto3" json:"copy,omitempty"`
	Seek int64  `protobuf:"varint,3,opt,name=seek" json:"seek,omitempty"`
	Eof  bool   `protobuf:"varint,4,opt,name=eof" json:"eof,omitempty"`
}

Control is a bsdiff operation, see https://twitter.com/fasterthanlime/status/790617515009437701
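
To make the fields concrete, here is a minimal sketch of how one such operation is consumed in the classic bspatch scheme: the Add bytes are summed byte-wise with the old data at the current position, the Copy bytes are emitted verbatim, and Seek then moves the old-file position by a relative offset. The `control` type and `apply` function are local illustrations, the Eof field is ignored, and the package's own Apply method may differ in its details.

```go
package main

import (
	"errors"
	"fmt"
)

// control is a local stand-in with the same data fields as Control.
type control struct {
	Add  []byte
	Copy []byte
	Seek int64
}

// apply executes one operation against old, starting at pos. It appends the
// produced bytes to out and returns the new position in old.
func apply(old []byte, pos int64, out []byte, c control) ([]byte, int64, error) {
	if pos < 0 || pos+int64(len(c.Add)) > int64(len(old)) {
		return out, pos, errors.New("add region runs past the old data")
	}
	for i, b := range c.Add {
		// Byte-wise addition, wrapping modulo 256.
		out = append(out, old[pos+int64(i)]+b)
	}
	pos += int64(len(c.Add))
	// Extra bytes that have no counterpart in the old data.
	out = append(out, c.Copy...)
	// Seek is relative to the position reached after the add region.
	return out, pos + c.Seek, nil
}

func main() {
	old := []byte("abcdef")
	ops := []control{
		{Add: []byte{0, 0, 1}, Copy: []byte("XY"), Seek: 1},
		{Add: []byte{0, 0}},
	}
	var out []byte
	var pos int64
	for _, op := range ops {
		var err error
		if out, pos, err = apply(old, pos, out, op); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Printf("%s\n", out) // prints "abdXYef"
}
```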

func (*Control) Descriptor

func (*Control) Descriptor() ([]byte, []int)

func (*Control) ProtoMessage

func (*Control) ProtoMessage()

func (*Control) Reset

func (m *Control) Reset()

func (*Control) String

func (m *Control) String() string

type DiffContext

type DiffContext struct {
	// SuffixSortConcurrency specifies the number of workers to use for suffix sorting.
	// Exceeding the number of cores will only slow it down. A 0 value (default) uses
	// sequential suffix sorting, which uses less RAM and has less overhead (might be faster
	// in some scenarios). A negative value means (number of cores - value).
	SuffixSortConcurrency int

	// Partitions is the number of partitions into which the input data is split;
	// the partitions are sorted concurrently and scanned concurrently.
	Partitions int

	// MeasureMem enables printing memory usage statistics at various points in the
	// diffing process.
	MeasureMem bool

	// MeasureParallelOverhead prints some stats on the overhead of parallel suffix sorting
	MeasureParallelOverhead bool

	Stats *DiffStats

	I []int
	// contains filtered or unexported fields
}

DiffContext holds settings for the diff process, along with some internal storage. Reusing a DiffContext across diffs helps avoid GC thrashing, but a single context must never be used concurrently.

func (*DiffContext) Do

func (ctx *DiffContext) Do(old, new io.Reader, writeMessage WriteMessageFunc, consumer *state.Consumer) error

Do computes the difference between old and new, according to the bsdiff algorithm, and emits the resulting patch messages through writeMessage.

type DiffStats

type DiffStats struct {
	TimeSpentSorting  time.Duration
	TimeSpentScanning time.Duration
	BiggestAdd        int64
}

type IndividualPatchContext

type IndividualPatchContext struct {
	OldOffset int64
	// contains filtered or unexported fields
}

func (*IndividualPatchContext) Apply

func (ipc *IndividualPatchContext) Apply(ctrl *Control) error

type Match

type Match struct {
	// contains filtered or unexported fields
}

A Match is a pair of two regions from the old and new file that have been selected by the bsdiff algorithm for subtraction.

type PSA

type PSA struct {
	I []int
	// contains filtered or unexported fields
}

PSA is a partitioned suffix array.

func NewPSA

func NewPSA(p int, buf []byte, I []int) *PSA

type PatchContext

type PatchContext struct {
	// contains filtered or unexported fields
}

func NewPatchContext

func NewPatchContext() *PatchContext

func (*PatchContext) NewIndividualPatchContext

func (ctx *PatchContext) NewIndividualPatchContext(old io.ReadSeeker, oldOffset int64, out io.Writer) (*IndividualPatchContext, error)

func (*PatchContext) Patch

func (ctx *PatchContext) Patch(old io.ReadSeeker, out io.Writer, newSize int64, readMessage ReadMessageFunc) error

Patch applies the patch obtained through readMessage to old, according to the bspatch algorithm, and writes the result to out.

type ReadMessageFunc

type ReadMessageFunc func(msg proto.Message) error

ReadMessageFunc should read the passed protobuf and relay any errors. See the `wire` package for an example implementation.

type SuffixArrayZ

type SuffixArrayZ struct {
	// contains filtered or unexported fields
}

func NewSuffixArrayZ

func NewSuffixArrayZ(input []byte) *SuffixArrayZ

type WriteMessageFunc

type WriteMessageFunc func(msg proto.Message) (err error)

WriteMessageFunc should write a given protobuf message and relay any errors. No reference to the given message can be kept, as its content may be modified after WriteMessageFunc returns. See the `wire` package for an example implementation.
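
The no-retention rule matters for any callback that wants to hold on to messages instead of serializing them immediately. The sketch below shows the hazard and a deep-copy workaround using stand-in types: `message`, `writeMessageFunc`, and `collector` are hypothetical and only mimic the shape of the real protobuf-based API.

```go
package main

import "fmt"

// message is a local stand-in for a protobuf message with byte-slice fields.
type message struct {
	Add, Copy []byte
}

// writeMessageFunc mimics the shape of WriteMessageFunc for this sketch.
type writeMessageFunc func(msg *message) error

// collector keeps every message it receives. Because the caller may reuse
// the message after the callback returns, it stores deep copies.
type collector struct {
	kept []message
}

func (c *collector) write(msg *message) error {
	c.kept = append(c.kept, message{
		Add:  append([]byte(nil), msg.Add...),
		Copy: append([]byte(nil), msg.Copy...),
	})
	return nil
}

func main() {
	c := &collector{}
	var w writeMessageFunc = c.write

	scratch := &message{Add: []byte{1, 2, 3}}
	if err := w(scratch); err != nil {
		fmt.Println("error:", err)
		return
	}
	scratch.Add[0] = 9 // the producer reuses its buffer

	fmt.Println(c.kept[0].Add) // prints [1 2 3]
}
```

Storing `msg` itself, or its slices, would instead leave the collector observing the later overwrite.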
