hashtree

package
v1.3.5
Published: Feb 9, 2017 License: Apache-2.0 Imports: 10 Imported by: 0

README

This is a small library for working with modified Merkle Trees. We store one of these data structures in block storage (e.g. S3) for each PFS commit, so that we know, with each subsequent commit, what files changed and need to be reprocessed by any pipelines.
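
As a rough illustration of how per-commit hashes let a later commit identify changed files, the sketch below compares two snapshots. The `map[string][]byte` layout (path to node hash) and the `changedPaths` helper are assumptions made for this example; they are not types or functions from this package.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// changedPaths returns, in sorted order, every path whose hash differs between
// two snapshots, including paths present in only one of them.
// The path -> hash maps are a simplification for this sketch, not the
// library's HashTreeProto type.
func changedPaths(prev, next map[string][]byte) []string {
	var changed []string
	for p, h := range next {
		if old, ok := prev[p]; !ok || !bytes.Equal(old, h) {
			changed = append(changed, p) // new or modified
		}
	}
	for p := range prev {
		if _, ok := next[p]; !ok {
			changed = append(changed, p) // deleted in the newer snapshot
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	prev := map[string][]byte{"/a": {1}, "/b": {2}, "/c": {3}}
	next := map[string][]byte{"/a": {1}, "/b": {9}, "/d": {4}}
	fmt.Println(changedPaths(prev, next)) // prints [/b /c /d]
}
```

Only the paths reported here would need to be reprocessed by a pipeline; unchanged paths such as "/a" are skipped.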

Documentation

Overview

Package hashtree is a generated protocol buffer package.

It is generated from these files:

server/pkg/hashtree/hashtree.proto

It has these top-level messages:

FileNodeProto
DirectoryNodeProto
NodeProto
HashTreeProto

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type DirectoryNodeProto

type DirectoryNodeProto struct {
	// Children of this directory. Note that paths are relative, so if "/foo/bar"
	// has a child "baz", that means that there is a file at "/foo/bar/baz".
	//
	// 'Children' is ordered alphabetically, to quickly check if a new file is
	// overwriting an existing one.
	Children []string `protobuf:"bytes,3,rep,name=children" json:"children,omitempty"`
}

DirectoryNodeProto is a node corresponding to a directory.
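
Keeping 'Children' sorted allows a binary search to detect whether a new name collides with an existing one. The following is a minimal sketch of that idea using the standard library; `insertChild` is a hypothetical helper and not part of this package.

```go
package main

import (
	"fmt"
	"sort"
)

// insertChild adds name to an alphabetically sorted slice of child names and
// reports whether the name was already present. It is a hypothetical helper
// written for this sketch.
func insertChild(children []string, name string) ([]string, bool) {
	i := sort.SearchStrings(children, name) // binary search for the insert position
	if i < len(children) && children[i] == name {
		return children, true // name would overwrite an existing child
	}
	children = append(children, "")
	copy(children[i+1:], children[i:]) // shift the tail right by one
	children[i] = name
	return children, false
}

func main() {
	children := []string{"bar", "foo"}
	children, existed := insertChild(children, "baz")
	fmt.Println(children, existed) // prints [bar baz foo] false
	children, existed = insertChild(children, "foo")
	fmt.Println(children, existed) // prints [bar baz foo] true
}
```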

func (*DirectoryNodeProto) Descriptor

func (*DirectoryNodeProto) Descriptor() ([]byte, []int)

func (*DirectoryNodeProto) GetChildren

func (m *DirectoryNodeProto) GetChildren() []string

func (*DirectoryNodeProto) ProtoMessage

func (*DirectoryNodeProto) ProtoMessage()

func (*DirectoryNodeProto) Reset

func (m *DirectoryNodeProto) Reset()

func (*DirectoryNodeProto) String

func (m *DirectoryNodeProto) String() string

type ErrCode

type ErrCode uint8

ErrCode identifies different kinds of errors returned by methods in HashTree below. The ErrCode of any such error can be retrieved with Code().

const (
	// OK is returned on success
	OK ErrCode = iota

	// Unknown is returned by Code() when an error wasn't emitted by the HashTree
	// implementation.
	Unknown

	// Internal is returned when a HashTree encounters a bug (usually due to the
	// violation of an internal invariant).
	Internal

	// PathNotFound is returned when Get() or DeleteFile() is called with a path
	// that doesn't lead to a node.
	PathNotFound

	// MalformedGlob is returned when Glob() is called with an invalid glob
	// pattern.
	MalformedGlob

	// PathConflict is returned when a path that is expected to point to a
	// directory in fact points to a file, or the reverse. For example:
	// 1. PutFile is called with a path that points to a directory.
	// 2. PutFile is called with a path that contains a prefix that
	//    points to a file.
	// 3. Merge is forced to merge a directory into a file
	PathConflict
)

func Code

func Code(err error) ErrCode

Code returns the "error code" of 'err' if it was returned by one of the HashTree methods, or "Unknown" if 'err' was emitted by some other function. Error codes are defined in interface.go.

type FileNodeProto

type FileNodeProto struct {
	// BlockRefs are references to the file's contents in block storage.
	// Naturally, the blocks are ordered with respect to their position in the
	// file (block with initial content is first)
	BlockRefs []*pfs.BlockRef `protobuf:"bytes,3,rep,name=block_refs,json=blockRefs" json:"block_refs,omitempty"`
}

FileNodeProto is a node corresponding to a file (which is also a leaf node).

func (*FileNodeProto) Descriptor

func (*FileNodeProto) Descriptor() ([]byte, []int)

func (*FileNodeProto) GetBlockRefs

func (m *FileNodeProto) GetBlockRefs() []*pfs.BlockRef

func (*FileNodeProto) ProtoMessage

func (*FileNodeProto) ProtoMessage()

func (*FileNodeProto) Reset

func (m *FileNodeProto) Reset()

func (*FileNodeProto) String

func (m *FileNodeProto) String() string

type HashTree

type HashTree interface {
	// PutFile appends data to a file (and creates the file if it doesn't exist)
	PutFile(path string, blockRefs []*pfs.BlockRef) error

	// PutDir creates a directory (or does nothing if one exists)
	PutDir(path string) error

	// DeleteFile deletes a regular file or directory (along with its children).
	DeleteFile(path string) error

	// Get retrieves the contents of a regular file
	Get(path string) (*NodeProto, error)

	// List retrieves the list of files and subdirectories of the directory at
	// 'path'.
	List(path string) ([]*NodeProto, error)

	// Glob returns a list of files and directories that match 'pattern'
	Glob(pattern string) ([]*NodeProto, error)

	// Merge adds all of the files and directories in each tree in 'trees' into
	// this tree. The effect is equivalent to calling this.PutFile with every
	// file in every tree in 'trees', though the performance may be slightly
	// better.
	Merge(trees []HashTree) error
}

HashTree is the signature of a hash tree provided by this library

type HashTreeProto

type HashTreeProto struct {
	// Version is an arbitrary version number, set by the corresponding library
	// in hashtree.go.  This ensures that if the hash function used to create
	// these trees is changed, we won't run into errors when deserializing old
	// trees. The current version is 1.
	Version int32 `protobuf:"varint,1,opt,name=version,proto3" json:"version,omitempty"`
	// Fs maps each node's path to the NodeProto with that node's details.
	// See "Potential Optimizations" at the end for a compression scheme that
	// could be useful if this map gets too large.
	//
	// Note that the key must end in "/" if and only if the value has .dir_node set
	// (i.e. iff the path points to a directory).
	Fs map[string]*NodeProto `` /* 131-byte string literal not displayed */
}

HashTreeProto is a tree corresponding to the complete file contents of a pachyderm repo at a given commit (based on a Merkle Tree). We store one HashTree for every PFS commit.
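
The key convention for Fs (a trailing "/" if and only if the node is a directory) can be checked mechanically. This is a sketch under simplifying assumptions: `node` stands in for NodeProto, with `isDir` playing the role of "dir_node is set", and `badKeys` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// node is a simplified stand-in for NodeProto used only in this sketch.
type node struct {
	isDir bool
}

// badKeys returns, in sorted order, every key that breaks the convention
// "the key ends in '/' if and only if the node is a directory".
func badKeys(fs map[string]node) []string {
	var bad []string
	for path, n := range fs {
		if strings.HasSuffix(path, "/") != n.isDir {
			bad = append(bad, path)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	fs := map[string]node{
		"/foo/":    {isDir: true},
		"/foo/bar": {isDir: false},
		"/baz":     {isDir: true}, // directory stored without a trailing "/"
	}
	fmt.Println(badKeys(fs)) // prints [/baz]
}
```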

func (*HashTreeProto) DeleteFile

func (h *HashTreeProto) DeleteFile(path string) error

DeleteFile deletes the file at 'path', and all children recursively if 'path' is a subdirectory

func (*HashTreeProto) Descriptor

func (*HashTreeProto) Descriptor() ([]byte, []int)

func (*HashTreeProto) Get

func (h *HashTreeProto) Get(path string) (*NodeProto, error)

Get returns the node associated with the path

func (*HashTreeProto) GetFs

func (m *HashTreeProto) GetFs() map[string]*NodeProto

func (*HashTreeProto) GetVersion

func (m *HashTreeProto) GetVersion() int32

func (*HashTreeProto) Glob

func (h *HashTreeProto) Glob(pattern string) ([]*NodeProto, error)

Glob returns a list of nodes that match 'pattern'.

func (*HashTreeProto) List

func (h *HashTreeProto) List(path string) ([]*NodeProto, error)

List returns the NodeProtos corresponding to the files and directories under 'path'

func (*HashTreeProto) Merge

func (h *HashTreeProto) Merge(trees []HashTree) error

Merge merges the HashTrees in 'trees' into 'h'. The result is nil if no errors are encountered while merging any tree, or else a new error e, where:

- Code(e) is the error code of the first error encountered
- e.Error() contains the error messages of the first 10 errors encountered

func (*HashTreeProto) ProtoMessage

func (*HashTreeProto) ProtoMessage()

func (*HashTreeProto) PutDir

func (h *HashTreeProto) PutDir(path string) error

PutDir inserts an empty directory into the hierarchy

func (*HashTreeProto) PutFile

func (h *HashTreeProto) PutFile(path string, blockRefs []*pfs.BlockRef) error

PutFile inserts a file into the hierarchy

func (*HashTreeProto) Reset

func (m *HashTreeProto) Reset()

func (*HashTreeProto) String

func (m *HashTreeProto) String() string

type NodeProto

type NodeProto struct {
	// Name is the name (not path) of the file/directory (e.g. /lib).
	Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
	// Hash is a hash of the node's name and contents (which includes the
	// BlockRefs of a file and the Children of a directory). This can be used to
	// detect if the name or contents have changed between versions.
	Hash []byte `protobuf:"bytes,2,opt,name=hash,proto3" json:"hash,omitempty"`
	// subtree_size is the size of the subtree under this node; i.e. if this is a directory,
	// subtree_size includes all children.
	SubtreeSize int64 `protobuf:"varint,3,opt,name=subtree_size,json=subtreeSize,proto3" json:"subtree_size,omitempty"`
	// Exactly one of the following fields must be set. The type of this node will
	// be determined by which field is set.
	FileNode *FileNodeProto      `protobuf:"bytes,4,opt,name=file_node,json=fileNode" json:"file_node,omitempty"`
	DirNode  *DirectoryNodeProto `protobuf:"bytes,5,opt,name=dir_node,json=dirNode" json:"dir_node,omitempty"`
}

NodeProto is a node in the file tree (either a file or a directory)

func (*NodeProto) Descriptor

func (*NodeProto) Descriptor() ([]byte, []int)

func (*NodeProto) GetDirNode

func (m *NodeProto) GetDirNode() *DirectoryNodeProto

func (*NodeProto) GetFileNode

func (m *NodeProto) GetFileNode() *FileNodeProto

func (*NodeProto) GetHash

func (m *NodeProto) GetHash() []byte

func (*NodeProto) GetName

func (m *NodeProto) GetName() string

func (*NodeProto) GetSubtreeSize

func (m *NodeProto) GetSubtreeSize() int64

func (*NodeProto) ProtoMessage

func (*NodeProto) ProtoMessage()

func (*NodeProto) Reset

func (m *NodeProto) Reset()

func (*NodeProto) String

func (m *NodeProto) String() string
