hashtree

package
v1.4.0-RC1
Published: Mar 9, 2017 License: Apache-2.0 Imports: 9 Imported by: 0

README

This is a small library for working with modified Merkle trees. We store one of these data structures in block storage (e.g. S3) for each PFS commit, so that with each subsequent commit we know which files changed and need to be reprocessed by any pipelines.
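To illustrate why per-commit hashes make change detection cheap, here is a minimal sketch. It does not use this package: each commit is modeled as a plain map from path to hash, and `changedPaths` is a hypothetical helper name, not part of the library.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// changedPaths returns the paths whose hash differs between two commits,
// including paths that exist in only one of them. Both maps are hypothetical
// stand-ins for a hash tree (path -> node hash).
func changedPaths(prev, next map[string][]byte) []string {
	var changed []string
	for p, h := range next {
		// New path, or same path with a different hash.
		if old, ok := prev[p]; !ok || !bytes.Equal(old, h) {
			changed = append(changed, p)
		}
	}
	for p := range prev {
		// Path was deleted in the newer commit.
		if _, ok := next[p]; !ok {
			changed = append(changed, p)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	prev := map[string][]byte{"/a": {1}, "/b": {2}, "/c": {3}}
	next := map[string][]byte{"/a": {1}, "/b": {9}, "/d": {4}}
	fmt.Println(changedPaths(prev, next)) // [/b /c /d]
}
```

Only the paths in the result would need to be reprocessed; "/a" is skipped because its hash is unchanged.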

Documentation

Overview

Package hashtree is a generated protocol buffer package.

It is generated from these files:

server/pkg/hashtree/hashtree.proto

It has these top-level messages:

FileNodeProto
DirectoryNodeProto
NodeProto
HashTreeProto

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Serialize added in v1.3.6

func Serialize(h HashTree) ([]byte, error)

Serialize serializes a HashTree so that it can be persisted. Also see Deserialize(bytes).

Types

type DirectoryNodeProto

type DirectoryNodeProto struct {
	// Children of this directory. Note that paths are relative, so if "/foo/bar"
	// has a child "baz", that means that there is a file at "/foo/bar/baz".
	//
	// 'Children' is ordered alphabetically, to quickly check if a new file is
	// overwriting an existing one.
	Children []string `protobuf:"bytes,3,rep,name=children" json:"children,omitempty"`
}

DirectoryNodeProto is a node corresponding to a directory.
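The comment on 'Children' says the slice is kept in alphabetical order so that an overwrite can be detected quickly. One way this could work is a binary search before insertion, sketched below. `insertChild` is a hypothetical helper written for illustration; the library's own insertion code is not shown on this page.

```go
package main

import (
	"fmt"
	"sort"
)

// insertChild adds name to a sorted children slice and reports whether it was
// already present (that is, whether the new entry would overwrite an existing
// one). The slice stays sorted.
func insertChild(children []string, name string) ([]string, bool) {
	// Binary search for the first element >= name.
	i := sort.SearchStrings(children, name)
	if i < len(children) && children[i] == name {
		return children, true
	}
	// Grow by one, shift the tail right, and place name at index i.
	children = append(children, "")
	copy(children[i+1:], children[i:])
	children[i] = name
	return children, false
}

func main() {
	children := []string{"bar", "foo"}
	children, existed := insertChild(children, "baz")
	fmt.Println(children, existed) // [bar baz foo] false
	_, existed = insertChild(children, "foo")
	fmt.Println(existed) // true
}
```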

func (*DirectoryNodeProto) Descriptor

func (*DirectoryNodeProto) Descriptor() ([]byte, []int)

func (*DirectoryNodeProto) GetChildren

func (m *DirectoryNodeProto) GetChildren() []string

func (*DirectoryNodeProto) ProtoMessage

func (*DirectoryNodeProto) ProtoMessage()

func (*DirectoryNodeProto) Reset

func (m *DirectoryNodeProto) Reset()

func (*DirectoryNodeProto) String

func (m *DirectoryNodeProto) String() string

type ErrCode

type ErrCode uint8

ErrCode identifies different kinds of errors returned by methods in HashTree below. The ErrCode of any such error can be retrieved with Code().

const (
	// OK is returned on success
	OK ErrCode = iota

	// Unknown is returned by Code() when an error wasn't emitted by the HashTree
	// implementation.
	Unknown

	// Internal is returned when a HashTree encounters a bug (usually due to the
	// violation of an internal invariant).
	Internal

	// CannotDeserialize is returned when Deserialize(bytes) fails, perhaps due to
	// 'bytes' being corrupted.
	CannotDeserialize

	// Unsupported is returned when Deserialize(bytes) encounters an unsupported
	// (likely old) serialized HashTree.
	Unsupported

	// PathNotFound is returned when Get() or DeleteFile() is called with a path
	// that doesn't lead to a node.
	PathNotFound

	// MalformedGlob is returned when Glob() is called with an invalid glob
	// pattern.
	MalformedGlob

	// PathConflict is returned when a path that is expected to point to a
	// directory in fact points to a file, or the reverse. For example:
	// 1. PutFile is called with a path that points to a directory.
	// 2. PutFile is called with a path that contains a prefix that
	//    points to a file.
	// 3. Merge is forced to merge a directory into a file
	PathConflict
)

func Code

func Code(err error) ErrCode

Code returns the error code of 'err' if it was returned by one of the HashTree methods, or Unknown if 'err' was emitted by some other function. Error codes are defined in interface.go.

type FileNodeProto

type FileNodeProto struct {
	// BlockRefs are references to the file's contents in block storage.
	// Naturally, the blocks are ordered with respect to their position in the
	// file (block with initial content is first)
	BlockRefs []*pfs.BlockRef `protobuf:"bytes,3,rep,name=block_refs,json=blockRefs" json:"block_refs,omitempty"`
}

FileNodeProto is a node corresponding to a file (which is also a leaf node).

func (*FileNodeProto) Descriptor

func (*FileNodeProto) Descriptor() ([]byte, []int)

func (*FileNodeProto) GetBlockRefs

func (m *FileNodeProto) GetBlockRefs() []*pfs.BlockRef

func (*FileNodeProto) ProtoMessage

func (*FileNodeProto) ProtoMessage()

func (*FileNodeProto) Reset

func (m *FileNodeProto) Reset()

func (*FileNodeProto) String

func (m *FileNodeProto) String() string

type HashTree

type HashTree interface {
	// Open makes a deep copy of the HashTree and returns the copy
	Open() OpenHashTree

	// Get retrieves a file.
	Get(path string) (*NodeProto, error)

	// List retrieves the list of files and subdirectories of the directory at
	// 'path'.
	List(path string) ([]*NodeProto, error)

	// Glob returns a list of files and directories that match 'pattern'.
	Glob(pattern string) ([]*NodeProto, error)

	// Size gets the size of the file system that this tree represents.
	// It's essentially a helper around the SubtreeSize of the node at "/".
	Size() int64
}

HashTree is the signature of a hash tree provided by this library. To get a new HashTree, create an OpenHashTree with NewHashTree(), modify it, and then call Finish() on it.

func Deserialize added in v1.3.6

func Deserialize(serialized []byte) (HashTree, error)

Deserialize deserializes a hash tree so that it can be read or modified.

type HashTreeProto

type HashTreeProto struct {
	// Version is an arbitrary version number, set by the corresponding library
	// in hashtree.go.  This ensures that if the hash function used to create
	// these trees is changed, we won't run into errors when deserializing old
	// trees. The current version is 1.
	Version int32 `protobuf:"varint,1,opt,name=version,proto3" json:"version,omitempty"`
	// Fs maps each node's path to the NodeProto with that node's details.
	// See "Potential Optimizations" at the end for a compression scheme that
	// could be useful if this map gets too large.
	//
	// Note that the key must end in "/" if and only if the value has .dir_node set
	// (i.e. iff the path points to a directory).
	Fs map[string]*NodeProto `` /* 131-byte string literal not displayed */
}

HashTreeProto is a tree corresponding to the complete file contents of a pachyderm repo at a given commit (based on a Merkle Tree). We store one HashTree for every PFS commit.

func (*HashTreeProto) Descriptor

func (*HashTreeProto) Descriptor() ([]byte, []int)

func (*HashTreeProto) Get

func (h *HashTreeProto) Get(path string) (*NodeProto, error)

Get retrieves the contents of a file.

func (*HashTreeProto) GetFs

func (m *HashTreeProto) GetFs() map[string]*NodeProto

func (*HashTreeProto) GetVersion

func (m *HashTreeProto) GetVersion() int32

func (*HashTreeProto) Glob

func (h *HashTreeProto) Glob(pattern string) ([]*NodeProto, error)

Glob returns a list of files and directories that match 'pattern'. The nodes returned have their `Name`s set to their full paths.

func (*HashTreeProto) List

func (h *HashTreeProto) List(path string) ([]*NodeProto, error)

List retrieves the list of files and subdirectories of the directory at 'path'.

func (*HashTreeProto) Open added in v1.3.6

func (h *HashTreeProto) Open() OpenHashTree

Open makes a deep copy of the HashTree and returns the copy.

func (*HashTreeProto) ProtoMessage

func (*HashTreeProto) ProtoMessage()

func (*HashTreeProto) Reset

func (m *HashTreeProto) Reset()

func (*HashTreeProto) Size added in v1.3.19

func (h *HashTreeProto) Size() int64

Size returns the size of the file system that the hashtree represents.

func (*HashTreeProto) String

func (m *HashTreeProto) String() string

type NodeProto

type NodeProto struct {
	// Name is the name (not path) of the file/directory (e.g. /lib).
	Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
	// Hash is a hash of the node's name and contents (which includes the
	// BlockRefs of a file and the Children of a directory). This can be used to
	// detect if the name or contents have changed between versions.
	Hash []byte `protobuf:"bytes,2,opt,name=hash,proto3" json:"hash,omitempty"`
	// subtree_size is the size of the subtree under this node; i.e. if this is a
	// directory, subtree_size includes all children.
	SubtreeSize int64 `protobuf:"varint,3,opt,name=subtree_size,json=subtreeSize,proto3" json:"subtree_size,omitempty"`
	// Exactly one of the following fields must be set. The type of this node will
	// be determined by which field is set.
	FileNode *FileNodeProto      `protobuf:"bytes,4,opt,name=file_node,json=fileNode" json:"file_node,omitempty"`
	DirNode  *DirectoryNodeProto `protobuf:"bytes,5,opt,name=dir_node,json=dirNode" json:"dir_node,omitempty"`
}

NodeProto is a node in the file tree (either a file or a directory).
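The Hash field is what makes the structure a Merkle tree: a directory's hash depends on its children's hashes, so a change to any file changes every ancestor up to the root. The sketch below shows the idea with a hypothetical `node` type. SHA-256 and the exact byte layout are assumptions made for illustration; this page does not say which hash function or layout the library uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// node is a simplified stand-in for NodeProto. A nil children map marks a
// file; content stands in for the file's BlockRefs.
type node struct {
	name     string
	content  []byte
	children map[string]*node
}

// hash returns a digest of the node's name and contents. For a directory the
// contents are the children's hashes, visited in sorted name order.
func (n *node) hash() [sha256.Size]byte {
	h := sha256.New()
	h.Write([]byte(n.name))
	h.Write([]byte{0}) // separator between name and contents
	if n.children == nil {
		h.Write(n.content)
	} else {
		names := make([]string, 0, len(n.children))
		for name := range n.children {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			ch := n.children[name].hash()
			h.Write(ch[:])
		}
	}
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	leaf := &node{name: "baz", content: []byte("v1")}
	root := &node{name: "", children: map[string]*node{
		"bar": {name: "bar", children: map[string]*node{"baz": leaf}},
	}}
	before := root.hash()
	leaf.content = []byte("v2") // change one file deep in the tree
	after := root.hash()
	fmt.Println(before != after) // true
}
```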

func (*NodeProto) Descriptor

func (*NodeProto) Descriptor() ([]byte, []int)

func (*NodeProto) GetDirNode

func (m *NodeProto) GetDirNode() *DirectoryNodeProto

func (*NodeProto) GetFileNode

func (m *NodeProto) GetFileNode() *FileNodeProto

func (*NodeProto) GetHash

func (m *NodeProto) GetHash() []byte

func (*NodeProto) GetName

func (m *NodeProto) GetName() string

func (*NodeProto) GetSubtreeSize

func (m *NodeProto) GetSubtreeSize() int64

func (*NodeProto) ProtoMessage

func (*NodeProto) ProtoMessage()

func (*NodeProto) Reset

func (m *NodeProto) Reset()

func (*NodeProto) String

func (m *NodeProto) String() string

type OpenHashTree added in v1.3.6

type OpenHashTree interface {
	// GetOpen retrieves a file.
	GetOpen(path string) (*OpenNode, error)

	// PutFile appends data to a file (and creates the file if it doesn't exist).
	PutFile(path string, blockRefs []*pfs.BlockRef) error

	// PutDir creates a directory (or does nothing if one exists).
	PutDir(path string) error

	// DeleteFile deletes a regular file or directory (along with its children).
	DeleteFile(path string) error

	// Merge adds all of the files and directories in each tree in 'trees' into
	// this tree.
	Merge(trees ...HashTree) error

	// Finish makes a deep copy of the OpenHashTree, updates all of the hashes and
	// node size metadata in the copy, and returns the copy
	Finish() (HashTree, error)
}

OpenHashTree is like HashTree, except that it can be modified. Once an OpenHashTree is Finish()ed, the hash and size stored with each node will be updated (until then, the hashes and sizes stored in an OpenHashTree will be stale).

func NewHashTree added in v1.3.6

func NewHashTree() OpenHashTree

NewHashTree creates a new, empty hash tree implementing OpenHashTree.

type OpenNode added in v1.3.6

type OpenNode struct {
	Name string

	FileNode *FileNodeProto
	DirNode  *DirectoryNodeProto
}

OpenNode is similar to NodeProto, except that it doesn't include the Hash or SubtreeSize fields (which are not generally meaningful in an OpenHashTree).
