disksort

package
v0.0.17 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 16, 2015 License: Apache-2.0 Imports: 12 Imported by: 3

Documentation

Overview

Package disksort implements sorting algorithms for sets of data too large to fit fully in-memory. If the number of elements becomes to large, data are paged onto the disk.

Index

Constants

View Source
const DefaultIOBufferSize = 2 << 13

DefaultIOBufferSize is the default size of the reading/writing buffers for the temporary file shards.

View Source
const DefaultMaxInMemory = 32000

DefaultMaxInMemory is the default number of elements to keep in-memory during a merge sort.

Variables

View Source
var (
	// ErrAlreadyFinalized is returned from Interface#Add when Interface#Read has
	// already been called, freezing the sort's inputs.
	ErrAlreadyFinalized = errors.New("sorter already finalized")

	// ErrAlreadyRead is returned from Interface#Read when Interface#Read has
	// already been called.
	ErrAlreadyRead = errors.New("sorter already read")
)

Functions

This section is empty.

Types

type Interface

type Interface interface {
	// Add adds a new element to the set of data to be sorted.
	Add(i interface{}) error

	// Read calls f on each elements previously added the set of data to be
	// sorted.  If f returns an error, it is returned immediately and f is no
	// longer called.  Once Read is called, no more data may be added to the
	// sorter.  Read may only be called once.
	Read(f func(interface{}) error) error
}

Interface is the standard interface for disk sorting algorithms. Each element in the set of data to be sorted is added to the sorter with Add. Once all elements are added, Read can then be called to retrieve each element sequentially in sorted order. Once Read is called, no other operations on the sorter are allowed.

func NewMergeSorter

func NewMergeSorter(opts MergeOptions) (Interface, error)

NewMergeSorter returns a new disk sorter using a mergesort algorithm.

type Marshaler

type Marshaler interface {
	// Marshal binary encodes the given element.
	Marshal(interface{}) ([]byte, error)

	// Unmarshal decodes the given encoding of an element.
	Unmarshal([]byte) (interface{}, error)
}

Marshaler is an interface to functions that can binary encode/decode elements.

type MergeOptions

type MergeOptions struct {
	// Lesser is the comparison function for sorting the given elements.
	Lesser sortutil.Lesser
	// Marshaler is used for encoding/decoding elements in temporary file shards.
	Marshaler Marshaler

	// WorkDir is the directory used for writing temporary file shards.  If empty,
	// the default directory for temporary files is used.
	WorkDir string

	// MaxInMemory is the maximum number of elements to keep in-memory before
	// paging them to a temporary file shard.  If non-positive, DefaultMaxInMemory
	// is used.
	MaxInMemory int

	// CompressShards determines whether the temporary file shards should be
	// compressed.
	CompressShards bool

	// IOBufferSize is the size of the reading/writing buffers for the temporary
	// file shards.  If non-positive, DefaultIOBufferSize is used.
	IOBufferSize int
}

MergeOptions specifies how to sort elements.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL