record

v0.0.0-...-f1629b9
Published: Jun 5, 2023 License: BSD-3-Clause Imports: 5 Imported by: 0

README

This is a fork of github.com/golang/leveldb. I've pulled out the record package subdirectory and replaced the rest of the repository with that forked copy. It was then modified to change the wire format of records to support 2MiB chunks (instead of 32KiB chunks) and to add the ability to append to one of these record files. record_test.go was updated to reflect these changes as well.

-- Mike Wiacek 15 Jan 2020


WARNING: This is an incomplete work-in-progress.

It is not ready for production use. Some features aren't implemented yet. Documentation is missing.

The LevelDB key-value database in the Go programming language.

To download and install from source: $ go get github.com/golang/leveldb

Unless otherwise noted, the LevelDB-Go source files are distributed under the BSD-style license found in the LICENSE file.

Documentation

Overview

Package record reads and writes sequences of records. Each record is a stream of bytes that completes before the next record starts.

When reading, call Next to obtain an io.Reader for the next record. Next will return io.EOF when there are no more records. It is valid to call Next without reading the current record to exhaustion.

When writing, call Next to obtain an io.Writer for the next record. Calling Next finishes the current record. Call Close to finish the final record.

Optionally, call Flush to finish the current record and flush the underlying writer without starting a new record. To start a new record after flushing, call Next.

Neither Readers nor Writers are safe to use concurrently.

Example code:

func read(r io.Reader) ([]string, error) {
	var ss []string
	records := record.NewReader(r)
	for {
		rec, err := records.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			// Recover is a method of the record Reader, not of the underlying io.Reader.
			log.Printf("recovering from %v", err)
			records.Recover()
			continue
		}
		s, err := io.ReadAll(rec)
		if err != nil {
			// A format error in the middle of a record is handled the same way.
			log.Printf("recovering from %v", err)
			records.Recover()
			continue
		}
		ss = append(ss, string(s))
	}
	return ss, nil
}

func write(w io.Writer, ss []string) error {
	records := record.NewWriter(w)
	for _, s := range ss {
		rec, err := records.Next()
		if err != nil {
			return err
		}
		if _, err := rec.Write([]byte(s)); err != nil {
			return err
		}
	}
	return records.Close()
}

In the wire format, the stream is divided into 2MiB blocks, and each block contains a number of tightly packed chunks. Chunks cannot cross block boundaries. The last block may be shorter than 2MiB. Any unused bytes in a block must be zero.

A record maps to one or more chunks. Each chunk has an 8-byte header (a 4-byte checksum, a 3-byte little-endian uint24 length, and a 1-byte chunk type) followed by a payload. The checksum is over the chunk type and the payload.

There are four chunk types, indicating whether the chunk is the full record, or the first, middle or last chunk of a multi-chunk record. A multi-chunk record has one first chunk, zero or more middle chunks, and one last chunk.
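To make the mapping from records to chunks concrete, here is a simplified sketch that plans how a record of n payload bytes would be split, given the writer's current offset within a block. splitRecord and the chunk struct are hypothetical names used only for this illustration, and the kind strings are labels rather than the numeric type values used on the wire. The sketch assumes that when fewer than 8 bytes remain in a block, the writer skips to the next block; the actual writer may differ in such details.

```go
package main

import "fmt"

const (
	blockSize  = 2 * 1024 * 1024 // 2MiB blocks, per the wire format
	headerSize = 8               // header bytes in front of every payload
)

// chunk describes one planned chunk of a record.
type chunk struct {
	kind    string // "full", "first", "middle" or "last"
	payload int    // payload bytes carried by this chunk
}

// splitRecord plans the chunks for a record of n payload bytes when the
// writer is off bytes into the current block (0 <= off < blockSize, n >= 0).
func splitRecord(off, n int) []chunk {
	var out []chunk
	isFirst := true
	for {
		// Assumption: too little room for a header means moving to a new block.
		if blockSize-off < headerSize {
			off = 0
		}
		room := blockSize - off - headerSize
		size := n
		if size > room {
			size = room // chunks cannot cross block boundaries
		}
		n -= size
		isLast := n == 0
		kind := "middle"
		switch {
		case isFirst && isLast:
			kind = "full"
		case isFirst:
			kind = "first"
		case isLast:
			kind = "last"
		}
		out = append(out, chunk{kind, size})
		if isLast {
			return out
		}
		// The chunk filled its block, so the next one starts a fresh block.
		off, isFirst = 0, false
	}
}

func main() {
	for _, c := range splitRecord(0, 5*1024*1024) {
		fmt.Printf("%-6s %d bytes\n", c.kind, c.payload)
	}
}
```

With 2MiB blocks, a 5MiB record starting at a block boundary needs three chunks (first, middle, last), whereas any record that fits in the remainder of the current block is stored as a single full chunk.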

The wire format allows for limited recovery in the face of data corruption: on a format error (such as a checksum mismatch), the reader moves to the next block and looks for the next full or first chunk.
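The "moves to the next block" step amounts to rounding an offset up to the next 2MiB boundary. The helper below, nextBlockStart, is a hypothetical name for that arithmetic and is not taken from the package; it assumes non-negative offsets measured from the start of the stream.

```go
package main

import "fmt"

const blockSize = 2 * 1024 * 1024 // 2MiB blocks, per the wire format

// nextBlockStart returns the offset of the first block boundary strictly
// after off. Assumption: off is non-negative.
func nextBlockStart(off int64) int64 {
	return (off/blockSize + 1) * blockSize
}

func main() {
	// A format error detected 5000 bytes into the second block means the
	// search for a full or first chunk resumes at the start of the third.
	fmt.Println(nextBlockStart(blockSize + 5000))
}
```

Because recovery works at block granularity, one corrupt chunk can cost every record that starts in the remainder of its 2MiB block.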

Variables

var (
	// ErrNotAnIOSeeker is returned if the io.Reader underlying a Reader does not implement io.Seeker.
	ErrNotAnIOSeeker = errors.New("record: reader does not implement io.Seeker")

	// ErrNoLastRecord is returned if LastRecordOffset is called and there is no previous record.
	ErrNoLastRecord = errors.New("record: no last record exists")

	// ErrBlockAppearsZeroed is returned if a read finds zeroed data in a chunk header.
	ErrBlockAppearsZeroed = errors.New("record: block appears to be zeroed")
)

Types

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader reads records from an underlying io.Reader.

func NewReader

func NewReader(r io.Reader) *Reader

NewReader returns a new reader.

func (*Reader) CurrentRecordOffset

func (r *Reader) CurrentRecordOffset() int64

CurrentRecordOffset returns the offset in the underlying file where the last record returned by Next begins. This is the value you would pass to SeekRecord to start reading from that record. If SeekRecord was called, but Next has not been called since, the value returned here is -1.

func (*Reader) Next

func (r *Reader) Next() (io.Reader, error)

Next returns a reader for the next record. It returns io.EOF if there are no more records. The reader returned becomes stale after the next Next call, and should no longer be used.

func (*Reader) Recover

func (r *Reader) Recover()

Recover clears any errors read so far, so that calling Next will start reading from the next good 2MiB block. If there are no such blocks, Next will return io.EOF. Recover also marks the current reader, the one most recently returned by Next, as stale. If Recover is called without any prior error, then Recover is a no-op.

func (*Reader) SeekRecord

func (r *Reader) SeekRecord(offset int64) error

SeekRecord seeks in the underlying io.Reader such that calling r.Next returns the record whose first chunk header starts at the provided offset. Its behavior is undefined if the argument given is not such an offset, as the bytes at that offset may coincidentally appear to be a valid header.

It returns ErrNotAnIOSeeker if the underlying io.Reader does not implement io.Seeker.

SeekRecord will fail and return an error if the Reader previously encountered an error, including io.EOF. Such errors can be cleared by calling Recover. Calling SeekRecord after Recover will make calling Next return the record at the given offset, instead of the record at the next good 2MiB block as Recover normally would. Calling SeekRecord before Recover has no effect on Recover's semantics other than changing the starting point for determining the next good 2MiB block.

The offset is always relative to the start of the underlying io.Reader, so negative values will result in an error as per io.Seeker.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer writes records to an underlying io.Writer.

func NewAppendWriterAtOffset

func NewAppendWriterAtOffset(w io.ReadWriteSeeker, offset int64) (*Writer, error)

func NewWriter

func NewWriter(w io.Writer) *Writer

NewWriter returns a new Writer.

func (*Writer) Close

func (w *Writer) Close() error

Close finishes the current record and closes the writer.

func (*Writer) Flush

func (w *Writer) Flush() error

Flush finishes the current record, writes to the underlying writer, and flushes it if that writer implements interface{ Flush() error }.

func (*Writer) LastRecordOffset

func (w *Writer) LastRecordOffset() (int64, error)

LastRecordOffset returns the offset in the underlying io.Writer of the last record so far - the one created by the most recent Next call. It is the offset of the first chunk header, suitable to pass to Reader.SeekRecord.

If that io.Writer also implements io.Seeker, the return value is an absolute offset, in the sense of io.SeekStart, regardless of whether the io.Writer was initially at the zero position when passed to NewWriter. Otherwise, the return value is a relative offset: the number of bytes written since the NewWriter call by the records that precede the last record.

If there is no last record, i.e. nothing was written, LastRecordOffset will return ErrNoLastRecord.

func (*Writer) Next

func (w *Writer) Next() (io.Writer, error)

Next returns a writer for the next record. The writer returned becomes stale after the next Close, Flush or Next call, and should no longer be used.
