Documentation ¶
Overview ¶
Package record reads and writes sequences of records. Each record is a stream of bytes that completes before the next record starts.
When reading, call Next to obtain an io.Reader for the next record. Next will return io.EOF when there are no more records. It is valid to call Next without reading the current record to exhaustion.
When writing, call Next to obtain an io.Writer for the next record. Calling Next finishes the current record. Call Close to finish the final record.
Optionally, call Flush to finish the current record and flush the underlying writer without starting a new record. To start a new record after flushing, call Next.
Neither Readers nor Writers are safe to use concurrently.
Example code:
func read(r io.Reader) ([]string, error) {
	var ss []string
	records := record.NewReader(r)
	for {
		rec, err := records.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			// Recover is called on the record reader, not on the
			// underlying io.Reader, which has no such method.
			log.Printf("recovering from %v", err)
			records.Recover()
			continue
		}
		s, err := io.ReadAll(rec)
		if err != nil {
			log.Printf("recovering from %v", err)
			records.Recover()
			continue
		}
		ss = append(ss, string(s))
	}
	return ss, nil
}

func write(w io.Writer, ss []string) error {
	records := record.NewWriter(w)
	for _, s := range ss {
		// Next finishes the previous record and starts a new one.
		rec, err := records.Next()
		if err != nil {
			return err
		}
		if _, err := rec.Write([]byte(s)); err != nil {
			return err
		}
	}
	// Close finishes the final record.
	return records.Close()
}
In the wire format, the stream is divided into 32 KiB blocks, and each block contains a number of tightly packed chunks. Chunks cannot cross block boundaries. The last block may be shorter than 32 KiB. Any unused bytes in a block must be zero.
A record maps to one or more chunks. There are two chunk formats: legacy and recyclable. The legacy chunk format:
+----------+-----------+-----------+--- ... ---+
| CRC (4B) | Size (2B) | Type (1B) | Payload   |
+----------+-----------+-----------+--- ... ---+

CRC is computed over the type and payload
Size is the length of the payload in bytes
Type is the chunk type
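The header layout above can be sketched in a few lines of Go. This is a minimal illustration, not the package's implementation: the text does not state the byte order or the CRC polynomial, so little-endian fields and CRC-32C (Castagnoli) are assumptions here, and encodeLegacyChunk, decodeLegacyChunk, and the chunk type value 1 are hypothetical names and values chosen for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// legacyHeaderSize is CRC (4B) + Size (2B) + Type (1B).
const legacyHeaderSize = 7

// Assumption: CRC-32C (Castagnoli). The text does not name the polynomial.
var crcTable = crc32.MakeTable(crc32.Castagnoli)

// encodeLegacyChunk returns the bytes of one legacy chunk.
// Assumption: multi-byte header fields are little-endian.
func encodeLegacyChunk(typ byte, payload []byte) []byte {
	buf := make([]byte, legacyHeaderSize+len(payload))
	binary.LittleEndian.PutUint16(buf[4:6], uint16(len(payload)))
	buf[6] = typ
	copy(buf[legacyHeaderSize:], payload)
	// The CRC covers the type byte and the payload, which are contiguous.
	binary.LittleEndian.PutUint32(buf[0:4], crc32.Checksum(buf[6:], crcTable))
	return buf
}

// decodeLegacyChunk parses one chunk from the front of b and verifies its CRC.
func decodeLegacyChunk(b []byte) (typ byte, payload []byte, err error) {
	if len(b) < legacyHeaderSize {
		return 0, nil, fmt.Errorf("short header: %d bytes", len(b))
	}
	size := int(binary.LittleEndian.Uint16(b[4:6]))
	end := legacyHeaderSize + size
	if len(b) < end {
		return 0, nil, fmt.Errorf("short payload: want %d bytes", size)
	}
	want := binary.LittleEndian.Uint32(b[0:4])
	if got := crc32.Checksum(b[6:end], crcTable); got != want {
		return 0, nil, fmt.Errorf("checksum mismatch: got %#x, want %#x", got, want)
	}
	return b[6], b[legacyHeaderSize:end], nil
}

func main() {
	const hypotheticalType = 1 // placeholder chunk type value
	chunk := encodeLegacyChunk(hypotheticalType, []byte("hello"))
	typ, payload, err := decodeLegacyChunk(chunk)
	fmt.Println(len(chunk), typ, string(payload), err)

	chunk[len(chunk)-1] ^= 0xff // corrupt the last payload byte
	_, _, err = decodeLegacyChunk(chunk)
	fmt.Println("corruption detected:", err != nil)
}
```

Because the type byte and the payload sit next to each other, a single checksum call over the tail of the buffer covers exactly the bytes the format requires.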
There are four chunk types: whether the chunk is the full record, or the first, middle or last chunk of a multi-chunk record. A multi-chunk record has one first chunk, zero or more middle chunks, and one last chunk.
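As a rough sketch of how a record maps onto chunks, the following Go program plans the chunks for a record of a given length, using the 32 KiB block size and the 7-byte legacy header described above. The function planChunks and the chunk struct are hypothetical. The handling of a block tail too small to hold a header (skipping it as zero padding) is an assumption drawn from the rule that unused bytes in a block must be zero.

```go
package main

import "fmt"

const (
	blockSize  = 32 * 1024
	headerSize = 7 // legacy header: CRC (4B) + Size (2B) + Type (1B)
)

// chunk describes one planned chunk (hypothetical type for this sketch).
type chunk struct {
	kind   string // "full", "first", "middle" or "last"
	offset int    // stream offset of the chunk header
	size   int    // payload length in bytes
}

// planChunks returns the chunks needed to store a record of n payload
// bytes when the stream is currently at offset off.
func planChunks(off, n int) []chunk {
	var out []chunk
	first := true
	for {
		left := blockSize - off%blockSize
		if left < headerSize {
			// Assumption: a tail too small for a header is left as
			// zero padding and the chunk starts in the next block.
			off += left
			left = blockSize
		}
		size := n
		if avail := left - headerSize; size > avail {
			size = avail // chunks cannot cross block boundaries
		}
		n -= size
		last := n == 0

		kind := "middle"
		switch {
		case first && last:
			kind = "full"
		case first:
			kind = "first"
		case last:
			kind = "last"
		}
		out = append(out, chunk{kind: kind, offset: off, size: size})
		off += headerSize + size
		first = false
		if last {
			return out
		}
	}
}

func main() {
	for _, c := range planChunks(0, 70000) {
		fmt.Printf("%-6s offset=%d size=%d\n", c.kind, c.offset, c.size)
	}
}
```

A 70000-byte record starting at offset 0 needs one first, one middle, and one last chunk, while a short record that fits in the current block becomes a single full chunk.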
The recyclable chunk format is similar to the legacy format, but extends the chunk header with an additional log number field. This allows reuse (recycling) of log files, which can provide significantly better performance when syncing frequently because it avoids updating the file metadata. Additionally, recycling log files is a prerequisite for using direct IO with log writing. The recyclable format is:
+----------+-----------+-----------+----------------+--- ... ---+
| CRC (4B) | Size (2B) | Type (1B) | Log number (4B)| Payload   |
+----------+-----------+-----------+----------------+--- ... ---+
Recyclable chunks are distinguished from legacy chunks by the addition of 4 extra "recyclable" chunk types that map directly to the legacy chunk types (i.e. full, first, middle, last). The CRC is computed over the type, log number, and payload.
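The following sketch shows one way the 11-byte recyclable header could be written and checked, including rejection of a chunk left over from an earlier use of the file. It is illustrative only: little-endian fields, CRC-32C, the chunk type value 5, and the names encodeRecyclableChunk, decodeRecyclableChunk, and errStaleLogNum are all assumptions, not the package's API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// recyclableHeaderSize is CRC (4B) + Size (2B) + Type (1B) + Log number (4B).
const recyclableHeaderSize = 11

// Assumption: CRC-32C (Castagnoli); the text does not name the polynomial.
var crcTable = crc32.MakeTable(crc32.Castagnoli)

// errStaleLogNum is a hypothetical error for a chunk written under a
// different log number, for example before the file was recycled.
var errStaleLogNum = errors.New("chunk has a different log number")

// encodeRecyclableChunk returns the bytes of one recyclable chunk.
// Assumption: multi-byte header fields are little-endian.
func encodeRecyclableChunk(typ byte, logNum uint32, payload []byte) []byte {
	buf := make([]byte, recyclableHeaderSize+len(payload))
	binary.LittleEndian.PutUint16(buf[4:6], uint16(len(payload)))
	buf[6] = typ
	binary.LittleEndian.PutUint32(buf[7:11], logNum)
	copy(buf[recyclableHeaderSize:], payload)
	// The CRC covers the type, log number, and payload (bytes 6 onward).
	binary.LittleEndian.PutUint32(buf[0:4], crc32.Checksum(buf[6:], crcTable))
	return buf
}

// decodeRecyclableChunk verifies the CRC and the log number of the chunk
// at the front of b and returns its payload.
func decodeRecyclableChunk(b []byte, wantLogNum uint32) ([]byte, error) {
	if len(b) < recyclableHeaderSize {
		return nil, errors.New("short header")
	}
	end := recyclableHeaderSize + int(binary.LittleEndian.Uint16(b[4:6]))
	if len(b) < end {
		return nil, errors.New("short payload")
	}
	if crc32.Checksum(b[6:end], crcTable) != binary.LittleEndian.Uint32(b[0:4]) {
		return nil, errors.New("checksum mismatch")
	}
	if binary.LittleEndian.Uint32(b[7:11]) != wantLogNum {
		return nil, errStaleLogNum
	}
	return b[recyclableHeaderSize:end], nil
}

func main() {
	const hypotheticalType = 5 // placeholder recyclable chunk type value
	c := encodeRecyclableChunk(hypotheticalType, 7, []byte("payload"))

	p, err := decodeRecyclableChunk(c, 7)
	fmt.Println(string(p), err)

	// Reading the same bytes while expecting log number 8 models a
	// recycled file that still holds chunks from its previous use.
	_, err = decodeRecyclableChunk(c, 8)
	fmt.Println(errors.Is(err, errStaleLogNum))
}
```

Tagging each chunk with the log number is what lets a reader tell fresh data from intact but outdated chunks that remain in a reused file.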
The wire format allows for limited recovery in the face of data corruption: on a format error (such as a checksum mismatch), the reader moves to the next block and looks for the next full or first chunk.
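The recovery rule can be illustrated with a small sketch that works on chunk metadata rather than raw bytes. The names nextBlockStart, resumeIndex, and chunkInfo, and the numeric chunk type values, are hypothetical. The sketch only models the rule stated above: skip to the next block boundary, then resume at the first full or first chunk.

```go
package main

import "fmt"

const blockSize = 32 * 1024

// Hypothetical chunk type values; the text names the kinds only.
const (
	fullChunk = iota + 1
	firstChunk
	middleChunk
	lastChunk
)

// chunkInfo is the position and type of one chunk in the stream.
type chunkInfo struct {
	offset int64
	typ    int
}

// nextBlockStart returns the offset of the block following the one
// that contains off.
func nextBlockStart(off int64) int64 {
	return (off/blockSize + 1) * blockSize
}

// resumeIndex returns the index of the chunk at which reading can resume
// after a format error at errOff, or -1 if there is none. The chunks
// slice must be sorted by offset.
func resumeIndex(chunks []chunkInfo, errOff int64) int {
	start := nextBlockStart(errOff)
	for i, c := range chunks {
		if c.offset < start {
			continue // still inside the damaged block
		}
		if c.typ == fullChunk || c.typ == firstChunk {
			return i // a record begins here
		}
		// Middle and last chunks belong to a record whose start was lost.
	}
	return -1
}

func main() {
	chunks := []chunkInfo{
		{offset: 0, typ: firstChunk},
		{offset: 32768, typ: middleChunk},
		{offset: 65536, typ: lastChunk},
		{offset: 65600, typ: fullChunk},
	}
	// A checksum mismatch at offset 100 damages the record that starts at 0.
	fmt.Println(resumeIndex(chunks, 100))
}
```

In this example the whole multi-chunk record is lost, since its middle and last chunks cannot be attached to a valid first chunk, and reading resumes at the full chunk at offset 65600.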
Constants ¶
const (
	// SyncConcurrency is the maximum number of concurrent sync operations that
	// can be performed. Note that a sync operation is initiated either by a call
	// to SyncRecord or by a call to Close. Exported as this value also limits
	// the commit concurrency in commitPipeline.
	SyncConcurrency = 1 << syncConcurrencyBits
)
Variables ¶
var (
	// ErrNotAnIOSeeker is returned if the io.Reader underlying a Reader does not implement io.Seeker.
	ErrNotAnIOSeeker = errors.New("pebble/record: reader does not implement io.Seeker")

	// ErrNoLastRecord is returned if LastRecordOffset is called and there is no previous record.
	ErrNoLastRecord = errors.New("pebble/record: no last record exists")

	// ErrZeroedChunk is returned if a chunk is encountered that is zeroed. This
	// usually occurs due to log file preallocation.
	ErrZeroedChunk = base.CorruptionErrorf("pebble/record: zeroed chunk")

	// ErrInvalidChunk is returned if a chunk is encountered with an invalid
	// header, length, or checksum. This usually occurs when a log is recycled,
	// but can also occur due to corruption.
	ErrInvalidChunk = base.CorruptionErrorf("pebble/record: invalid chunk")
)
Functions ¶
func IsInvalidRecord ¶
IsInvalidRecord returns true if the error matches one of the error types returned for invalid records. These are treated in a way similar to io.EOF in recovery code.
Types ¶
type LogWriter ¶
type LogWriter struct {
// contains filtered or unexported fields
}
LogWriter writes records to an underlying io.Writer. In order to support WAL file reuse, a LogWriter's records are tagged with the WAL's file number. When reading a log file a record from a previous incarnation of the file will return the error ErrInvalidLogNum.
func NewLogWriter ¶
NewLogWriter returns a new LogWriter.
func (*LogWriter) Close ¶
Close flushes and syncs any unwritten data and closes the writer. Where required, external synchronisation is provided by commitPipeline.mu.
func (*LogWriter) Metrics ¶
func (w *LogWriter) Metrics() *LogWriterMetrics
Metrics must be called after Close. The callee will no longer modify the returned LogWriterMetrics.
func (*LogWriter) Size ¶
Size returns the current size of the file. External synchronisation provided by commitPipeline.mu.
func (*LogWriter) SyncRecord ¶
func (w *LogWriter) SyncRecord(
	p []byte, wg *sync.WaitGroup, err *error,
) (logSize int64, err2 error)
SyncRecord writes a complete record. If wg != nil, the record is persisted asynchronously to the underlying writer and wg.Done is called upon completion. Returns the offset just past the end of the record. External synchronisation is provided by commitPipeline.mu.
type LogWriterConfig ¶
type LogWriterConfig struct {
	WALMinSyncInterval durationFunc
	WALFsyncLatency    prometheus.Histogram
	// QueueSemChan is an optional channel to pop from when popping from
	// LogWriter.flusher.syncQueue. It functions as a semaphore that prevents
	// the syncQueue from overflowing (which will cause a panic). All production
	// code ensures this is non-nil.
	QueueSemChan chan struct{}
}
LogWriterConfig is a struct used for configuring new LogWriters.
type LogWriterMetrics ¶
type LogWriterMetrics struct {
	WriteThroughput  base.ThroughputMetric
	PendingBufferLen base.GaugeSampleMetric
	SyncQueueLen     base.GaugeSampleMetric
}
LogWriterMetrics contains misc metrics for the log writer.
func (*LogWriterMetrics) Merge ¶
func (m *LogWriterMetrics) Merge(x *LogWriterMetrics) error
Merge merges metrics from x. Requires that x is non-nil.
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader reads records from an underlying io.Reader.
func NewReader ¶
NewReader returns a new reader. If the file contains records encoded using the recyclable record format, then the log number in those records must match the specified logNum.
type RotationHelper ¶
type RotationHelper struct {
// contains filtered or unexported fields
}
RotationHelper is a type used to inform the decision of rotating a record log file.
The assumption is that multiple records can be coalesced into a single record (called a snapshot). Starting a new file, where the first record is a snapshot of the current state is referred to as "rotating" the log.
Normally we rotate files when a certain file size is reached. But in certain cases (e.g. contents become very large), this can result in too frequent rotation. This helper contains logic to impose extra conditions on the rotation.
The rotation helper uses "size" as a unit-less estimation that is correlated with the on-disk size of a record or snapshot.
func (*RotationHelper) AddRecord ¶
func (rh *RotationHelper) AddRecord(recordSize int64)
AddRecord makes the rotation helper aware of a new record.
func (*RotationHelper) DebugInfo ¶
func (rh *RotationHelper) DebugInfo() (lastSnapshotSize int64, sizeSinceLastSnapshot int64)
DebugInfo returns the last snapshot size and size of the edits since the last snapshot; used for testing and debugging.
func (*RotationHelper) Rotate ¶
func (rh *RotationHelper) Rotate(snapshotSize int64)
Rotate makes the rotation helper aware that we are rotating to a new snapshot (to which we will apply the latest edit).
func (*RotationHelper) ShouldRotate ¶
func (rh *RotationHelper) ShouldRotate(nextSnapshotSize int64) bool
ShouldRotate returns whether we should start a new log file (with a snapshot). Does not need to be called if other rotation factors (log file size) are not satisfied.
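The call order described for RotationHelper might look like the sketch below. To keep it self-contained it does not import the package: rotator is a local interface listing the three documented method signatures, and fakeHelper is a hypothetical stand-in whose policy (rotate once the records added since the last snapshot are at least as large as the next snapshot) is invented for illustration and is not claimed to be the package's logic. appendRecord and its parameters are likewise assumptions.

```go
package main

import "fmt"

// rotator lists the RotationHelper methods documented above.
type rotator interface {
	AddRecord(recordSize int64)
	Rotate(snapshotSize int64)
	ShouldRotate(nextSnapshotSize int64) bool
}

// fakeHelper is a hypothetical stand-in with an invented policy.
type fakeHelper struct {
	sizeSinceLastSnapshot int64
}

func (h *fakeHelper) AddRecord(recordSize int64) { h.sizeSinceLastSnapshot += recordSize }
func (h *fakeHelper) Rotate(snapshotSize int64)  { h.sizeSinceLastSnapshot = 0 }
func (h *fakeHelper) ShouldRotate(nextSnapshotSize int64) bool {
	return h.sizeSinceLastSnapshot >= nextSnapshotSize
}

// appendRecord models one append to a record log. ShouldRotate is only
// consulted once the file size condition holds. On rotation the new file
// starts with the snapshot, and the record is then applied on top of it.
// It reports whether the log was rotated.
func appendRecord(rh rotator, fileSize *int64, maxFileSize, recordSize, snapshotSize int64) bool {
	rotated := false
	if *fileSize >= maxFileSize && rh.ShouldRotate(snapshotSize) {
		rh.Rotate(snapshotSize)
		*fileSize = snapshotSize
		rotated = true
	}
	rh.AddRecord(recordSize)
	*fileSize += recordSize
	return rotated
}

func main() {
	var fileSize int64
	h := &fakeHelper{}
	for i := 0; i < 5; i++ {
		rotated := appendRecord(h, &fileSize, 100, 60, 150)
		fmt.Printf("append %d: rotated=%v fileSize=%d\n", i, rotated, fileSize)
	}
}
```

The point of the sketch is the ordering: the file size check comes first, ShouldRotate adds its extra condition, Rotate records the new snapshot, and AddRecord accounts for every record written afterwards.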
type Writer ¶
type Writer struct {
// contains filtered or unexported fields
}
Writer writes records to an underlying io.Writer.
func (*Writer) Flush ¶
Flush finishes the current record, writes to the underlying writer, and flushes it if that writer implements interface{ Flush() error }.
func (*Writer) LastRecordOffset ¶
LastRecordOffset returns the offset in the underlying io.Writer of the last record so far - the one created by the most recent Next call. It is the offset of the first chunk header, suitable to pass to Reader.SeekRecord.
If that io.Writer also implements io.Seeker, the return value is an absolute offset, in the sense of io.SeekStart, regardless of whether the io.Writer was initially at the zero position when passed to NewWriter. Otherwise, the return value is a relative offset, being the number of bytes written between the NewWriter call and any records written prior to the last record.
If there is no last record, i.e. nothing was written, LastRecordOffset will return ErrNoLastRecord.