tstorage

package module
v0.5.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 4, 2023 License: Apache-2.0 Imports: 23 Imported by: 0

README

tstorage Go Reference

tstorage is a lightweight local on-disk storage engine for time-series data with a straightforward API. Especially ingestion is massively optimized as it provides goroutine safe capabilities of write into and read from TSDB that partitions data points by time.

Motivation

I'm working on a couple of tools that handle a tremendous amount of time-series data, such as Ali and Gosivy. Especially Ali, I had been facing a problem of increasing heap consumption over time as it's a load testing tool that aims to perform real-time analysis. I little poked around a fast TSDB library that offers simple APIs but eventually nothing works as well as I'd like, that's why I settled on writing this package myself.

To see how much tstorage has helped improve Ali's performance, see the release notes here.

Usage

Currently, tstorage requires Go version 1.16 or greater

By default, tstorage.Storage works as an in-memory database. The below example illustrates how to insert a row into the memory and immediately select it.

package main

import (
	"fmt"

	"github.com/CoinSummer/tstorage"
)

func main() {
	storage, _ := tstorage.NewStorage(
		tstorage.WithTimestampPrecision(tstorage.Seconds),
	)
	defer storage.Close()

	_ = storage.InsertRows([]tstorage.Row{
		{
			Metric: "metric1",
			DataPoint: tstorage.DataPoint{Timestamp: 1600000000, Value: 0.1},
		},
	})
	points, _ := storage.Select("metric1", nil, 1600000000, 1600000001)
	for _, p := range points {
		fmt.Printf("timestamp: %v, value: %v\n", p.Timestamp, p.Value)
		// => timestamp: 1600000000, value: 0.1
	}
}
Using disk

To make time-series data persistent on disk, specify the path to directory that stores time-series data through WithDataPath option.

storage, _ := tstorage.NewStorage(
	tstorage.WithDataPath("./data"),
)
defer storage.Close()
Labeled metrics

In tstorage, you can identify a metric with combination of metric name and optional labels. Here is an example of insertion a labeled metric to the disk.

metric := "mem_alloc_bytes"
labels := []tstorage.Label{
	{Name: "host", Value: "host-1"},
}

_ = storage.InsertRows([]tstorage.Row{
	{
		Metric:    metric,
		Labels:    labels,
		DataPoint: tstorage.DataPoint{Timestamp: 1600000000, Value: 0.1},
	},
})
points, _ := storage.Select(metric, labels, 1600000000, 1600000001)

For more examples see the documentation.

Benchmarks

Benchmark tests were made using Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz with 16GB of RAM on macOS 10.15.7

$ go version
go version go1.16.2 darwin/amd64

$ go test -benchtime=4s -benchmem -bench=. .
goos: darwin
goarch: amd64
pkg: github.com/CoinSummer/tstorage
cpu: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
BenchmarkStorage_InsertRows-8                  	14135685	       305.9 ns/op	     174 B/op	       2 allocs/op
BenchmarkStorage_SelectAmongThousandPoints-8   	20548806	       222.4 ns/op	      56 B/op	       2 allocs/op
BenchmarkStorage_SelectAmongMillionPoints-8    	16185709	       292.2 ns/op	      56 B/op	       1 allocs/op
PASS
ok  	github.com/CoinSummer/tstorage	16.501s

Internal

Time-series database has specific characteristics in its workload. In terms of write operations, a time-series database has to ingest a tremendous amount of data points ordered by time. Time-series data is immutable, mostly an append-only workload with delete operations performed in batches on less recent data. In terms of read operations, in most cases, we want to retrieve multiple data points by specifying its time range, also, most recent first: query the recent data in real-time. Besides, time-series data is already indexed in time order.

Based on these characteristics, tstorage adopts a linear data model structure that partitions data points by time, totally different from the B-trees or LSM trees based storage engines. Each partition acts as a fully independent database containing all data points for its time range.

  │                 │
Read              Write
  │                 │
  │                 V
  │      ┌───────────────────┐ max: 1600010800
  ├─────>   Memory Partition
  │      └───────────────────┘ min: 1600007201
  │
  │      ┌───────────────────┐ max: 1600007200
  ├─────>   Memory Partition
  │      └───────────────────┘ min: 1600003601
  │
  │      ┌───────────────────┐ max: 1600003600
  └─────>   Disk Partition
         └───────────────────┘ min: 1600000000

Key benefits:

  • We can easily ignore all data outside of the partition time range when querying data points.
  • Most read operations work fast because recent data get cached in heap.
  • When a partition gets full, we can persist the data from our in-memory database by sequentially writing just a handful of larger files. We avoid any write-amplification and serve SSDs and HDDs equally well.
Memory partition

The memory partition is writable and stores data points in heap. The head partition is always memory partition. Its next one is also memory partition to accept out-of-order data points. It stores data points in an ordered Slice, which offers excellent cache hit ratio compared to linked lists unless it gets updated way too often (like delete, add elements at random locations).

All incoming data is written to a write-ahead log (WAL) right before inserting into a memory partition to prevent data loss.

Disk partition

The old memory partitions get compacted and persisted to the directory prefixed with p-, under the directory specified with the WithDataPath option. Here is the macro layout of disk partitions:

$ tree ./data
./data
├── p-1600000001-1600003600
│   ├── data
│   └── meta.json
├── p-1600003601-1600007200
│   ├── data
│   └── meta.json
└── p-1600007201-1600010800
    ├── data
    └── meta.json

As you can see each partition holds two files: meta.json and data. The data is compressed, read-only and is memory-mapped with mmap(2) that maps a kernel address space to a user address space. Therefore, what it has to store in heap is only partition's metadata. Just looking at meta.json gives us a good picture of what it stores:

$ cat ./data/p-1600000001-1600003600/meta.json
{
  "minTimestamp": 1600000001,
  "maxTimestamp": 1600003600,
  "numDataPoints": 7200,
  "metrics": {
    "metric-1": {
      "name": "metric-1",
      "offset": 0,
      "minTimestamp": 1600000001,
      "maxTimestamp": 1600003600,
      "numDataPoints": 3600
    },
    "metric-2": {
      "name": "metric-2",
      "offset": 36014,
      "minTimestamp": 1600000001,
      "maxTimestamp": 1600003600,
      "numDataPoints": 3600
    }
  }
}

Each metric has its own file offset of the beginning. Data point slice for each metric is compressed separately, so all we have to do when reading is to seek, and read the points off.

Out-of-order data points

What data points get out-of-order in real-world applications is not uncommon because of network latency or clock synchronization issues; tstorage basically doesn't discard them. If out-of-order data points are within the range of the head memory partition, they get temporarily buffered and merged at flush time. Sometimes we should handle data points that cross a partition boundary. That is the reason why tstorage keeps more than one partition writable.

More

Want to know more details on tstorage internal? If so see the blog post: Write a time-series database engine from scratch.

Acknowledgements

This package is implemented based on tons of existing ideas. What I especially got inspired by are:

A big "thank you!" goes out to all of them.

Documentation

Overview

Package tstorage provides goroutine safe capabilities of insertion into and retrieval from the time-series storage.

Index

Examples

Constants

This section is empty.

Variables

View Source
var (
	ErrNoDataPoints = errors.New("no data points found")
)

Functions

This section is empty.

Types

type DataPoint

type DataPoint struct {
	// The actual value. This field must be set.
	Value float64
	// Unix timestamp.
	Timestamp int64
}

DataPoint represents a data point, the smallest unit of time series data.

type Label

type Label struct {
	Name  string
	Value string
}

Label is a time-series label. A label with missing name or value is invalid.

type Logger

type Logger interface {
	Printf(format string, v ...interface{})
}

Logger is a logging interface

type Option

type Option func(*storage)

Option is an optional setting for NewStorage.

func WithDataPath

func WithDataPath(dataPath string) Option

WithDataPath specifies the path to directory that stores time-series data. Use this to make time-series data persistent on disk.

Defaults to empty string which means no data will get persisted.

func WithLogger

func WithLogger(logger Logger) Option

WithLogger specifies the logger to emit verbose output.

Defaults to a logger implementation that does nothing.

func WithPartitionDuration

func WithPartitionDuration(duration time.Duration) Option

WithPartitionDuration specifies the timestamp range of partitions. Once it exceeds the given time range, the new partition gets inserted.

A partition is a chunk of time-series data with the timestamp range. It acts as a fully independent database containing all data points for its time range.

Defaults to 1h

func WithRetention

func WithRetention(retention time.Duration) Option

WithRetention specifies when to remove old data. Data points will get automatically removed from the disk after a specified period of time after a disk partition was created. Defaults to 14d.

func WithTimestampPrecision

func WithTimestampPrecision(precision TimestampPrecision) Option

WithTimestampPrecision specifies the precision of timestamps to be used by all operations.

Defaults to Nanoseconds

func WithWALBufferedSize

func WithWALBufferedSize(size int) Option

WithWAL specifies the buffered byte size before flushing a WAL file. The larger the size, the less frequently the file is written and more write performance at the expense of durability. Giving 0 means it writes to a file whenever data point comes in. Giving -1 disables using WAL.

Defaults to 4096.

func WithWriteTimeout

func WithWriteTimeout(timeout time.Duration) Option

WithWriteTimeout specifies the timeout to wait when workers are busy.

The storage limits the number of concurrent goroutines to prevent from out of memory errors and CPU trashing even if too many goroutines attempt to write.

Defaults to 30s.

type Reader

type Reader interface {
	// Select gives back a list of data points that matches a set of the given metric and
	// labels within the given start-end range. Keep in mind that start is inclusive, end is exclusive,
	// and both must be Unix timestamp. ErrNoDataPoints will be returned if no data points found.
	Select(metric string, labels []Label, start, end int64) (points []*DataPoint, err error)
	LastN(metric string, labels []Label, n int64) (points []*DataPoint, err error)
}

Reader provides reading access to time series data.

type Row

type Row struct {
	// The unique name of metric.
	// This field must be set.
	Metric string
	// An optional key-value properties to further detailed identification.
	Labels []Label
	// This field must be set.
	DataPoint
}

Row includes a data point along with properties to identify a kind of metrics.

type Storage

type Storage interface {
	Reader
	// InsertRows ingests the given rows to the time-series storage.
	// If the timestamp is empty, it uses the machine's local timestamp in UTC.
	// The precision of timestamps is nanoseconds by default. It can be changed using WithTimestampPrecision.
	InsertRows(rows []Row) error
	// Close gracefully shutdowns by flushing any unwritten data to the underlying disk partition.
	Close() error
}

Storage provides goroutine safe capabilities of insertion into and retrieval from the time-series storage.

func NewStorage

func NewStorage(opts ...Option) (Storage, error)

NewStorage gives back a new storage, which stores time-series data in the process memory by default.

Give the WithDataPath option for running as a on-disk storage. Specify a directory with data already exists, then it will be read as the initial data.

Example (WithDataPath)
package main

import (
	"github.com/CoinSummer/tstorage"
)

func main() {
	// It will make time-series data persistent under "./data".
	storage, err := tstorage.NewStorage(
		tstorage.WithDataPath("./data"),
	)
	if err != nil {
		panic(err)
	}
	storage.Close()
}
Output:

Example (WithPartitionDuration)
package main

import (
	"time"

	"github.com/CoinSummer/tstorage"
)

func main() {
	storage, err := tstorage.NewStorage(
		tstorage.WithPartitionDuration(30*time.Minute),
		tstorage.WithTimestampPrecision(tstorage.Seconds),
	)
	if err != nil {
		panic(err)
	}
	defer storage.Close()
}
Output:

type TimestampPrecision

type TimestampPrecision string

TimestampPrecision represents precision of timestamps. See WithTimestampPrecision

const (
	Nanoseconds  TimestampPrecision = "ns"
	Microseconds TimestampPrecision = "us"
	Milliseconds TimestampPrecision = "ms"
	Seconds      TimestampPrecision = "s"
)

Directories

Path Synopsis
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL