Documentation ¶
Index ¶
- Constants
- Variables
- type JDB
- func (j *JDB) Close() (err error)
- func (j *JDB) Insert(m *Measurement) (err error)
- func (j *JDB) QueryAll(name string, opts *Options) (m []*Measurement, err error)
- func (j *JDB) QueryAllCSV(name string, opts *Options) (b []byte, err error)
- func (j *JDB) QueryAllIndex(name, index, indexValue string, opts *Options) (m []*Measurement, err error)
- func (j *JDB) QueryFields(measurement string) (fields []string, err error)
- type Measurement
- type Options
Examples ¶
Constants ¶
const (
	// DefaultIndexName is used for Measurements where an Index
	// hasn't been specified so we can still de-dupe it.
	DefaultIndexName = "_default_index"
)
Variables ¶
var (
	// Logger can be used to log database internal operations for various
	// info statements, or left as the default, which won't log anything
	Logger = slog.New(slog.NewTextHandler(io.Discard, nil))

	// If the save buffer hits `FlushMaxSize` length then
	// flush to disk
	FlushMaxSize = 1_000

	// If the save buffer hasn't been flushed for `FlushMaxDuration` or
	// longer then flush to disk
	FlushMaxDuration = time.Hour

	// ErrNoSuchMeasurement returns when trying to retrieve a Measurement
	// that hasn't been indexed by this JDB instance
	ErrNoSuchMeasurement = errors.New("unknown measurement name")

	// ErrNoSuchIndex returns for calls to QueryAllIndex where the index in
	// question does not exist for the specified Measurement
	ErrNoSuchIndex = errors.New("unknown index")

	// ErrDuplicateMeasurement returns when trying to Insert a Measurement, where
	// there is already a Measurement with the same derived ID
	//
	// These IDs are derived in such a way that they have a Nanosecond precision
	// against a particular measurement + index name + index value and so receiving
	// this error is a problem, and may point toward reusing, or not correctly
	// setting, the value of Measurement.When
	ErrDuplicateMeasurement = errors.New("measurement and index combination exist for this timestamp")
)
Functions ¶
This section is empty.
Types ¶
type JDB ¶
type JDB struct {
// contains filtered or unexported fields
}
JDB is an embeddable Schemaless Timeseries Database, queried in-memory, and with on-disk persistence.
It is deliberately naive and is designed to be 'good-enough'. It won't solve all of your woes, it won't handle petabytes of scale, and it won't make your applications more enterprisey.
It will, however, give you a reasonably quick way of storing timeseries, querying against an index or time range, and provide de-duplication guarantees.
func New ¶
New returns a JDB from a database file on disk, creating the database file if it doesn't already exist.
New returns errors in the following contexts:
- Where the OS can't open a database file for writing
- The file it has opened isn't valid for JDB
This function outputs optional logs, which can be enabled by setting `jdb.Logger` to a valid `*slog.Logger`.
Example (Create_close_reopen_database) ¶
package main

import (
	"fmt"
	"os"

	"github.com/jspc/jdb"
)

func main() {
	f, err := os.CreateTemp("", "")
	if err != nil {
		panic(err)
	}
	f.Close()

	// Effectively disable flushing to disk for the sake of
	// timeliness in this test
	jdb.FlushMaxSize = 1_000_000
	jdb.FlushMaxDuration = 1<<63 - 1

	database, err := jdb.New(f.Name())
	if err != nil {
		panic(err)
	}

	err = database.Insert(&jdb.Measurement{Name: "counters", Dimensions: map[string]float64{"Counter": 1234}})
	if err != nil {
		panic(err)
	}

	// Query database
	m, err := database.QueryAll("counters", nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("counters: %d\n", len(m))

	// Close database
	database.Close()

	// Reopen, reconcile for same data
	database, err = jdb.New(f.Name())
	if err != nil {
		panic(err)
	}

	m, err = database.QueryAll("counters", nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("counters: %d\n", len(m))
}
Output:
counters: 1
counters: 1
Example (Create_database_and_query_index) ¶
package main

import (
	"fmt"
	"os"
	"time"

	"github.com/jspc/jdb"
)

func main() {
	f, err := os.CreateTemp("", "")
	if err != nil {
		panic(err)
	}
	f.Close()

	// Effectively disable flushing to disk for the sake of
	// timeliness in this test
	jdb.FlushMaxSize = 1_000_000
	jdb.FlushMaxDuration = 1<<63 - 1

	database, err := jdb.New(f.Name())
	if err != nil {
		panic(err)
	}
	defer database.Close()

	t := time.Time{}
	for i := 0; i < 1000; i++ {
		t = t.Add(time.Minute)

		m := &jdb.Measurement{
			When: t,
			Name: "environmental_monitoring",
			Dimensions: map[string]float64{
				"Temperature": 19.23,
				"Humidity":    52.43234,
				"AQI":         1,
			},
			Labels: map[string]string{
				"sensor_version": "v1.0.1",
				"uptime":         "1h31m6s",
			},
			Indices: map[string]string{
				"location": "living room",
			},
		}

		err = m.Validate()
		if err != nil {
			panic(err)
		}

		err = database.Insert(m)
		if err != nil {
			panic(err)
		}
	}

	// Query an empty index
	measurements, err := database.QueryAllIndex("environmental_monitoring", "location", "bedroom", nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("measurements where location == bedroom: %d\n", len(measurements))

	// Query an index with items
	measurements, err = database.QueryAllIndex("environmental_monitoring", "location", "living room", new(jdb.Options))
	if err != nil {
		panic(err)
	}
	fmt.Printf("measurements where location == 'living room': %d\n", len(measurements))
}
Output:
measurements where location == bedroom: 0
measurements where location == 'living room': 1000
func (*JDB) Insert ¶
func (j *JDB) Insert(m *Measurement) (err error)
Insert a Measurement into the database.
Insert does this by performing a handful of tasks:
- Calling m.Validate() to ensure the data is correct
- Checking whether we've already received this Measurement, erroring if so
- Adding the Measurement to the underlying data structure(s)
- Updating Measurement metadata (field names, indices, etc.)
- Persisting to disk if the write buffer is full, or it's been some time since the last write
Because we're using slices and maps under the hood without intermediate buffers, this call relies on mutexes that may be slow at times.
The upshot of this is that calls to Insert are immediately consistent.
func (*JDB) QueryAll ¶
func (j *JDB) QueryAll(name string, opts *Options) (m []*Measurement, err error)
QueryAll queries for a Measurement name, returning all Measurements that fit.
When opts is not nil, the specified time slicing options are used to return a subset of Measurements.
For the purposes of time slicing, setting opts to nil behaves identically to passing an empty value, such as `&jdb.Options{}` or `new(jdb.Options)`, though passing nil saves a few cycles and is therefore marginally more efficient.
func (*JDB) QueryAllCSV ¶
func (j *JDB) QueryAllCSV(name string, opts *Options) (b []byte, err error)
QueryAllCSV works identically to `QueryAll` (in fact it calls `QueryAll` under the hood), but returns Measurements as a []byte representation of the generated CSV.
It can be quite expensive for large datasets.
This function can be used to load data into other sources, such as jupyter, or a spreadsheet.
When opts is not nil, the specified time slicing options are used to return a subset of Measurements.
For the purposes of time slicing, setting opts to nil behaves identically to passing an empty value, such as `&jdb.Options{}` or `new(jdb.Options)`, though passing nil saves a few cycles and is therefore marginally more efficient.
func (*JDB) QueryAllIndex ¶
func (j *JDB) QueryAllIndex(name, index, indexValue string, opts *Options) (m []*Measurement, err error)
QueryAllIndex queries for a Measurement name, returning all Measurements with a specific Index value.
When opts is not nil, the specified time slicing options are used to return a subset of Measurements.
For the purposes of time slicing, setting opts to nil behaves identically to passing an empty value, such as `&jdb.Options{}` or `new(jdb.Options)`, though passing nil saves a few cycles and is therefore marginally more efficient.
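One way to picture an index lookup is as a set of nested maps. The sketch below is an assumption about layout, not JDB's actual data structure: the `indexes` type and its methods are hypothetical, and the two error values merely mirror the messages of `ErrNoSuchMeasurement` and `ErrNoSuchIndex`. It does match the behaviour shown in the package example, where querying a known index for a value that was never inserted returns zero results rather than an error.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for jdb.ErrNoSuchMeasurement and jdb.ErrNoSuchIndex.
var (
	errNoSuchMeasurement = errors.New("unknown measurement name")
	errNoSuchIndex       = errors.New("unknown index")
)

// indexes maps measurement name -> index name -> index value -> record IDs.
// This layout is an assumption made for illustration.
type indexes map[string]map[string]map[string][]int

// add registers a record ID under a measurement name, index, and value.
func (ix indexes) add(name, index, value string, id int) {
	if ix[name] == nil {
		ix[name] = make(map[string]map[string][]int)
	}
	if ix[name][index] == nil {
		ix[name][index] = make(map[string][]int)
	}
	ix[name][index][value] = append(ix[name][index][value], id)
}

// query returns the record IDs for an index value. An unknown measurement
// or index name is an error; an unknown value is just an empty result.
func (ix indexes) query(name, index, value string) ([]int, error) {
	byIndex, ok := ix[name]
	if !ok {
		return nil, errNoSuchMeasurement
	}
	byValue, ok := byIndex[index]
	if !ok {
		return nil, errNoSuchIndex
	}
	return byValue[value], nil
}

func main() {
	ix := indexes{}
	ix.add("environmental_monitoring", "location", "living room", 1)
	ix.add("environmental_monitoring", "location", "living room", 2)

	ids, err := ix.query("environmental_monitoring", "location", "living room")
	fmt.Println(len(ids), err)

	ids, err = ix.query("environmental_monitoring", "location", "bedroom")
	fmt.Println(len(ids), err)

	_, err = ix.query("environmental_monitoring", "sensor", "abc")
	fmt.Println(err)
}
```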
type Measurement ¶
type Measurement struct { When time.Time `json:"when"` Name string `json:"name"` Dimensions map[string]float64 `json:"dimensions"` Labels map[string]string `json:"labels"` Indices map[string]string `json:"indices"` }
A Measurement represents a collection of values and metadata to store against a timestamp.
It contains a timestamp, measurement name, some dimensions, some indices, and some labels.
In our world, a Measurement Name might be analogous to a database name. A Measurement has one or more numerical Dimensions, some labels and some indices.
The only difference between a label and an index is that an index is searchable and a label isn't. Because of this, an index takes up more memory and so isn't always appropriate. If you're never going to need to search for a given string then it's best to use a label for the sake of resources and speed.
Internally, Measurements are deduplicated by deriving a Measurement ID of the format:
id := name + \0x00 + indexName + \0x00 + indexValue + \0x00 + measurement_timestamp_in_nanoseconds + \0x00
and then base64 encoded.
This does mean there's the potential for collisions, should multiple Measurements have the same name, index, and timestamp (to the nanosecond); it's _unlikely_ to happen, but it's possible. With this in mind, indexing on a sensor ID, or something else unique to the creator of a Measurement, is always smart.
func (*Measurement) Validate ¶
func (m *Measurement) Validate() error
Validate returns an error if:
- The Measurement name is empty
- The Measurement has no Dimensions
If the Measurement has no indices, we create one called `_default_index` with the same value as the Measurement name. This exists purely to make deduplication easier and can be ignored by pretty much everything.
Without these three elements (a name, at least one Dimension, and at least one index), a Measurement is functionally meaningless.
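The validation rules above amount to something like the following. The `measurement` type, the `validate` method, and the error messages are hypothetical; only the rules themselves (non-empty name, at least one Dimension, a `_default_index` fallback whose value is the Measurement name) come from the documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultIndexName mirrors jdb.DefaultIndexName.
const defaultIndexName = "_default_index"

// measurement is a trimmed-down, hypothetical stand-in for jdb.Measurement.
type measurement struct {
	Name       string
	Dimensions map[string]float64
	Indices    map[string]string
}

// validate applies the documented rules. The error messages are invented
// for this sketch.
func (m *measurement) validate() error {
	if m.Name == "" {
		return errors.New("measurement name is empty")
	}
	if len(m.Dimensions) == 0 {
		return errors.New("measurement has no dimensions")
	}
	if len(m.Indices) == 0 {
		// Fall back to a default index so deduplication still works.
		m.Indices = map[string]string{defaultIndexName: m.Name}
	}
	return nil
}

func main() {
	m := &measurement{
		Name:       "counters",
		Dimensions: map[string]float64{"Counter": 1234},
	}
	fmt.Println(m.validate(), m.Indices)

	fmt.Println((&measurement{Name: "counters"}).validate())
}
```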
type Options ¶ added in v0.2.0
type Options struct {
	// From defines the earliest timestamp to return Measurements
	// for. It is inclusive, which is to say that if the time is set
	// to `14:45:00 30th April 2024` and there is a record with that
	// precise timestamp, then that record will be included.
	//
	// This field is ignored if `Since` is set. If this field is unset
	// and To is set then From implies "All data from the start of time"
	From time.Time `json:"from" form:"from"`

	// To defines the latest timestamp to return Measurements for.
	// Similarly to From, if this field is empty and From is set, then
	// the implication is "All records from `From` to the end".
	//
	// If both this field and Since are set, then JDB returns the last
	// `Since` duration _to_ To
	To time.Time `json:"to" form:"to"`

	// Since returns Measurements created within the Duration covered by
	// this field. If `To` is unset, then Since returns up until the
	// current time
	Since time.Duration `json:"since" form:"since"`
}
Options can be passed to the Query* methods on JDB and allow for slicing Measurements based on timestamps, according to the rules documented on each field.