badger

package
v1.63.0
Published: Nov 10, 2024 License: Apache-2.0 Imports: 24 Imported by: 10

README

Badger data storage

Data modeling

The key design in the badger storage backend takes advantage of badger's sorted key space and its key-only iteration, which does not require loading values from the value log, only the keys in the LSM tree. This is used to implement an efficient inverted index for both exact lookups and range searches. All numeric values encoded into keys must be stored in big-endian order so that badger's byte-wise key comparison matches numeric ordering.
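
For example, with big-endian encoding the byte-wise comparison that badger performs on keys agrees with numeric ordering, which little-endian encoding does not. A minimal illustration (not taken from the Jaeger sources; assumes the standard bytes, encoding/binary, and fmt packages):

a := make([]byte, 8)
b := make([]byte, 8)

// 255 < 256 numerically.
binary.BigEndian.PutUint64(a, 255)
binary.BigEndian.PutUint64(b, 256)
fmt.Println(bytes.Compare(a, b)) // -1: byte order agrees with numeric order

binary.LittleEndian.PutUint64(a, 255)
binary.LittleEndian.PutUint64(b, 256)
fmt.Println(bytes.Compare(a, b)) // 1: byte order disagrees with numeric order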

The index key structure is created in createIndexKey in spanstore/writer.go, and the primary key for spans in createTraceKV.

Primary key design

Primary keys are the only keys that have a value in badger's storage. Each key represents a single span, so a single trace is a collection of key-value tuples. The value is the actual span marshalled into bytes; the marshalling format is indicated by the last 4 bits of the meta encoding byte in the badger entry.

Primary keys are sorted as follows:

  • TraceID High
  • TraceID Low
  • Timestamp
  • SpanID

This allows a quick lookup of a single TraceID by seeking to the prefix 0x80 + TraceID High + TraceID Low and then iterating as long as that prefix is valid. Note that the timestamp ordering does not allow fetching a range of traces in a time range, because keys are sorted by TraceID before the timestamp.
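
The key layout can be sketched as follows (the helper name and exact offsets are illustrative assumptions, not the exact Jaeger code; encoding/binary is assumed):

// buildPrimaryKey illustrates the primary key layout:
// <0x80><TraceID High><TraceID Low><Timestamp><SpanID>, all fields big-endian.
func buildPrimaryKey(traceIDHigh, traceIDLow, startTime, spanID uint64) []byte {
	key := make([]byte, 1+8+8+8+8)
	key[0] = 0x80 // marker byte for the primary (span) key space
	binary.BigEndian.PutUint64(key[1:9], traceIDHigh)
	binary.BigEndian.PutUint64(key[9:17], traceIDLow)
	binary.BigEndian.PutUint64(key[17:25], startTime)
	binary.BigEndian.PutUint64(key[25:33], spanID)
	return key
}

A prefix search for a single trace then uses only the first 17 bytes, the marker byte plus the two TraceID words.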

Index key design

Each index key starts with a single byte that indicates which field is indexed. The last 4 bits of that first byte identify the index, and the first 4 bits are zeroed. This sorts the LSM tree by index field, which allows quicker range queries. After that first byte, each inverted index key is sorted in the following order:

  • Value
  • Timestamp
  • TraceID High
  • TraceID Low

This means that a scan for a single value can continue until it reaches the first timestamp outside the query boundaries and then stop, since all subsequent keys are guaranteed not to be valid.
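
An index key can be sketched as follows (the helper name is an illustrative assumption; the real writer builds these keys in createIndexKey):

// buildIndexKey illustrates the inverted index key layout:
// <indexKeyByte><value><timestamp><TraceID High><TraceID Low>.
func buildIndexKey(indexKey byte, value []byte, timestamp, traceIDHigh, traceIDLow uint64) []byte {
	key := make([]byte, 0, 1+len(value)+8+8+8)
	key = append(key, indexKey) // last 4 bits select the index, first 4 bits zeroed
	key = append(key, value...)
	key = binary.BigEndian.AppendUint64(key, timestamp)
	key = binary.BigEndian.AppendUint64(key, traceIDHigh)
	key = binary.BigEndian.AppendUint64(key, traceIDLow)
	return key
}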

Index searches

If the lookup is for a single TraceID, the logic described in the Primary key design section is used. If instead we have TraceQueryParameters with one or more search keys, the results of multiple index seeks must be combined into an intersection. Each search parameter (each tag is a separate search parameter) is used to scan a single index key, so we iterate the index until <indexKey><value><timestamp> is no longer valid. We do this by checking the <indexKey><value> prefix for exactness and then <timestamp> for range. As long as the prefix is valid, we fetch the keys; once the timestamp goes beyond the maximum timestamp, the iteration stops. The keys are then sorted into TraceID order, instead of their natural key ordering, for the next step.
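
A key-only scan of one index can be sketched with the badger iterator API as follows (the function name, parameter layout, and time-range check are assumptions; imports of encoding/binary and the badger package are assumed):

// scanIndex sketches a key-only index scan: it seeks to <indexKey><value><startTime>
// and keeps iterating while the <indexKey><value> prefix matches and the embedded
// timestamp stays at or below endTime.
func scanIndex(txn *badger.Txn, prefix []byte, startTime, endTime uint64) [][]byte {
	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = false // key-only iteration, values stay in the value log
	it := txn.NewIterator(opts)
	defer it.Close()

	// Start from the first key whose timestamp can be inside the range.
	seekKey := binary.BigEndian.AppendUint64(append([]byte{}, prefix...), startTime)

	var keys [][]byte
	for it.Seek(seekKey); it.ValidForPrefix(prefix); it.Next() {
		key := it.Item().KeyCopy(nil)
		ts := binary.BigEndian.Uint64(key[len(prefix) : len(prefix)+8])
		if ts > endTime {
			break // later keys can only have larger timestamps, so the scan can stop
		}
		keys = append(keys, key)
	}
	return keys
}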

An exception to the above is the duration index, since queries specify a range of durations rather than an exact value. When scanning it, the prefix search seeks to the starting point <indexKey><minDurationValue> and scans the index until <indexKey><maxDurationValue> is reached. Each key is then checked separately for a valid <timestamp>; the timestamp does not control the seek process, and some keys are ignored because they do not match the given time range.
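
The difference from the other indexes can be sketched like this (again a hypothetical helper, not the exact Jaeger code):

// scanDurationIndex sketches the duration index scan: the iteration runs from
// <indexKey><minDuration> to <indexKey><maxDuration>, and the timestamp is only
// used to filter individual keys, not to drive the seek.
func scanDurationIndex(txn *badger.Txn, indexKey byte, minDuration, maxDuration, startTime, endTime uint64) [][]byte {
	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = false
	it := txn.NewIterator(opts)
	defer it.Close()

	prefix := []byte{indexKey}
	seekKey := binary.BigEndian.AppendUint64(prefix, minDuration)
	stopKey := binary.BigEndian.AppendUint64(prefix, maxDuration)

	var keys [][]byte
	for it.Seek(seekKey); it.ValidForPrefix(prefix); it.Next() {
		key := it.Item().KeyCopy(nil)
		if bytes.Compare(key[:len(stopKey)], stopKey) > 0 {
			break // past the maximum duration value
		}
		ts := binary.BigEndian.Uint64(key[len(stopKey) : len(stopKey)+8])
		if ts >= startTime && ts <= endTime {
			keys = append(keys, key) // keys outside the time range are skipped, not a stop condition
		}
	}
	return keys
}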

Because each trace is stored as multiple spans, the same TraceID can appear multiple times in an index query. For every index other than the duration index, the IDs arrive in order, so each duplicate is discarded simply by checking whether the previous ID equals the current one. With the duration index the spans can arrive in arbitrary order, so a hash join is used to filter the duplicates.
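
The in-order case can be sketched as a single pass over adjacent entries (a hypothetical helper, assuming the standard bytes package):

// dedupeSorted drops duplicate TraceIDs from a result set that is already
// sorted by TraceID, where duplicates are always adjacent.
func dedupeSorted(ids [][]byte) [][]byte {
	var out [][]byte
	for _, id := range ids {
		if len(out) == 0 || !bytes.Equal(out[len(out)-1], id) {
			out = append(out, id)
		}
	}
	return out
}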

After all the index keys have been scanned, the results are passed to a merge join, where two index query results are compared and only the matching IDs are kept. The next result set is then compared against the result of the previous join, and so forth, until all the index fetches have been processed. The resulting set is the list of TraceIDs that matched all the requirements.
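
Since both inputs are sorted by TraceID at this point, the merge join itself can be sketched as a single linear pass (hypothetical helper):

// mergeJoinIDs intersects two TraceID lists that are both sorted in ascending
// order, keeping only the IDs present in both.
func mergeJoinIDs(left, right [][]byte) [][]byte {
	var out [][]byte
	i, j := 0, 0
	for i < len(left) && j < len(right) {
		switch bytes.Compare(left[i], right[j]) {
		case 0:
			out = append(out, left[i])
			i++
			j++
		case -1:
			i++
		default:
			j++
		}
	}
	return out
}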

Documentation


Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config added in v1.61.0

type Config struct {
	// TTL holds time-to-live configuration for the badger store.
	TTL TTL `mapstructure:"ttl"`
	// Directories contains the configuration for where items are stored. Ephemeral must be
	// set to false for this configuration to take effect.
	Directories Directories `mapstructure:"directories"`
	// Ephemeral, if set to true, will store data in a temporary file system.
	// If set to true, the configuration in Directories is ignored.
	Ephemeral bool `mapstructure:"ephemeral"`
	// SyncWrites, if set to true, will immediately sync all writes to disk. Note that
	// setting this field to true will affect write performance.
	SyncWrites bool `mapstructure:"consistency"`
	// MaintenanceInterval is the regular interval after which a maintenance job is
	// run on the values in the store.
	MaintenanceInterval time.Duration `mapstructure:"maintenance_interval"`
	// MetricsUpdateInterval is the regular interval after which metrics are collected
	// by Jaeger.
	MetricsUpdateInterval time.Duration `mapstructure:"metrics_update_interval"`
	// ReadOnly opens the data store in read-only mode. Multiple instances can open the same
	// store in read-only mode. Values still in the write-ahead-log must be replayed before opening.
	ReadOnly bool `mapstructure:"read_only"`
}

Config is badger's internal configuration data.

func DefaultConfig added in v1.61.0

func DefaultConfig() *Config

func (*Config) AddFlags added in v1.61.0

func (c *Config) AddFlags(flagSet *flag.FlagSet)

AddFlags adds flags for Config.

func (*Config) InitFromViper added in v1.61.0

func (c *Config) InitFromViper(v *viper.Viper, logger *zap.Logger)

InitFromViper initializes Config with properties from viper.

func (*Config) Validate added in v1.61.0

func (c *Config) Validate() error

type Directories added in v1.61.0

type Directories struct {
	// Keys contains the directory in which the keys are stored.
	Keys string `mapstructure:"keys"`
	// Values contains the directory in which the values are stored.
	Values string `mapstructure:"values"`
}

type Factory

type Factory struct {
	Config *Config
	// contains filtered or unexported fields
}

Factory implements storage.Factory for Badger backend.

func NewFactory

func NewFactory() *Factory

NewFactory creates a new Factory.

func NewFactoryWithConfig added in v1.54.0

func NewFactoryWithConfig(
	cfg Config,
	metricsFactory metrics.Factory,
	logger *zap.Logger,
) (*Factory, error)
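
A sketch of constructing a factory from configuration (the directory paths and TTL are arbitrary illustrative values; metrics.NullFactory and zap.NewNop are assumed to come from Jaeger's metrics package and go.uber.org/zap):

cfg := badger.DefaultConfig()
cfg.Ephemeral = false
cfg.Directories.Keys = "/var/lib/jaeger/badger/keys"     // illustrative path
cfg.Directories.Values = "/var/lib/jaeger/badger/values" // illustrative path
cfg.TTL.Spans = 72 * time.Hour

f, err := badger.NewFactoryWithConfig(*cfg, metrics.NullFactory, zap.NewNop())
if err != nil {
	// handle the error
}
defer f.Close()

writer, err := f.CreateSpanWriter() // spanstore.Writer backed by badger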

func (*Factory) AddFlags

func (f *Factory) AddFlags(flagSet *flag.FlagSet)

AddFlags implements plugin.Configurable

func (*Factory) Close

func (f *Factory) Close() error

Close implements io.Closer and closes the underlying storage

func (*Factory) CreateDependencyReader

func (f *Factory) CreateDependencyReader() (dependencystore.Reader, error)

CreateDependencyReader implements storage.Factory

func (*Factory) CreateLock added in v1.52.0

func (*Factory) CreateLock() (distributedlock.Lock, error)

CreateLock implements storage.SamplingStoreFactory

func (*Factory) CreateSamplingStore added in v1.51.0

func (f *Factory) CreateSamplingStore(int) (samplingstore.Store, error)

CreateSamplingStore implements storage.SamplingStoreFactory

func (*Factory) CreateSpanReader

func (f *Factory) CreateSpanReader() (spanstore.Reader, error)

CreateSpanReader implements storage.Factory

func (*Factory) CreateSpanWriter

func (f *Factory) CreateSpanWriter() (spanstore.Writer, error)

CreateSpanWriter implements storage.Factory

func (*Factory) InitFromViper

func (f *Factory) InitFromViper(v *viper.Viper, logger *zap.Logger)

InitFromViper implements plugin.Configurable

func (*Factory) Initialize

func (f *Factory) Initialize(metricsFactory metrics.Factory, logger *zap.Logger) error

Initialize implements storage.Factory

func (*Factory) Purge added in v1.57.0

func (f *Factory) Purge(_ context.Context) error

Purge removes all data from the Factory's underlying Badger store. This function is intended for testing purposes only and should not be used in production environments. Calling Purge in production will result in permanent data loss.

type TTL added in v1.61.0

type TTL struct {
	// Spans holds the amount of time that the span store data is stored.
	// Once this duration has passed for a given key, span store data will
	// no longer be accessible.
	Spans time.Duration `mapstructure:"spans"`
}

