common

package
v0.0.1
Published: Nov 27, 2018 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Constants

const (
	// OldHLLDataHeader is the old magic header for migration
	OldHLLDataHeader uint32 = 0xACED0101
	// HLLDataHeader is the magic header written into serialized format of hyperloglog query result.
	HLLDataHeader uint32 = 0xACED0102
	// EnumDelimiter is the delimiter to delimit enum cases.
	EnumDelimiter = "\u0000\n"
	// DenseDataLength is the length of hll dense data in bytes.
	DenseDataLength = 1 << 14 // 16kb
	// DenseThreshold is the threshold to convert sparse value to dense value.
	DenseThreshold = DenseDataLength / 4
)
const (

	// SecondsPerMinute is number of seconds per minute
	SecondsPerMinute = 60
	// SecondsPerHour is number of seconds per hour
	SecondsPerHour = SecondsPerMinute * 60
	// SecondsPerDay is number of seconds per day
	SecondsPerDay = SecondsPerHour * 24
	// SecondsPer4Day is number of seconds per 4 days
	SecondsPer4Day = SecondsPerDay * 4
	// DaysPerWeek is number of days per week
	DaysPerWeek = 7
	// WeekdayOffset is to compensate 1970-01-01 being a Thursday
	WeekdayOffset = 4
	// SecondsPerWeek is number of seconds per week
	SecondsPerWeek = SecondsPerDay * DaysPerWeek
)
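
As a rough illustration of how WeekdayOffset can be used, the sketch below derives the day of week from a UTC epoch timestamp. The helper name weekdayOf is hypothetical and not part of this package; the sketch assumes non-negative timestamps and a week that starts on Sunday (0 = Sunday), matching Go's time.Weekday numbering.

```go
package main

import (
	"fmt"
	"time"
)

// Local copies of the package constants, so the sketch is self-contained.
const (
	SecondsPerMinute = 60
	SecondsPerHour   = SecondsPerMinute * 60
	SecondsPerDay    = SecondsPerHour * 24
	DaysPerWeek      = 7
	WeekdayOffset    = 4
)

// weekdayOf is a hypothetical helper (not part of the package). It returns the
// day of week (0 = Sunday) for a non-negative UTC epoch timestamp in seconds.
func weekdayOf(ts int64) int {
	days := ts / SecondsPerDay
	// Day 0 (1970-01-01) was a Thursday, hence the offset of 4.
	return int((days + WeekdayOffset) % DaysPerWeek)
}

func main() {
	ts := int64(1543276800) // 2018-11-27 00:00:00 UTC
	// Both values should agree; the second one comes from the standard library.
	fmt.Println(weekdayOf(ts), int(time.Unix(ts, 0).UTC().Weekday()))
}
```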

Variables

var BucketSizeToseconds = map[string]int{
	"m": SecondsPerMinute,
	"h": SecondsPerHour,
	"d": SecondsPerDay,
}

BucketSizeToseconds is the map from normalized bucket unit to number of seconds
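
A minimal sketch of how such a map might be used to floor a timestamp to the start of its bucket. The function bucketStart is hypothetical, the map is a local copy, and the sketch assumes non-negative epoch seconds with buckets aligned to the Epoch.

```go
package main

import "fmt"

// Local copy of the unit map, so the sketch is self-contained.
var bucketSizeToSeconds = map[string]int{
	"m": 60,
	"h": 60 * 60,
	"d": 60 * 60 * 24,
}

// bucketStart is a hypothetical helper (not part of the package). It floors a
// non-negative epoch timestamp to the start of its (size, unit) bucket.
func bucketStart(ts int64, size int, unit string) (int64, error) {
	secs, ok := bucketSizeToSeconds[unit]
	if !ok || size <= 0 {
		return 0, fmt.Errorf("unsupported bucket: %d%s", size, unit)
	}
	width := int64(size) * int64(secs)
	return ts - ts%width, nil
}

func main() {
	start, err := bucketStart(4000, 3, "m")
	fmt.Println(start, err)
}
```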

Functions

func CalculateEnumCasesBytes

func CalculateEnumCasesBytes(enumCases []string) uint32

CalculateEnumCasesBytes calculates how many bytes the enum case values will occupy, including padding to 8-byte alignment.
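
The calculation can be sketched as follows. This is an illustration under stated assumptions, not the package's source: it assumes every enum case is followed by one EnumDelimiter and that the total is rounded up to the next multiple of 8. The name enumCasesBytes is hypothetical.

```go
package main

import "fmt"

// Local copy of the package's EnumDelimiter constant.
const enumDelimiter = "\u0000\n"

// enumCasesBytes is a hypothetical re-implementation. Assumption: each enum
// case is followed by one delimiter, and the sum is padded to 8 bytes.
func enumCasesBytes(enumCases []string) uint32 {
	var size uint32
	for _, c := range enumCases {
		size += uint32(len(c) + len(enumDelimiter))
	}
	// Round up to the next multiple of 8.
	return (size + 7) / 8 * 8
}

func main() {
	fmt.Println(enumCasesBytes([]string{"abcdef", "xyz"}))
}
```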

func GetDimensionStartOffsets

func GetDimensionStartOffsets(numDimsPerDimWidth DimCountsPerDimWidth, dimIndex int, length int) (valueOffset, nullOffset int)

GetDimensionStartOffsets calculates the value and null starting positions for the given dimension inside the dimension vector. dimIndex is the ordered index of the given dimension inside the dimension vector.
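
One way such offsets could be computed is sketched below. The layout is an assumption made for illustration: values are stored column by column, grouped by width from 16-byte down to 1-byte, and followed by one null byte per row for each dimension. The names dimCounts and dimensionStartOffsets are hypothetical.

```go
package main

import "fmt"

// dimCounts mirrors DimCountsPerDimWidth: counts for 16, 8, 4, 2, 1-byte dims.
type dimCounts [5]uint8

// dimensionStartOffsets is a hypothetical sketch. Assumed layout: all value
// columns first (widest group first, `length` rows each), then one null byte
// per row for every dimension, in dimension order.
func dimensionStartOffsets(counts dimCounts, dimIndex, length int) (valueOffset, nullOffset int) {
	width := 16 // byte width of the current group
	start := 0  // index of the first dimension in the current group
	found := false
	totalValueBytes := 0
	for _, n := range counts {
		if !found {
			if dimIndex < start+int(n) {
				// The dimension lives in this group: skip earlier columns of it.
				valueOffset += (dimIndex - start) * width * length
				found = true
			} else {
				// Skip the whole group.
				valueOffset += int(n) * width * length
				start += int(n)
			}
		}
		totalValueBytes += int(n) * width * length
		width /= 2
	}
	nullOffset = totalValueBytes + dimIndex*length
	return valueOffset, nullOffset
}

func main() {
	// One 4-byte, one 2-byte and one 1-byte dimension, 10 rows.
	counts := dimCounts{0, 0, 1, 1, 1}
	for i := 0; i < 3; i++ {
		v, n := dimensionStartOffsets(counts, i, 10)
		fmt.Println(i, v, n)
	}
}
```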

func ReadDimension

func ReadDimension(valueStart, nullStart unsafe.Pointer,
	index int, dataType memCom.DataType, enumReverseDict []string, meta *TimeDimensionMeta, cache map[TimeDimensionMeta]map[int64]string) *string

ReadDimension reads a dimension value given the index and corresponding data type of node. tzRemedy is used to remedy the timezone offset

Types

type AQLTimeSeriesResult

type AQLTimeSeriesResult map[string]interface{}

AQLTimeSeriesResult is ported from Apollo, see time_series_result.go

Represents a nested AQL time series result with one dimension on each layer:

  • There is always an outermost time dimension. It stores the start time of the bucket/duration (in seconds since Epoch).
  • After the time dimension, there can be zero or more layers of additional dimensions (all values are represented as strings). A special "NULL" string is used to represent NULL values.
  • There is always a single measure, and the measure type is either float64 or nil (not *float64).
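
The nesting described above can be sketched with a small stand-alone type. The type result and its method set are hypothetical stand-ins written for illustration; they follow the three rules in the list (one map layer per dimension, "NULL" for nil dimension values, a float64 or nil leaf) but are not the package's code.

```go
package main

import "fmt"

// result is a hypothetical stand-in for AQLTimeSeriesResult.
type result map[string]interface{}

// set walks one map layer per dimension value and stores the measure at the
// innermost layer. A nil dimension value is stored under the "NULL" key, and
// a nil measure is stored as nil (never as *float64).
func (r result) set(dimValues []*string, measure *float64) {
	current := r
	for i, dv := range dimValues {
		key := "NULL"
		if dv != nil {
			key = *dv
		}
		if i == len(dimValues)-1 {
			if measure == nil {
				current[key] = nil
			} else {
				current[key] = *measure
			}
			return
		}
		next, ok := current[key].(result)
		if !ok {
			next = result{}
			current[key] = next
		}
		current = next
	}
}

func main() {
	r := result{}
	ts, city := "1543276800", "sf"
	v := 3.5
	r.set([]*string{&ts, &city}, &v)
	r.set([]*string{&ts, nil}, nil)
	fmt.Println(r)
}
```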

func NewTimeSeriesHLLResult

func NewTimeSeriesHLLResult(buffer []byte, magicHeader uint32) (AQLTimeSeriesResult, error)

NewTimeSeriesHLLResult creates a new AQLTimeSeriesResult and deserializes the buffer into it.

func ParseHLLQueryResults

func ParseHLLQueryResults(data []byte) (queryResults []AQLTimeSeriesResult, queryErrors []error, err error)

ParseHLLQueryResults will parse the response body into a slice of query results and a slice of errors.

func (AQLTimeSeriesResult) Set

func (r AQLTimeSeriesResult) Set(dimValues []*string, measureValue *float64)

Set is ported from Apollo, see time_series_result.go

func (AQLTimeSeriesResult) SetHLL

func (r AQLTimeSeriesResult) SetHLL(dimValues []*string, hll HLL)

SetHLL sets hll struct to be the leaves of the nested map.

type DimCountsPerDimWidth

type DimCountsPerDimWidth [5]uint8

DimCountsPerDimWidth defines dimension counts per dimension width, in the order 16-byte, 8-byte, 4-byte, 2-byte, 1-byte.

type HLL

type HLL struct {
	SparseData       []HLLRegister // Unsorted registers.
	DenseData        []byte        // Rho by register index.
	NonZeroRegisters uint16
}

HLL stores only the dense data for now.

func (*HLL) Compute

func (hll *HLL) Compute() float64

Compute computes the result of the HLL.
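
The documentation does not say which estimator Compute uses. For orientation only, the sketch below shows the textbook HyperLogLog estimate over a dense register array, with the usual linear-counting correction for small cardinalities. The function estimate is hypothetical, and the package may apply different constants or corrections.

```go
package main

import (
	"fmt"
	"math"
)

// estimate is a hypothetical sketch of the standard HyperLogLog estimator.
// dense holds one rho value per register; len(dense) is the register count m.
func estimate(dense []byte) float64 {
	if len(dense) == 0 {
		return 0
	}
	m := float64(len(dense))
	sum := 0.0
	zeros := 0
	for _, rho := range dense {
		sum += math.Pow(2, -float64(rho))
		if rho == 0 {
			zeros++
		}
	}
	// Bias-correction constant commonly used for m >= 128.
	alpha := 0.7213 / (1 + 1.079/m)
	raw := alpha * m * m / sum
	// Small-range correction: fall back to linear counting.
	if raw <= 2.5*m && zeros > 0 {
		return m * math.Log(m/float64(zeros))
	}
	return raw
}

func main() {
	dense := make([]byte, 1<<14)
	dense[42] = 1
	fmt.Println(estimate(dense))
}
```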

func (*HLL) ConvertToDense

func (hll *HLL) ConvertToDense()

ConvertToDense converts the HLL to dense format.

func (*HLL) ConvertToSparse

func (hll *HLL) ConvertToSparse() bool

ConvertToSparse tries to convert the HLL to sparse format if that turns out to be cheaper.
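
A possible reading of "cheaper" is sketched below: convert only when the number of non-zero registers is below DenseThreshold. That criterion is an assumption based on the constants on this page, and the types hll and register are local stand-ins for HLL and HLLRegister.

```go
package main

import "fmt"

const (
	denseDataLength = 1 << 14
	denseThreshold  = denseDataLength / 4
)

// register and hll are local stand-ins for HLLRegister and HLL.
type register struct {
	Index uint16
	Rho   byte
}

type hll struct {
	SparseData       []register
	DenseData        []byte
	NonZeroRegisters uint16
}

// convertToSparse is a hypothetical sketch. Assumption: sparse is considered
// cheaper when fewer than denseThreshold registers are non-zero.
func (h *hll) convertToSparse() bool {
	if h.DenseData == nil {
		return true // already sparse
	}
	if int(h.NonZeroRegisters) >= denseThreshold {
		return false
	}
	sparse := make([]register, 0, h.NonZeroRegisters)
	for i, rho := range h.DenseData {
		if rho != 0 {
			sparse = append(sparse, register{Index: uint16(i), Rho: rho})
		}
	}
	h.SparseData, h.DenseData = sparse, nil
	return true
}

func main() {
	h := hll{DenseData: make([]byte, denseDataLength), NonZeroRegisters: 2}
	h.DenseData[3], h.DenseData[10] = 2, 5
	fmt.Println(h.convertToSparse(), h.SparseData)
}
```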

func (*HLL) Decode

func (hll *HLL) Decode(data []byte)

Decode decodes the HLL from cache. Interprets as dense or sparse format based on len(data).

func (*HLL) Encode

func (hll *HLL) Encode() []byte

Encode encodes the HLL for cache storage. Dense format will have a length of 1<<hllP. Sparse format will have a smaller length.

func (*HLL) Merge

func (hll *HLL) Merge(other HLL)

Merge merges (using max(rho)) the other HLL (sparse or dense) into this one (will be converted to dense).
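
The max(rho) merge can be sketched as follows, using local stand-in types. The helpers convertToDense and setMax are hypothetical, and the sketch assumes every register index is below the dense length of 1 << 14.

```go
package main

import "fmt"

const denseDataLength = 1 << 14

// register and hll are local stand-ins for HLLRegister and HLL.
type register struct {
	Index uint16
	Rho   byte
}

type hll struct {
	SparseData       []register
	DenseData        []byte
	NonZeroRegisters uint16
}

// convertToDense expands sparse registers into a dense rho array.
func (h *hll) convertToDense() {
	if h.DenseData != nil {
		return
	}
	h.DenseData = make([]byte, denseDataLength)
	for _, r := range h.SparseData {
		h.DenseData[r.Index] = r.Rho
	}
	h.SparseData = nil
}

// setMax keeps the larger rho for a register and tracks the non-zero count.
func (h *hll) setMax(index uint16, rho byte) {
	old := h.DenseData[index]
	if rho > old {
		if old == 0 {
			h.NonZeroRegisters++
		}
		h.DenseData[index] = rho
	}
}

// merge folds other (sparse or dense) into h; h ends up in dense format.
func (h *hll) merge(other hll) {
	h.convertToDense()
	if other.DenseData != nil {
		for i, rho := range other.DenseData {
			h.setMax(uint16(i), rho)
		}
		return
	}
	for _, r := range other.SparseData {
		h.setMax(r.Index, r.Rho)
	}
}

func main() {
	a := hll{SparseData: []register{{1, 3}, {5, 2}}, NonZeroRegisters: 2}
	b := hll{SparseData: []register{{1, 1}, {5, 4}, {9, 7}}, NonZeroRegisters: 3}
	a.merge(b)
	fmt.Println(a.DenseData[1], a.DenseData[5], a.DenseData[9], a.NonZeroRegisters)
}
```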

func (*HLL) Set

func (hll *HLL) Set(index uint16, rho byte)

Set sets rho for the specified register index. Caller must ensure that each register is set no more than once.

type HLLData

type HLLData struct {
	NumDimsPerDimWidth             DimCountsPerDimWidth
	ResultSize                     uint32
	PaddedRawDimValuesVectorLength uint32
	PaddedHLLVectorLength          int64

	DimIndexes []int
	DataTypes  []memCom.DataType
	// map from column id => enum cases. It will
	// only include columns used in dimensions.
	EnumDicts map[int][]string
}

HLLData stores the fields used to serialize and deserialize a hyperloglog query result when the client sets the Content-Accept header to application/hll. The serialized buffer of an HLL data is in the following format:

 [uint32] magic_number [uint32] padding

-----------query result 0-------------------
 <header>
 [uint32] query result 0 size [uint8] error or result [3 bytes padding]
 [uint8] num_enum_columns [uint8] bytes per dim ... [padding for 8 bytes]
 [uint32] result_size [uint32] raw_dim_values_vector_length
 [uint8] dim_index_0... [uint8] dim_index_n [padding for 8 bytes]
 [uint32] data_type_0...[uint32] data_type_n [padding for 8 bytes]

 <enum cases 0>
 [uint32] number of bytes of enum cases [uint16] column_index [2 bytes: padding]
 <enum values 0> delimited by "\u0000\n" [padding for 8 bytes]
 <end of header>
 <raw dim values vector>
 ...
 [padding for 8 byte alignment]

 <raw hll dense vector>
 ...
------------error 1----------
 [uint32] query result 1 size  [uint8] error or result [3 bytes padding]
...
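
As an illustration of the first 8 bytes of this layout, the sketch below reads and validates the magic number. The byte order is not stated on this page; little-endian is assumed here. The function readMagic is hypothetical, and the two header constants are local copies of HLLDataHeader and OldHLLDataHeader.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	oldHLLDataHeader uint32 = 0xACED0101
	hllDataHeader    uint32 = 0xACED0102
)

// readMagic is a hypothetical sketch. It reads the leading [uint32] magic
// number (assumed little-endian) and checks that the following 4 bytes of
// padding are present. It returns the magic number and the remaining payload.
func readMagic(buf []byte) (uint32, []byte, error) {
	if len(buf) < 8 {
		return 0, nil, errors.New("buffer too short for magic number and padding")
	}
	magic := binary.LittleEndian.Uint32(buf[:4])
	if magic != hllDataHeader && magic != oldHLLDataHeader {
		return 0, nil, fmt.Errorf("unexpected magic number 0x%08X", magic)
	}
	return magic, buf[8:], nil
}

func main() {
	buf := []byte{0x02, 0x01, 0xED, 0xAC, 0, 0, 0, 0, 0xFF}
	magic, rest, err := readMagic(buf)
	fmt.Printf("0x%08X %d %v\n", magic, len(rest), err)
}
```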

func (*HLLData) CalculateSizes

func (data *HLLData) CalculateSizes() (uint32, int64)

CalculateSizes returns the header size and the total size used by this HLL data.

type HLLRegister

type HLLRegister struct {
	Index uint16 `json:"index"`
	Rho   byte   `json:"rho"`
}

HLLRegister is the register used in the sparse representation.

type TimeDimensionMeta

type TimeDimensionMeta struct {
	TimeBucketizer  string
	TimeUnit        string
	IsTimezoneTable bool
	TimeZone        *time.Location
	DSTSwitchTs     int64
	FromOffset      int
	ToOffset        int
}

TimeDimensionMeta is the aggregation of meta data needed to format time dimensions

type TimeSeriesBucketizer

type TimeSeriesBucketizer struct {
	Size int
	Unit string
}

TimeSeriesBucketizer is the helper struct to express parsed time bucketizer, see comment below

func ParseRegularTimeBucketizer

func ParseRegularTimeBucketizer(timeBucketizerString string) (TimeSeriesBucketizer, error)

ParseRegularTimeBucketizer tries to convert a regular time bucketizer (anything below month) input string to a (Size, Unit) pair and reports an error if the input is invalid or unsupported, e.g. "3m" -> (3, "m"), "4 hours" -> (4, "h").
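
A parser with this behavior might look like the sketch below. It reproduces the two documented examples; the accepted unit spellings in unitAliases are assumptions, and the names bucketizer and parseBucketizer are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bucketizer is a local stand-in for TimeSeriesBucketizer.
type bucketizer struct {
	Size int
	Unit string
}

// unitAliases lists the unit spellings this sketch accepts (an assumption).
var unitAliases = map[string]string{
	"m": "m", "minute": "m", "minutes": "m",
	"h": "h", "hour": "h", "hours": "h",
	"d": "d", "day": "d", "days": "d",
}

// parseBucketizer is a hypothetical sketch: leading digits give the size, the
// rest (after optional spaces) names the unit.
func parseBucketizer(s string) (bucketizer, error) {
	s = strings.TrimSpace(s)
	i := 0
	for i < len(s) && s[i] >= '0' && s[i] <= '9' {
		i++
	}
	if i == 0 {
		return bucketizer{}, fmt.Errorf("missing size in %q", s)
	}
	size, err := strconv.Atoi(s[:i])
	if err != nil || size <= 0 {
		return bucketizer{}, fmt.Errorf("invalid size in %q", s)
	}
	unit, ok := unitAliases[strings.ToLower(strings.TrimSpace(s[i:]))]
	if !ok {
		return bucketizer{}, fmt.Errorf("unsupported unit in %q", s)
	}
	return bucketizer{Size: size, Unit: unit}, nil
}

func main() {
	for _, in := range []string{"3m", "4 hours", "2 months"} {
		b, err := parseBucketizer(in)
		fmt.Println(b, err)
	}
}
```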
