pdk

package module
v0.0.0-...-2374f08
Published: Jul 11, 2018 License: BSD-3-Clause Imports: 22 Imported by: 0

README

pdk

Pilosa Dev Kit - implementation tooling and use case examples are here!

Documentation is here: https://www.pilosa.com/docs/pdk/

Requirements

Installing

We assume you are on a UNIX-like operating system. Otherwise adapt the following instructions for your platform. We further assume that you have a Go development environment set up. You should have $GOPATH/bin on your $PATH for access to installed binaries.

  • go get github.com/pilosa/pdk
  • cd $GOPATH/src/github.com/pilosa/pdk
  • make install

Running Tests

To run unit tests, run make test.

To run unit and integration tests, first, install and start the Confluent stack:

  1. Download tarball here: https://www.confluent.io/download
  2. Decompress, enter directory, then,
  3. Run ./bin/confluent start kafka-rest

Once that is running, you can run make test TESTFLAGS="-tags=integration".

Taxi usecase

To get started immediately, run this:

pdk taxi

This will create and fill an index called taxi, using the short url list in usecase/taxi/urls-short.txt.

If you want to try out the full data set, run this:

pdk taxi -i taxi-big -f usecase/taxi/greenAndYellowUrls.txt

There are a number of other options you can tweak to affect the speed and memory usage of the import (or point it to a remote pilosa instance). Use pdk taxi -help to see all the options.

Note that this url file represents 1+ billion columns of data - depending on your hardware this will probably take well over 3 hours, and consume quite a bit of memory (and CPU). You can make a file with fewer URLs if you just want to get a sample.

After importing, you can try a few example queries at https://github.com/alanbernstein/pilosa-notebooks/blob/master/taxi-use-case.ipynb .

Net usecase

To get started immediately, run this:

pdk net -i en0

which will capture traffic on the interface en0 (see available interfaces with ifconfig).

SSB

The Star Schema Benchmark is a benchmark based on TPC-H, but tweaked for a somewhat different use case. It has been implemented by some big data projects such as https://hortonworks.com/blog/sub-second-analytics-hive-druid/ .

To execute the Star Schema Benchmark with Pilosa, you must:

  1. Generate the SSB data at a particular scale factor.
  2. Import the data into Pilosa.
  3. Run the demo-ssb application, which has all of the SSB queries pre-written for convenience.

Generating SSB data

Use https://github.com/electrum/ssb-dbgen.git to generate the raw SSB data. This can be a bit finicky to work with - hit up @tgruben for tips (or maybe he'll update this section 😉).

When generating the data, you have to select a particular scale factor - the size of the generated data will be about 600MB * SF(scale factor), so SF=100 will generate about 60GB of data.

Importing data into Pilosa

Use pdk ssb to import the data into Pilosa. You must specify the directory containing the .tbl files generated in the first step as well as the location of your pilosa cluster. There are a few other options which you can tweak which may help import performance. See pdk ssb -h for more information.

Run demo-ssb

This repo https://github.com/pilosa/demo-ssb.git contains a small Go program which packages up the different queries which comprise the benchmark. Running demo-ssb starts a web server which executes queries against pilosa on your behalf. You can simply run (e.g.) curl localhost:8000/query/1.1 to run an SSB query.

Documentation

Overview

package pdk is the Pilosa Development Kit! It contains various helper functions and documentation to assist in using pilosa.

Of principal importance in the PDK is the ingest pipeline. Interfaces and basic implementations of each stage listed below are included in the PDK, and a number of more sophisticated implementations which may rely on other software are in sub-packages of the PDK.

1. Source

A pdk.Source is at the beginning of every indexing journey. We know
you, and we know your data is everywhere - S3 buckets, local files, Kafka
topics, hard-coded in tests, SQL databases, document DBs, triple stores.
Different Sources know how to interact with the various systems holding
your data, and get it out, one piece at a time, all wrapped up behind one
convenient interface. To write a new Source, simply implement the Source
interface, returning whatever comes naturally from the underlying client
library or API with which you are interacting. It is not the job of the
source to manipulate or massage the data in any way - that job falls to
the Parser which is the next stage of the ingestion journey. The reason
for this separation is twofold: first, you may get the same type of data
from many different sources, so it may be convenient to couple one parser
to several different sources. Secondly, you may require different
concurrency or scaling properties from fetching the data vs parsing it.
For example, if you are interacting with an HTTP endpoint at significant
latency, you may want many routines issuing concurrent calls in order to
achieve the desired throughput, but parsing is relatively lightweight, and
a single routine is sufficient to process the load.

2. Parser

The Parser does the heavy lifting for turning some arbitrary type of data
into something slightly more structured, recognizable, and type-safe.
There are many choices to be made when indexing data in Pilosa around
tradeoffs like speed vs precision, or storage size. When to use bucketing
vs range encoding, when time quantum support is needed and at what
granularity, etc. These things are not the job of the Parser. The Parser
should only get the data into a regular format so that the Mapper can make
those tradeoffs without having to worry excessively over decoding the
data. The Parser must convert incoming data into an RDF-triple like
representation using a handful of supported basic values detailed in
entity.go. Determining how to collapse (e.g.) arbitrary JSON data
into this format is not a trivial task, and indeed there may be multiple
ways to go about it and so it is possible that multiple parsers may exist
which operate on the same type of Source data.
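The collapsing idea can be illustrated in miniature. The real pdk.Entity is richer than a flat map (see entity.go), so this flatten helper is only a simplified sketch of what a parser does when reducing arbitrary nested JSON to path/value pairs:

```go
package main

import (
	"fmt"
	"strings"
)

// flatten collapses a nested map into path -> leaf-value pairs, a
// simplified version of a Parser reducing arbitrary JSON to a
// triple-like representation. Each leaf is keyed by its dotted path.
func flatten(prefix []string, v interface{}, out map[string]interface{}) {
	if m, ok := v.(map[string]interface{}); ok {
		for k, child := range m {
			// Copy the prefix so sibling branches don't share backing arrays.
			flatten(append(append([]string{}, prefix...), k), child, out)
		}
		return
	}
	out[strings.Join(prefix, ".")] = v
}

func main() {
	doc := map[string]interface{}{
		"name": "bob",
		"addr": map[string]interface{}{"city": "Austin"},
	}
	out := map[string]interface{}{}
	flatten(nil, doc, out)
	fmt.Println(out["name"], out["addr.city"]) // bob Austin
}
```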

2.5. Transformer

One may optionally provide a number of Transformers which do in-place
operations on the Entity before it is passed to the Mapper.

3. Mapper

The Mapper's job is to take instances of pdk.Entity and create
pdk.PilosaRecord objects. Because the pdk.Entity is fairly well-defined,
it is possible to do this generically, and it may not be necessary to use
a bespoke Mapper in many cases. However, as mentioned in the Parser
description, there are performance and capability tradeoffs based on how
one decides to map data into Pilosa. (TODO expand with more examples as
mappers are implemented, also reference generic mapper and its config
options)

4. Indexer

The Indexer is responsible for getting data into Pilosa. Primarily, there
is a latency/throughput tradeoff depending on the batch size selected.

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrPathNotFound   = errors.New("path not found in Entity")
	ErrNotALiteral    = errors.New("value at path is not a Literal")
	ErrEmptyPath      = errors.New("path is empty")
	ErrUnexpectedType = errors.New("unexpected type")
)

Functions

func GetFields

func GetFields(body []byte) ([]string, error)

GetFields interprets body as pql queries and then tries to determine the field of each. Some queries do not have fields, and the empty string will be returned for these.

func Int64ize

func Int64ize(val Literal) int64

func NewIntField

func NewIntField(index *gopilosa.Index, name string, min, max int64) *gopilosa.Field

func NewPilosaForwarder

func NewPilosaForwarder(phost string, t Translator, colTranslator ...FieldTranslator) *pilosaForwarder

NewPilosaForwarder returns a new pilosaForwarder which forwards all requests to `phost`. It inspects pilosa responses and runs the row ids through the Translator `t` to translate them to whatever they were mapped from.

func NewPilosaProxy

func NewPilosaProxy(host string, client *http.Client) *pilosaProxy

NewPilosaProxy returns a pilosaProxy based on `host` and `client`.

func NewRankedField

func NewRankedField(index *gopilosa.Index, name string, size int) *gopilosa.Field

func NexterStartFrom

func NexterStartFrom(s uint64) func(n *Nexter)

NexterStartFrom returns an option which makes a Nexter start from integer "s".

func StartMappingProxy

func StartMappingProxy(bind string, h http.Handler) error

StartMappingProxy listens for incoming http connections on `bind` and uses h to handle all requests. This function does not return unless there is a problem (like http.ListenAndServe).

func ToBytes

func ToBytes(l Literal) []byte

ToBytes converts a literal into a typed byte slice representation.

func ToString

func ToString(l Literal) string

ToString converts a Literal into a string with a type byte prepended.

func Walk

func Walk(e *Entity, call func(path []string, l Literal) error) error

Walk recursively visits every Object in the Entity and calls "call" with every Literal and its path.

Types

type AttrMapper

type AttrMapper struct {
	Mapper  Mapper
	Parsers []Parser
	Fields  []int
}

AttrMapper is a struct for mapping some set of data fields to a value for sending to Pilosa as a SetColumnAttrs query

type B

type B bool

func (B) MarshalJSON

func (B B) MarshalJSON() ([]byte, error)

type BinaryFloatMapper

type BinaryFloatMapper struct {
	Min      float64
	Max      float64
	BitDepth int
	// contains filtered or unexported fields
}

BinaryFloatMapper is a Mapper for float types, mapping to a set of buckets representing the value in a binary sense

func (BinaryFloatMapper) ID

func (m BinaryFloatMapper) ID(fi ...interface{}) (rowIDs []int64, err error)

ID maps floats to binary column sets

type BinaryIntMapper

type BinaryIntMapper struct {
	Min      int64
	Max      int64
	BitDepth int
	// contains filtered or unexported fields
}

BinaryIntMapper is a Mapper for int types, mapping to a set of buckets representing the value in a binary sense

func (BinaryIntMapper) ID

func (m BinaryIntMapper) ID(ii ...interface{}) (rowIDs []int64, err error)

ID maps ints to binary column sets

type BlankSubjecter

type BlankSubjecter struct{}

BlankSubjecter is a Subjecter which always returns an empty subject. Typically this means that a sequential ID will be generated for each record.

func (BlankSubjecter) Subject

func (b BlankSubjecter) Subject(d interface{}) (string, error)

Subject implements Subjecter, and always returns an empty string and nil error.

type BoolMapper

type BoolMapper struct {
}

BoolMapper is a trivial Mapper for boolean types

func (BoolMapper) ID

func (m BoolMapper) ID(bi ...interface{}) (rowIDs []int64, err error)

ID maps a bool to a rowID (identity mapper)

type Bytes

type Bytes uint64

Bytes is a wrapper type for numbers which represent bytes. It provides a String method which produces sensible readable output like 1.2G or 4M, etc.

func (Bytes) String

func (b Bytes) String() string

String returns a human-readable byte string of the form 10M, 12.5K, and so forth. The following units are available:

T: Terabyte
G: Gigabyte
M: Megabyte
K: Kilobyte
B: Byte

The unit that results in the smallest number greater than or equal to 1 is always chosen.
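That rule can be restated in code. This is an illustrative reimplementation, not the pdk source: it assumes binary (1024-based) units and one decimal place, either of which may differ from what Bytes.String actually emits.

```go
package main

import "fmt"

// humanBytes picks the largest unit that leaves a value >= 1, as the
// rule above describes, then formats with one decimal place.
func humanBytes(n uint64) string {
	units := []struct {
		suffix string
		size   uint64
	}{
		{"T", 1 << 40}, {"G", 1 << 30}, {"M", 1 << 20}, {"K", 1 << 10},
	}
	for _, u := range units {
		if n >= u.size {
			return fmt.Sprintf("%.1f%s", float64(n)/float64(u.size), u.suffix)
		}
	}
	return fmt.Sprintf("%dB", n)
}

func main() {
	fmt.Println(humanBytes(12800)) // 12.5K
	fmt.Println(humanBytes(512))   // 512B
}
```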

type CollapsingMapper

type CollapsingMapper struct {
	Translator    Translator
	ColTranslator FieldTranslator
	Framer        Framer
	Nexter        INexter
}

CollapsingMapper processes Entities into PilosaRecords by walking the tree of properties and collapsing every path down to a concrete value into a single property name.

func NewCollapsingMapper

func NewCollapsingMapper() *CollapsingMapper

NewCollapsingMapper returns a CollapsingMapper with basic implementations of its components. In order to track mapping of Pilosa columns to records, you must replace the ColTranslator with something other than a NexterFieldTranslator which just allocates ids and does not store a mapping.

func (*CollapsingMapper) Map

func (m *CollapsingMapper) Map(e *Entity) (PilosaRecord, error)

Map implements the RecordMapper interface.

type ColumnMapper

type ColumnMapper struct {
	Field   string
	Mapper  Mapper
	Parsers []Parser
	Fields  []int
}

ColumnMapper is a struct for mapping some set of data fields to a (field, id) combination for sending to Pilosa as a SetBit query

type Context

type Context map[string]interface{}

type CustomMapper

type CustomMapper struct {
	Func   func(...interface{}) interface{}
	Mapper Mapper
}

CustomMapper is a Mapper that applies a function to a slice of fields, then applies a simple Mapper to the result of that, returning a rowID. This is a generic way to support mappings which span multiple fields. It is not supported by the importing config system.

func (CustomMapper) ID

func (m CustomMapper) ID(fields ...interface{}) (rowIDs []int64, err error)

ID maps a set of fields using a custom function

type DashField

type DashField struct {
	Ignore   []string `help:"Do not index paths containing any of these components"`
	Collapse []string `help:"Remove these components from the path before getting field."`
}

DashField creates a field name from the path by joining the path elements with the "-" character.

func (*DashField) Field

func (d *DashField) Field(path []string) (string, error)

Field gets a field from a path by joining the path elements with dashes.

type DayOfMonthMapper

type DayOfMonthMapper struct {
}

func (DayOfMonthMapper) ID

func (m DayOfMonthMapper) ID(ti ...interface{}) (rowIDs []int64, err error)

ID maps a timestamp to a day of month bucket (1-31)

type DayOfWeekMapper

type DayOfWeekMapper struct {
}

DayOfWeekMapper is a Mapper for timestamps, mapping the day of week only

func (DayOfWeekMapper) ID

func (m DayOfWeekMapper) ID(ti ...interface{}) (rowIDs []int64, err error)

ID maps a timestamp to a day of week bucket

type Entity

type Entity struct {
	Subject IRI `json:"@id"`
	Objects map[Property]Object
}

Entity is the "root" node of a graph branching out from a certain resource denoted by the Subject. This is a convenience vs just handling a list of Triples as we expect to structure indexing around a particular class of thing which we ingest many instances of as records.

func NewEntity

func NewEntity() *Entity

NewEntity returns a newly allocated Entity.

func (*Entity) Equal

func (e *Entity) Equal(e2 *Entity) error

func (*Entity) F64

func (e *Entity) F64(path ...string) (F64, error)

F64 tries to get a float64 at the given path in the Entity.

func (*Entity) Literal

func (e *Entity) Literal(path ...string) (Literal, error)

Literal gets the literal at the path in the Entity.

func (*Entity) MarshalJSON

func (e *Entity) MarshalJSON() ([]byte, error)

MarshalJSON is a custom JSON marshaler for Entity objects to ensure that they serialize to valid JSON-LD (https://json-ld.org/ spec/latest/json-ld/). This allows for easy (if not particularly performant) interoperation with other variants of RDF linked data.

func (*Entity) SetPath

func (e *Entity) SetPath(path ...string) (*Entity, error)

SetPath ensures that a path exists, creating Entities along the way if necessary. If it encounters a non-Entity, it will return an error.

func (*Entity) SetString

func (e *Entity) SetString(value string, path ...string) error

SetString sets a string value at the given path in the Entity.
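The "create Entities along the way, error on a non-Entity" semantics of SetPath can be mirrored with plain nested maps. This sketch is an illustration of the behavior described above, not the Entity implementation itself:

```go
package main

import (
	"errors"
	"fmt"
)

// setPath walks the path through nested maps, creating intermediate
// maps as needed. If a non-map value already occupies a path element,
// it returns an error, mirroring SetPath's documented behavior.
func setPath(m map[string]interface{}, path ...string) (map[string]interface{}, error) {
	cur := m
	for _, p := range path {
		next, ok := cur[p]
		if !ok {
			child := map[string]interface{}{}
			cur[p] = child
			cur = child
			continue
		}
		child, ok := next.(map[string]interface{})
		if !ok {
			return nil, errors.New("path blocked by non-map value")
		}
		cur = child
	}
	return cur, nil
}

func main() {
	root := map[string]interface{}{}
	leaf, err := setPath(root, "a", "b")
	if err == nil {
		leaf["name"] = "x" // analogous to SetString at path a/b
	}
	fmt.Println(err, root)
}
```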

type EntitySubjecter

type EntitySubjecter interface {
	Subject(e *Entity) (string, error)
}

EntitySubjecter is an alternate interface for getting the Subject of a record, which operates on the parsed Entity rather than the unparsed data.

type EntityWithContext

type EntityWithContext struct {
	Entity
	Context Context `json:"@context"`
}

EntityWithContext associates a Context (https://json-ld.org/spec/latest/json-ld/#the-context) with an Entity so that it can be Marshaled to valid and useful JSON-LD.

type F32

type F32 float32

func (F32) MarshalJSON

func (F F32) MarshalJSON() ([]byte, error)

type F64

type F64 float64

func (F64) MarshalJSON

func (F F64) MarshalJSON() ([]byte, error)

type FieldTranslator

type FieldTranslator interface {
	Get(id uint64) (interface{}, error)
	GetID(val interface{}) (uint64, error)
}

FieldTranslator works like a Translator, but the methods don't take fields as arguments. Typically a Translator will include a FieldTranslator for each field.

type FileFragment

type FileFragment struct {
	// contains filtered or unexported fields
}

FileFragment implements io.ReadCloser for part of a file.

func NewFileFragment

func NewFileFragment(f *os.File, startLoc, endLoc int64) (*FileFragment, error)

NewFileFragment returns a FileFragment which will read only from startLoc to endLoc in a file.

func SplitFileLines

func SplitFileLines(f *os.File, numParts int64) ([]*FileFragment, error)

SplitFileLines returns a slice of file fragments which is numParts in length. Each FileFragment will read a different section of the file, but the split points are guaranteed to be on line breaks.
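The core of the line-boundary guarantee is finding, for each rough split offset, the start of the next line. This helper is a sketch of that step (the real SplitFileLines works on *os.File offsets rather than an in-memory slice):

```go
package main

import (
	"bytes"
	"fmt"
)

// nearestNewline takes a rough split offset into data and advances it
// to just past the next '\n', so a fragment starting there begins at a
// line boundary. If no newline follows, it returns len(data).
func nearestNewline(data []byte, off int) int {
	if off >= len(data) {
		return len(data)
	}
	i := bytes.IndexByte(data[off:], '\n')
	if i < 0 {
		return len(data)
	}
	return off + i + 1
}

func main() {
	data := []byte("aaa\nbbb\nccc\n")
	mid := len(data) / 2 // offset 6, inside "bbb"
	fmt.Println(nearestNewline(data, mid)) // 8: start of "ccc"
}
```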

func (*FileFragment) Close

func (ff *FileFragment) Close() error

Close implements io.Closer for a FileFragment.

func (*FileFragment) Read

func (ff *FileFragment) Read(b []byte) (n int, err error)

Read implements io.Reader for FileFragment.

type FloatMapper

type FloatMapper struct {
	Buckets []float64 // slice representing bucket intervals [left0 left1 ... leftN-1 rightN-1]
	// contains filtered or unexported fields
}

FloatMapper is a Mapper for float types, mapping to arbitrary buckets

func (FloatMapper) ID

func (m FloatMapper) ID(fi ...interface{}) (rowIDs []int64, err error)

ID maps floats to arbitrary buckets

type FloatParser

type FloatParser struct {
}

FloatParser is a parser for float types

func (FloatParser) Parse

func (p FloatParser) Parse(field string) (result interface{}, err error)

Parse parses a float string to a float64 value

type Framer

type Framer interface {
	// The Field method should return an empty string and a nil error if the value
	// at the given path should be ignored. It should return an error, only if
	// something unexpected has occurred which means the record cannot be properly
	// processed.
	Field(path []string) (field string, err error)
}

Framer is an interface for extracting field names from paths denoted by []string. The path could be (e.g.) a list of keys in a nested map which arrives at a non-container value (string, int, etc).

type FramerFunc

type FramerFunc func([]string) (string, error)

FramerFunc is similar to http.HandlerFunc in that you can make a bare function satisfy the Framer interface by doing FramerFunc(yourfunc).

func (FramerFunc) Field

func (f FramerFunc) Field(path []string) (string, error)

Field on FramerFunc simply calls the wrapped function.

type GenericParser

type GenericParser struct {
	Subjecter       Subjecter
	EntitySubjecter EntitySubjecter
	SubjectAll      bool

	// IncludeUnexportedFields controls whether unexported struct fields will be
	// included when parsing.
	IncludeUnexportedFields bool // TODO: I don't think this actually works

	// Strict controls whether failure to parse a single value or key will cause
	// the entire record to fail.
	Strict bool

	Stats Statter
	Log   Logger
}

GenericParser tries to make no assumptions about the value passed to its Parse method. At the top level it accepts a map or struct (or pointer or interface holding one of these).

func NewDefaultGenericParser

func NewDefaultGenericParser() *GenericParser

NewDefaultGenericParser returns a GenericParser with basic implementations of its components. In order to track the mapping of Pilosa columns to records, you must replace the Subjecter with something other than a BlankSubjecter.

func (*GenericParser) Parse

func (m *GenericParser) Parse(data interface{}) (e *Entity, err error)

Parse of the GenericParser tries to parse any value into a pdk.Entity.

type GridMapper

type GridMapper struct {
	Xmin float64
	Xmax float64
	Xres int64
	Ymin float64
	Ymax float64
	Yres int64
	// contains filtered or unexported fields
}

GridMapper is a Mapper for a 2D grid (e.g. small-scale latitude/longitude)

func (GridMapper) ID

func (m GridMapper) ID(xyi ...interface{}) (rowIDs []int64, err error)

ID maps pairs of floats to regular buckets

type GridToFloatMapper

type GridToFloatMapper struct {
	// contains filtered or unexported fields
}

func NewGridToFloatMapper

func NewGridToFloatMapper(gm GridMapper, lfm LinearFloatMapper, gridVals []float64) GridToFloatMapper

func (GridToFloatMapper) ID

func (m GridToFloatMapper) ID(vals ...interface{}) ([]int64, error)

type I

type I int

func (I) MarshalJSON

func (I I) MarshalJSON() ([]byte, error)

func (*I) UnmarshalJSON

func (i *I) UnmarshalJSON(b []byte) error

type I16

type I16 int16

func (I16) MarshalJSON

func (I I16) MarshalJSON() ([]byte, error)

func (*I16) UnmarshalJSON

func (i *I16) UnmarshalJSON(b []byte) error

type I32

type I32 int32

func (I32) MarshalJSON

func (I I32) MarshalJSON() ([]byte, error)

func (*I32) UnmarshalJSON

func (i *I32) UnmarshalJSON(b []byte) error

type I64

type I64 int64

func (I64) MarshalJSON

func (I I64) MarshalJSON() ([]byte, error)

func (*I64) UnmarshalJSON

func (i *I64) UnmarshalJSON(b []byte) error

type I8

type I8 int8

func (I8) MarshalJSON

func (I I8) MarshalJSON() ([]byte, error)

func (*I8) UnmarshalJSON

func (i *I8) UnmarshalJSON(b []byte) error

type INexter

type INexter interface {
	Next() uint64
	Last() uint64
}

INexter is the horribly named interface for threadsafe, monotonic, sequential, unique ID generation.

type IPParser

type IPParser struct {
}

IPParser is a parser for IP addresses

func (IPParser) Parse

func (p IPParser) Parse(field string) (result interface{}, err error)

Parse parses an IP string into a TODO

type IRI

type IRI string

IRI is either a full IRI, or will map to one when the record in which it is contained is processed in relation to a context: (https://json-ld.org/spec/latest/json-ld/#the-context)

type Index

type Index struct {
	// contains filtered or unexported fields
}

func (*Index) AddColumn

func (i *Index) AddColumn(field string, col uint64, row uint64)

AddColumn adds a column to be imported to Pilosa.

func (*Index) AddColumnTimestamp

func (i *Index) AddColumnTimestamp(field string, row, col uint64, ts time.Time)

AddColumnTimestamp adds a column to be imported to Pilosa with a timestamp.

func (*Index) AddValue

func (i *Index) AddValue(fieldName string, col uint64, val int64)

AddValue adds a value to be imported to Pilosa.

func (*Index) Client

func (i *Index) Client() *gopilosa.Client

Client returns a Pilosa client.

func (*Index) Close

func (i *Index) Close() error

Close ensures that all ongoing imports have finished and cleans up internal state.

type Indexer

type Indexer interface {
	AddColumn(field string, col, row uint64)
	AddColumnTimestamp(field string, col, row uint64, ts time.Time)
	AddValue(field string, col uint64, val int64)
	// AddRowAttr(field string, row uint64, key string, value AttrVal)
	// AddColAttr(col uint64, key string, value AttrVal)
	Close() error
	Client() *gopilosa.Client
}

Indexer puts stuff into Pilosa.

func SetupPilosa

func SetupPilosa(hosts []string, indexName string, schema *gopilosa.Schema, batchsize uint) (Indexer, error)

SetupPilosa returns a new Indexer after creating the given fields and starting importers.

type Ingester

type Ingester struct {
	ParseConcurrency int

	Transformers  []Transformer
	AllowedFields map[string]bool

	Stats Statter
	Log   Logger
	// contains filtered or unexported fields
}

Ingester combines a Source, Parser, Mapper, and Indexer, and uses them to ingest data into Pilosa. This could be a streaming situation where the Source never ends, and calling it just waits for more data to be available, or a batch situation where the Source eventually returns io.EOF (or some other error), and the Ingester completes (after the other components are done).

func NewIngester

func NewIngester(source Source, parser RecordParser, mapper RecordMapper, indexer Indexer) *Ingester

NewIngester gets a new Ingester.

func (*Ingester) Run

func (n *Ingester) Run() error

Run runs the ingest.

type IntMapper

type IntMapper struct {
	Min int64
	Max int64
	Res int64 // number of bins
	// contains filtered or unexported fields
}

IntMapper is a Mapper for integer types, mapping each int in the range to a row

func (IntMapper) ID

func (m IntMapper) ID(ii ...interface{}) (rowIDs []int64, err error)

ID maps an int range to a rowID range

type IntParser

type IntParser struct {
}

IntParser is a parser for integer types

func (IntParser) Parse

func (p IntParser) Parse(field string) (result interface{}, err error)

Parse parses an integer string to an int64 value

type KeyMapper

type KeyMapper interface {
	MapRequest(body []byte) ([]byte, error)
	MapResult(field string, res interface{}) (interface{}, error)
}

KeyMapper describes the functionality for mapping the keys contained in requests and responses.

type LinearFloatMapper

type LinearFloatMapper struct {
	Min   float64
	Max   float64
	Res   float64
	Scale string // linear, logarithmic
	// contains filtered or unexported fields
}

LinearFloatMapper is a Mapper for float types, mapping to regularly spaced buckets TODO: consider defining this in terms of a linear mapping ID = floor(a*value + b)

func (LinearFloatMapper) ID

func (m LinearFloatMapper) ID(fi ...interface{}) (rowIDs []int64, err error)

ID maps floats to regularly spaced buckets

type Literal

type Literal interface {
	// contains filtered or unexported methods
}

Literal interface is implemented by types which correspond to RDF Literals.

func FromBytes

func FromBytes(bs []byte) Literal

FromBytes converts an encoded byte slice (from ToBytes) back to a Literal. DEV: May add an error and bounds checking.

func FromString

func FromString(s string) Literal

FromString converts a Literal encoded with ToString back to a Literal.

type Logger

type Logger interface {
	Printf(format string, v ...interface{})
	Debugf(format string, v ...interface{})
}

Logger is the interface that loggers must implement to get PDK logs.

type MapFieldTranslator

type MapFieldTranslator struct {
	// contains filtered or unexported fields
}

MapFieldTranslator is an in-memory implementation of FieldTranslator using sync.Map and a slice.

func NewMapFieldTranslator

func NewMapFieldTranslator() *MapFieldTranslator

NewMapFieldTranslator creates a new MapFieldTranslator.

func (*MapFieldTranslator) Get

func (m *MapFieldTranslator) Get(id uint64) (interface{}, error)

Get returns the value mapped to the given id.

func (*MapFieldTranslator) GetID

func (m *MapFieldTranslator) GetID(val interface{}) (id uint64, err error)

GetID returns the integer id associated with the given value. It allocates a new ID if the value is not found.

type MapTranslator

type MapTranslator struct {
	// contains filtered or unexported fields
}

MapTranslator is an in-memory implementation of Translator using maps.

func NewMapTranslator

func NewMapTranslator() *MapTranslator

NewMapTranslator creates a new MapTranslator.

func (*MapTranslator) Get

func (m *MapTranslator) Get(field string, id uint64) (interface{}, error)

Get returns the value mapped to the given id in the given field.

func (*MapTranslator) GetID

func (m *MapTranslator) GetID(field string, val interface{}) (id uint64, err error)

GetID returns the integer id associated with the given value in the given field. It allocates a new ID if the value is not found.

type Mapper

type Mapper interface {
	ID(...interface{}) ([]int64, error)
}

Mapper represents a single method for mapping a specific data type to a slice of row IDs. A data type might be composed of multiple fields (e.g. a 2D point). A data type might use multiple mappers.

type MonthMapper

type MonthMapper struct {
}

MonthMapper is a Mapper for timestamps, mapping the month only

func (MonthMapper) ID

func (m MonthMapper) ID(ti ...interface{}) (rowIDs []int64, err error)

ID maps a timestamp to a month bucket (1-12)

type Nexter

type Nexter struct {
	// contains filtered or unexported fields
}

Nexter is a threadsafe monotonic unique id generator

func NewNexter

func NewNexter(opts ...NexterOption) *Nexter

NewNexter creates a new id generator starting at 0 - can be modified by options.

func (*Nexter) Last

func (n *Nexter) Last() (lastID uint64)

Last returns the most recently generated id

func (*Nexter) Next

func (n *Nexter) Next() (nextID uint64)

Next generates a new id and returns it
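The Nexter pattern, an atomic counter configured with functional options like NexterStartFrom, can be sketched as below. The counter type and its methods are illustrative stand-ins; the exact pre/post-increment semantics of the real Nexter may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counter is a threadsafe monotonic id generator in the style of
// Nexter, configured with functional options.
type counter struct{ n uint64 }

type counterOption func(*counter)

// startFrom mirrors NexterStartFrom: an option setting the first id.
func startFrom(s uint64) counterOption {
	return func(c *counter) { c.n = s }
}

func newCounter(opts ...counterOption) *counter {
	c := &counter{} // starts from 0 unless an option says otherwise
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// next atomically hands out the current id and advances the counter.
func (c *counter) next() uint64 { return atomic.AddUint64(&c.n, 1) - 1 }

// last returns the most recently handed-out id.
func (c *counter) last() uint64 { return atomic.LoadUint64(&c.n) - 1 }

func main() {
	c := newCounter(startFrom(100))
	fmt.Println(c.next(), c.next(), c.last()) // 100 101 101
}
```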

type NexterFrameTranslator

type NexterFrameTranslator struct {
	// contains filtered or unexported fields
}

NexterFrameTranslator satisfies the FieldTranslator interface, but simply allocates a new contiguous id every time GetID(val) is called. It does not store any mapping and Get(id) always returns an error. Pilosa requires column ids regardless of whether we actually require tracking what each individual column represents, and the NexterFrameTranslator is useful in the case that we don't.

func NewNexterFieldTranslator

func NewNexterFieldTranslator() *NexterFrameTranslator

NewNexterFieldTranslator creates a new NexterFrameTranslator

func (*NexterFrameTranslator) Get

func (n *NexterFrameTranslator) Get(id uint64) (interface{}, error)

Get always returns nil, and a non-nil error for the NexterFrameTranslator.

func (*NexterFrameTranslator) GetID

func (n *NexterFrameTranslator) GetID(val interface{}) (id uint64, err error)

GetID for the NexterFrameTranslator increments the internal id counter atomically and returns the next id - it ignores the val argument entirely.

type NexterOption

type NexterOption func(n *Nexter)

NexterOption can be passed to NewNexter to modify the Nexter's behavior.

type NopLogger

type NopLogger struct{}

NopLogger logs nothing.

func (NopLogger) Debugf

func (NopLogger) Debugf(format string, v ...interface{})

Debugf does nothing.

func (NopLogger) Printf

func (NopLogger) Printf(format string, v ...interface{})

Printf does nothing.

type NopStatter

type NopStatter struct{}

NopStatter does nothing.

func (NopStatter) Count

func (NopStatter) Count(name string, value int64, rate float64, tags ...string)

Count does nothing.

func (NopStatter) Gauge

func (NopStatter) Gauge(name string, value float64, rate float64, tags ...string)

Gauge does nothing.

func (NopStatter) Histogram

func (NopStatter) Histogram(name string, value float64, rate float64, tags ...string)

Histogram does nothing.

func (NopStatter) Set

func (NopStatter) Set(name string, value string, rate float64, tags ...string)

Set does nothing.

func (NopStatter) Timing

func (NopStatter) Timing(name string, value time.Duration, rate float64, tags ...string)

Timing does nothing.

type Object

type Object interface {
	// contains filtered or unexported methods
}

Object is an interface satisfied by all things which may appear as objects in RDF triples. All literals are objects, but not all objects are literals.

type Objects

type Objects []Object

type Parser

type Parser interface {
	Parse(string) (interface{}, error)
}

Parser represents a single method for parsing a string field to a value

type PilosaKeyMapper

type PilosaKeyMapper struct {
	// contains filtered or unexported fields
}

PilosaKeyMapper implements the KeyMapper interface.

func NewPilosaKeyMapper

func NewPilosaKeyMapper(t Translator, colTranslator ...FieldTranslator) *PilosaKeyMapper

NewPilosaKeyMapper returns a PilosaKeyMapper.

func (*PilosaKeyMapper) MapRequest

func (p *PilosaKeyMapper) MapRequest(body []byte) ([]byte, error)

MapRequest takes a request body and returns a mapped version of that body.

func (*PilosaKeyMapper) MapResult

func (p *PilosaKeyMapper) MapResult(field string, res interface{}) (mappedRes interface{}, err error)

MapResult converts the result of a single top level query (one element of QueryResponse.Results) to its mapped counterpart.

type PilosaRecord

type PilosaRecord struct {
	Col  uint64
	Rows []Row
	Vals []Val
}

PilosaRecord represents a number of set columns and values in a single Column in Pilosa.

func (*PilosaRecord) AddRow

func (pr *PilosaRecord) AddRow(field string, id uint64)

AddRow adds a new column to be set to the PilosaRecord.

func (*PilosaRecord) AddRowTime

func (pr *PilosaRecord) AddRowTime(field string, id uint64, ts time.Time)

AddRowTime adds a new column to be set with a timestamp to the PilosaRecord.

func (*PilosaRecord) AddVal

func (pr *PilosaRecord) AddVal(field string, value int64)

AddVal adds a new value to be range encoded into the given field to the PilosaRecord.
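As a sketch of how these methods fit together, the following uses local stand-ins that mirror the documented `PilosaRecord`, `Row`, and `Val` types (the real ones live in `github.com/pilosa/pdk`):

```go
package main

import (
	"fmt"
	"time"
)

// Local mirrors of the documented pdk types, for illustration only.
type Row struct {
	Field string
	ID    uint64
	Time  time.Time
}

type Val struct {
	Field string
	Value int64
}

type PilosaRecord struct {
	Col  uint64
	Rows []Row
	Vals []Val
}

// AddRow adds a new row to be set to the record.
func (pr *PilosaRecord) AddRow(field string, id uint64) {
	pr.Rows = append(pr.Rows, Row{Field: field, ID: id})
}

// AddRowTime adds a new row with a timestamp.
func (pr *PilosaRecord) AddRowTime(field string, id uint64, ts time.Time) {
	pr.Rows = append(pr.Rows, Row{Field: field, ID: id, Time: ts})
}

// AddVal adds a new value to be range-encoded into the given field.
func (pr *PilosaRecord) AddVal(field string, value int64) {
	pr.Vals = append(pr.Vals, Val{Field: field, Value: value})
}

func main() {
	rec := &PilosaRecord{Col: 42}
	rec.AddRow("color", 7)
	rec.AddRowTime("event", 3, time.Date(2018, 7, 11, 0, 0, 0, 0, time.UTC))
	rec.AddVal("amount", 1250)
	fmt.Printf("col=%d rows=%d vals=%d\n", rec.Col, len(rec.Rows), len(rec.Vals))
}
```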

type Point

type Point struct {
	X float64
	Y float64
}

Point is a point in a 2D space

type Properteer

type Properteer interface {
	Property() Property
}

Properteer is the interface which should be implemented by types which want to explicitly define how they should be interpreted as a string for use as a Property when they are used as a map key.

type Property

type Property string

Property represents a Predicate, and can be turned into a Predicate IRI by a context

type Proxy

type Proxy interface {
	ProxyRequest(orig *http.Request, origbody []byte) (*http.Response, error)
}

Proxy describes the functionality for proxying requests.

type RecordMapper

type RecordMapper interface {
	Map(record *Entity) (PilosaRecord, error)
}

RecordMapper is the interface for taking parsed records from the Parser and figuring out what bits and values to set in Pilosa. RecordMappers usually have a Translator and a Nexter for converting arbitrary values to monotonic integer ids and generating column ids respectively. Implementations should be thread safe.

type RecordParser

type RecordParser interface {
	Parse(data interface{}) (*Entity, error)
}

RecordParser is the interface for turning raw records from Source into Go objects. Implementations of Parser should be thread safe.

type Region

type Region struct {
	Vertices []Point
}

Region is a simple polygonal region of R2 space

type RegionMapper

type RegionMapper struct {
	Regions []Region
	// contains filtered or unexported fields
}

RegionMapper is a Mapper for a set of geometric regions (e.g. neighborhoods or states).

TODO: generate regions by reading a shapefile.
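A mapper over polygonal regions presumably needs a point-in-polygon test. Using local copies of the documented Point and Region types, a standard even-odd ray-casting check might look like this (an illustrative sketch, not the package's actual implementation):

```go
package main

import "fmt"

type Point struct{ X, Y float64 }

type Region struct{ Vertices []Point }

// Contains reports whether p lies inside the polygon using the
// even-odd ray-casting rule: cast a ray from p and count edge crossings.
func (r Region) Contains(p Point) bool {
	in := false
	n := len(r.Vertices)
	for i, j := 0, n-1; i < n; j, i = i, i+1 {
		a, b := r.Vertices[i], r.Vertices[j]
		// Does edge (a, b) straddle p's horizontal line, and does it
		// cross to the right of p?
		if (a.Y > p.Y) != (b.Y > p.Y) &&
			p.X < (b.X-a.X)*(p.Y-a.Y)/(b.Y-a.Y)+a.X {
			in = !in
		}
	}
	return in
}

func main() {
	square := Region{Vertices: []Point{{0, 0}, {4, 0}, {4, 4}, {0, 4}}}
	fmt.Println(square.Contains(Point{2, 2})) // true
	fmt.Println(square.Contains(Point{5, 2})) // false
}
```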

type Row

type Row struct {
	Field string
	ID    uint64

	// Time is the timestamp for the column in Pilosa which is the intersection of
	// this row and the Column in the PilosaRecord which holds this row.
	Time time.Time
}

Row represents a column to set in Pilosa sans column id (which is held by the PilosaRecord containing the Row).

type S

type S string

type Source

type Source interface {
	Record() (interface{}, error)
}

Source is the interface for getting raw data one record at a time. Implementations of Source should be thread safe.

type SparseIntMapper

type SparseIntMapper struct {
	Min int64
	Max int64
	Map map[int64]int64
	// contains filtered or unexported fields
}

SparseIntMapper is a Mapper for integer types, mapping only relevant ints

func (SparseIntMapper) ID

func (m SparseIntMapper) ID(ii ...interface{}) (rowIDs []int64, err error)

ID maps arbitrary ints to a rowID range

type Statter

type Statter interface {
	Count(name string, value int64, rate float64, tags ...string)
	Gauge(name string, value float64, rate float64, tags ...string)
	Histogram(name string, value float64, rate float64, tags ...string)
	Set(name string, value string, rate float64, tags ...string)
	Timing(name string, value time.Duration, rate float64, tags ...string)
}

Statter is the interface that stats collectors must implement to get stats out of the PDK.
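To show the interface's shape, here is a hypothetical Statter that accumulates counts in an in-memory map and ignores the other metric kinds, rates, and tags (a sketch only; real collectors like the termstat package do more):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// MapStatter is a hypothetical Statter that accumulates counts in memory.
type MapStatter struct {
	mu     sync.Mutex
	counts map[string]int64
}

func NewMapStatter() *MapStatter {
	return &MapStatter{counts: map[string]int64{}}
}

// Count accumulates value under name; rate and tags are ignored here.
func (s *MapStatter) Count(name string, value int64, rate float64, tags ...string) {
	s.mu.Lock()
	s.counts[name] += value
	s.mu.Unlock()
}

// The remaining methods are no-ops, just enough to satisfy the interface.
func (s *MapStatter) Gauge(name string, value float64, rate float64, tags ...string)     {}
func (s *MapStatter) Histogram(name string, value float64, rate float64, tags ...string) {}
func (s *MapStatter) Set(name string, value string, rate float64, tags ...string)        {}
func (s *MapStatter) Timing(name string, value time.Duration, rate float64, tags ...string) {
}

func main() {
	st := NewMapStatter()
	st.Count("records", 1, 1.0)
	st.Count("records", 1, 1.0)
	fmt.Println(st.counts["records"]) // 2
}
```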

type StdLogger

type StdLogger struct {
	*log.Logger
}

StdLogger only prints on Printf.

func (StdLogger) Debugf

func (StdLogger) Debugf(format string, v ...interface{})

Debugf implements Logger interface, but prints nothing.

func (StdLogger) Printf

func (s StdLogger) Printf(format string, v ...interface{})

Printf implements Logger interface.

type StringContainsMapper

type StringContainsMapper struct {
	Matches []string // slice of strings to check for containment
}

StringContainsMapper is a Mapper for string types which checks the input for containment of each of the strings in Matches.

type StringMatchesMapper

type StringMatchesMapper struct {
	Matches []string // slice of strings to check for match
}

StringMatchesMapper is a Mapper for string types which checks the input for an exact match against each of the strings in Matches.

type StringParser

type StringParser struct {
}

StringParser is a parser for string types

func (StringParser) Parse

func (p StringParser) Parse(field string) (result interface{}, err error)

Parse is an identity parser for strings

type SubjectFunc

type SubjectFunc func(d interface{}) (string, error)

SubjectFunc is a wrapper like http.HandlerFunc which allows you to use a bare func as a Subjecter.

func (SubjectFunc) Subject

func (s SubjectFunc) Subject(d interface{}) (string, error)

Subject implements Subjecter.
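The http.HandlerFunc analogy can be made concrete with local stand-ins mirroring the documented Subjecter and SubjectFunc:

```go
package main

import "fmt"

// Local stand-ins mirroring the documented Subjecter and SubjectFunc.
type Subjecter interface {
	Subject(d interface{}) (string, error)
}

type SubjectFunc func(d interface{}) (string, error)

// Subject satisfies Subjecter by calling the wrapped function.
func (s SubjectFunc) Subject(d interface{}) (string, error) { return s(d) }

func main() {
	// Wrap a bare func so it satisfies Subjecter, just like http.HandlerFunc.
	var sub Subjecter = SubjectFunc(func(d interface{}) (string, error) {
		m := d.(map[string]interface{})
		return fmt.Sprintf("user-%v", m["id"]), nil
	})
	subj, _ := sub.Subject(map[string]interface{}{"id": 7})
	fmt.Println(subj) // user-7
}
```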

type SubjectPath

type SubjectPath []string

SubjectPath is an EntitySubjecter which extracts a subject by walking the Entity properties denoted by the strings in SubjectPath.

func (SubjectPath) Subject

func (p SubjectPath) Subject(e *Entity) (string, error)

Subject implements EntitySubjecter.

type Subjecter

type Subjecter interface {
	Subject(d interface{}) (string, error)
}

Subjecter is an interface for getting the Subject of a record.

type Time

type Time time.Time

type TimeOfDayMapper

type TimeOfDayMapper struct {
	Res int64
}

TimeOfDayMapper is a Mapper for timestamps, mapping the time-of-day component only.

TODO: consider putting all time buckets in the same field. Pros: a single field. Cons: would have to abandon the simple ID interface, and a single field may not be a good thing anyway.

func (TimeOfDayMapper) ID

func (m TimeOfDayMapper) ID(ti ...interface{}) (rowIDs []int64, err error)

ID maps a timestamp to a time of day bucket

type TimeParser

type TimeParser struct {
	Layout string
}

TimeParser is a parser for timestamps

func (TimeParser) Parse

func (p TimeParser) Parse(field string) (result interface{}, err error)

Parse parses a timestamp string to a time.Time value

type Transformer

type Transformer interface {
	Transform(e *Entity) error
}

Transformer is an interface for something which performs an in-place transformation on an Entity. It might enrich the entity by adding new fields, delete existing fields that don't need to be indexed, or change fields.

type TransformerFunc

type TransformerFunc func(*Entity) error

TransformerFunc can be wrapped around a function to make it implement the Transformer interface. Similar to http.HandlerFunc.

func (TransformerFunc) Transform

func (t TransformerFunc) Transform(e *Entity) error

Transform implements Transformer for TransformerFunc
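A sketch of the wrapping pattern, using a deliberately simplified local Entity stand-in (the real pdk.Entity is richer):

```go
package main

import "fmt"

// Simplified local stand-in for pdk.Entity, used only to show the
// TransformerFunc wrapping pattern.
type Entity struct {
	Objects map[string]interface{}
}

type Transformer interface {
	Transform(e *Entity) error
}

type TransformerFunc func(*Entity) error

// Transform satisfies Transformer by calling the wrapped function.
func (t TransformerFunc) Transform(e *Entity) error { return t(e) }

func main() {
	// A bare func that drops a field we don't want indexed,
	// wrapped so it satisfies Transformer.
	var tr Transformer = TransformerFunc(func(e *Entity) error {
		delete(e.Objects, "password")
		return nil
	})
	e := &Entity{Objects: map[string]interface{}{"name": "ann", "password": "x"}}
	if err := tr.Transform(e); err != nil {
		panic(err)
	}
	fmt.Println(len(e.Objects)) // 1
}
```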

type Translator

type Translator interface {
	Get(field string, id uint64) (interface{}, error)
	GetID(field string, val interface{}) (uint64, error)
}

Translator describes the functionality for mapping arbitrary values in a given Pilosa field to row ids and back. Implementations should be threadsafe and generate ids monotonically.
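A hypothetical in-memory Translator shows what "monotonic, thread-safe, bidirectional" means in practice: per-field maps in both directions, a per-field counter, and a mutex (the boltdb subpackage provides a persistent implementation; this sketch is not it):

```go
package main

import (
	"fmt"
	"sync"
)

// MemTranslator is a hypothetical in-memory Translator: it hands out
// monotonically increasing ids per field and remembers the mapping both ways.
type MemTranslator struct {
	mu   sync.Mutex
	fwd  map[string]map[interface{}]uint64
	rev  map[string]map[uint64]interface{}
	next map[string]uint64
}

func NewMemTranslator() *MemTranslator {
	return &MemTranslator{
		fwd:  map[string]map[interface{}]uint64{},
		rev:  map[string]map[uint64]interface{}{},
		next: map[string]uint64{},
	}
}

// GetID returns the existing id for val in field, or allocates the next one.
func (t *MemTranslator) GetID(field string, val interface{}) (uint64, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.fwd[field] == nil {
		t.fwd[field] = map[interface{}]uint64{}
		t.rev[field] = map[uint64]interface{}{}
	}
	if id, ok := t.fwd[field][val]; ok {
		return id, nil
	}
	id := t.next[field]
	t.next[field]++
	t.fwd[field][val] = id
	t.rev[field][id] = val
	return id, nil
}

// Get returns the value previously mapped to id in field.
func (t *MemTranslator) Get(field string, id uint64) (interface{}, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	val, ok := t.rev[field][id]
	if !ok {
		return nil, fmt.Errorf("no value for field %q id %d", field, id)
	}
	return val, nil
}

func main() {
	tr := NewMemTranslator()
	a, _ := tr.GetID("color", "red")
	b, _ := tr.GetID("color", "blue")
	c, _ := tr.GetID("color", "red") // same value, same id
	fmt.Println(a, b, c)             // 0 1 0
}
```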

type U

type U uint

func (U) MarshalJSON

func (U U) MarshalJSON() ([]byte, error)

type U16

type U16 uint16

func (U16) MarshalJSON

func (U U16) MarshalJSON() ([]byte, error)

type U32

type U32 uint32

func (U32) MarshalJSON

func (U U32) MarshalJSON() ([]byte, error)

type U64

type U64 uint64

func (U64) MarshalJSON

func (U U64) MarshalJSON() ([]byte, error)

type U8

type U8 uint8

func (U8) MarshalJSON

func (U U8) MarshalJSON() ([]byte, error)

type Val

type Val struct {
	Field string
	Value int64
}

Val represents a BSI value to set in a Pilosa field sans column id (which is held by the PilosaRecord containing the Val).

type VerboseLogger

type VerboseLogger struct {
	*log.Logger
}

VerboseLogger prints on both Printf and Debugf.

func (VerboseLogger) Debugf

func (s VerboseLogger) Debugf(format string, v ...interface{})

Debugf implements Logger interface.

func (VerboseLogger) Printf

func (s VerboseLogger) Printf(format string, v ...interface{})

Printf implements Logger interface.

type YearMapper

type YearMapper struct {
	MinYear int64 // TODO? use this to eliminate empty rows for year < 2000 or whatever
}

YearMapper is a Mapper for timestamps, mapping the year only

func (YearMapper) ID

func (m YearMapper) ID(ti ...interface{}) (rowIDs []int64, err error)

ID maps a timestamp to a year bucket

Directories

Path — Synopsis
aws/s3
boltdb — Package boltdb provides a pdk.Translator implementation using boltdb.
cmd/pdk
gen
termstat — Package termstat provides a stats implementation which periodically logs the statistics to the given writer.
usecase/gen
usecase/ssb
