frames

package module
v0.6.8-v0.9.11 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 29, 2019 License: Apache-2.0 Imports: 19 Imported by: 1

README

V3IO Frames

GoDoc License

V3IO Frames ("Frames") is a multi-model open-source data-access library, developed by Iguazio, which provides a unified high-performance DataFrame API for working with data in the data store of the Iguazio Data Science Platform ("the platform").

In This Document

Client Python API Reference

Overview

Initialization

To use Frames, you first need to import the v3io_frames Python library. For example:

import v3io_frames as v3f

Then, you need to create and initialize an instance of the Client class; see Client Constructor. You can then use the client methods to perform different data operations on the supported backend types.

Backend Types

All Frames client methods receive a backend parameter for setting the Frames backend type. Frames supports the following backend types:

  • kv — a platform NoSQL (key/value) table.
  • stream — a platform data stream.
  • tsdb — a time-series database (TSDB).
  • csv — a comma-separated-value (CSV) file. This backend type is used only for testing purposes.

Client Methods

The Client class features the following methods for supporting basic data operations:

  • create — creates a new TSDB table or stream ("backend data").
  • delete — deletes a table or stream or specific table items.
  • read — reads data from a table or stream into pandas DataFrames.
  • write — writes data from pandas DataFrames to a table or stream.
  • execute — executes a backend-specific command on a table or stream. Each backend may support multiple commands.

Note: Some methods or method parameters are backend-specific, as detailed in this reference.

User Authentication

When creating a Frames client, you must provide valid platform credentials for accessing the backend data, which Frames will use to identify the identity of the user. This can be done by using any of the following alternative methods (documented in order of precedence):

  • Provide the authentication credentials in the Client constructor parameters by using either of the following methods:

    • Set the token constructor parameter to a valid platform access key with the required data-access permissions. You can get the access key from the Access Keys window that's available from the user-profile menu of the platform dashboard, or by copying the value of the V3IO_ACCESS_KEY environment variable in a platform web-shell or Jupyter Notebook service.
    • Set the user and password constructor parameters to the username and password of a platform user with the required data-access permissions.

    Note: You cannot use both methods concurrently: setting both the token and user and password parameters in the same constructor call will produce an error.

  • Set the authentication credentials in environment variables, by using either of the following methods:

    • Set the V3IO_ACCESS_KEY environment variable to a valid platform access key with the required data-access permissions.

      Note: The platform's Jupyter Notebook service automatically defines the V3IO_ACCESS_KEY environment variable and initializes it to a valid access key for the running user of the service.

    • Set the V3IO_USERNAME and V3IO_PASSWORD environment variables to the username and password of a platform user with the required data-access permissions.

    Note:

    • When the client constructor is called with authentication parameters (option #1), the authentication-credentials environment variables (if defined) are ignored.
    • When V3IO_ACCESS_KEY is defined, V3IO_USERNAME and V3IO_PASSWORD are ignored.

Client Constructor

All Frames operations are executed via an object of the Client class.

Syntax
Client(address=""[, data_url=""], container=""[, user="", password="", token=""])

Parameters and Data Members
  • address — The address of the Frames service (framesd).
    When running locally on the platform (for example, from a Jupyter Notebook service), set this parameter to framesd:8081 to use the gRPC (recommended) or to framesd:8080 to use HTTP.
    When connecting to the platform remotely, set this parameter to the API address of a Frames platform service in the parent tenant. You can copy this address from the API column of the V3IO Frames service on the Services platform dashboard page.

    • Type: str
    • Requirement: Required
  • data_url — A web-API base URL for accessing the backend data. By default, the client uses the data URL that's configured for the Frames service, which is typically the HTTPS URL of the web-APIs service of the parent platform tenant.

    • Type: str
    • Requirement: Optional
  • container — The name of the platform data container that contains the backend data. For example, "bigdata" or "users".

    • Type: str
    • Requirement: Required
  • user — The username of a platform user with permissions to access the backend data. See User Authentication.

    • Type: str
    • Requirement: Required when neither the token parameter or the authentication environment variables are set.
      When the user parameter is set, the password parameter must also be set to a matching user password.
  • password — A platform password for the user configured in the user parameter. See User Authentication.

    • Type: str
    • Requirement: Required when the user parameter is set.
  • token — A valid platform access key that allows access to the backend data. See User Authentication.

    • Type: str
    • Requirement: Required when neither the user or password parameters or the authentication environment variables are set.

Return Value

Returns a new Frames Client data object.

Examples

The following examples, for local platform execution, both create a Frames client for accessing data in the "users" container by using the authentication credentials of user "iguazio"; the first example uses access-key authentication while the second example uses username and password authentication (see User Authentication):

import v3io_frames as v3f
client = v3f.Client("framesd:8081", token="e8bd4ca2-537b-4175-bf01-8c74963e90bf", container="users")
import v3io_frames as v3f
client = v3f.Client("framesd:8081", user="iguazio", password="mypass", container="users")

Common Client Method Parameters

All client methods receive the following common parameters; additional, method-specific parameters are described for each method.

  • backend — The backend data type for the operation. See Backend Types.

    • Type: str
    • Requirement: Required
    • Valid Values: "kv" | "stream" | "tsdb" | "csv" (for testing)
  • table — The relative path to the backend data — a directory in the target platform data container (as configured for the client object) that represents a TSDB or NoSQL table or a data stream. For example, "mytable" or "examples/tsdb/my_metrics".

    • Type: str
    • Requirement: Required unless otherwise specified in the method-specific documentation

create Method

Creates a new TSDB table or stream in a platform data container, according to the specified backend type.

The create method is supported by the tsdb and stream backends, but not by the kv backend, because NoSQL tables in the platform don't need to be created prior to ingestion; when ingesting data into a table that doesn't exist, the table is automatically created.

Syntax
create(backend, table, attrs=None, schema=None, if_exists=FAIL)

Common create Parameters

All Frames backends that support the create method support the following common parameters:

  • attrs — A dictionary of <argument name>: <value> pairs for passing additional backend-specific parameters (arguments).

tsdb Backend create Parameters

The following create parameters are specific to the tsdb backend and are passed via the method's attrs parameter; for more information about these parameters, see the V3IO TSDB documentation:

  • rate — The ingestion rate of the TSDB metric samples. It's recommended that you set the rate to the average expected ingestion rate, and that the ingestion rates for a given TSDB table don't vary significantly; when there's a big difference in the ingestion rates (for example, x10), use separate TSDB tables.

    • Type: str
    • Requirement: Required
    • Valid Values: A string of the format "[0-9]+/[smh]" — where 's' = seconds, 'm' = minutes, and 'h' = hours. For example, "1/s" (one sample per minute), "20/m" (20 samples per minute), or "50/h" (50 samples per hour).
  • aggregates — A list of aggregation functions for executing in real time during the samples ingestion ("pre-aggregation").

    • Type: str
    • Requirement: Optional
    • Valid Values: A string containing a comma-separated list of supported aggregation functions — avg| count| last| max| min| rate| stddev| stdvar| sum. For example, "count,avg,min,max".
  • aggregation-granularity — Aggregation granularity; i.e., a time interval for applying pre-aggregation functions, if configured in the aggregates parameter.

    • Type: str
    • Requirement: Optional
    • Valid Values: A string of the format "[0-9]+[mhd]" — where 'm' = minutes, 'h' = hours, and 'd' = days. For example, "30m" (30 minutes), "2h" (2 hours), or "1d" (1 day).
    • Default Value: "1h" (1 hour)

stream Backend create Parameters

The following create parameters are specific to the stream backend and are passed via the method's attrs parameter; for more information about these parameters, see the platform's streams documentation:

  • shards — The number of stream shards to create.

    • Type: int
    • Requirement: Optional
    • Default Value: 1
    • Valid Values: A positive integer (>= 1). For example, 100.
  • retention_hours — The stream's retention period, in hours.

    • Type: int
    • Requirement: Optional
    • Default Value: 24
    • Valid Values: A positive integer (>= 1). For example, 2 (2 hours).

create Examples

tsdb Backend
  • Create a TSDB table named "mytable" in the root directory of the client's data container with an ingestion rate of 10 samples per minute:

    client.create("tsdb", "/mytable", attrs={"rate": "10/m"})
    
  • Create a TSDB table named "my_metrics" in a tsdb directory in the client's data container with an ingestion rate of 1 sample per second. The table is created with the count, avg, min, and max aggregates and an aggregation granularity of 1 hour:

    client.create("tsdb", "/tsdb/my_metrics", attrs={"rate": "1/s", "aggregates": "count,avg,min,max", "aggregation-granularity": "1h"})
    

stream Backend
  • Create a stream named "mystream" in the root directory of the client's data container. The stream has 6 shards and a retention period of 1 hour (default):

    client.create("stream", "/mystream", attrs={"shards": 6})
    
  • Create a stream named "stream1" in a "my_streams" directory in the client's data container. The stream has 24 shards (default) and a retention period of 2 hours:

    client.create("stream", "my_streams/stream1", attrs={"retention_hours": 2})
    

write Method

Writes data from a DataFrame to a table or stream in a platform data container, according to the specified backend type.

Syntax
write(backend, table, dfs, expression='', condition='', labels=None,
    max_in_message=0, index_cols=None, partition_keys=None)

Note: The expression and partition_keys parameters aren't supported in the current release.

Common write Parameters

All Frames backends that support the write method support the following common parameters:

  • dfs (Required) — A single DataFrame, a list of DataFrames, or a DataFrames iterator — One or more DataFrames containing the data to write. (See the tsdb backend-specific parameters.)
  • index_cols (Optional) (default: None) — []str — A list of column (attribute) names to be used as index columns for the write operation, regardless of any index-column definitions in the DataFrame. By default, the DataFrame's index columns are used.

    Note: The significance and supported number of index columns is backend specific. For example, the kv backend supports only a single index column for the primary-key item attribute, while the tsdb backend supports additional index columns for metric labels.

  • labels (Optional) — This parameter is currently applicable only to the tsdb backend (although it's available for all backends) and is therefore documented as part of the write method's tsdb backend parameters.
  • max_in_message (Optional) (default: 0)

kv Backend write Parameters

The following write parameters are specific to the kv backend; for more information about these parameters, see the platform's NoSQL documentation:

  • condition (Optional) (default: None) — A platform condition expression that defines conditions for performing the write operation. For detailed information about platform condition expressions, see the platform documentation.

tsdb Backend write Parameters

The following write parameter descriptions are specific to the tsdb backend; for more information about these parameters, see the V3IO TSDB documentation:

  • dfs (Required) — A single DataFrame, a list of DataFrames, or a DataFrames iterator — One or more DataFrames containing the data to write. This is a common write parameter, but the following information is specific to the tsdb backend:

    • You must define one or more non-index DataFrame columns that represent the sample metrics; the name of the column is the metric name and its values is the sample data (i.e., the ingested metric).
    • You must define a single index column whose value is the sample time of the data. This column serves as the table's primary-key attribute. Note that a TSDB DataFrame cannot have more than one index column of a time data type.
    • You can optionally define string index columns that represent metric labels for the current DataFrame row. Note that you can also define labels for all DataFrame rows by using the labels parameter (in addition or instead of using column indexes to apply labels to a specific row).
  • labels (Optional) (default: None) — dict — A dictionary of metric labels of the format {<label>: <value>[, <label>: <value>, ...]}, which will be applied to all the DataFrame rows (i.e., to all the ingested metric samples). For example, {"os": "linux", "arch": "x86"}. Note that you can also define labels for a specific DataFrame row by adding a string index column to the row (in addition or instead of using the labels parameter to define labels for all rows), as explained in the description of the dfs parameter.

write Examples

kv Backend
data = [["tom", 10, "TLV"], ["nick", 15, "Berlin"], ["juli", 14, "NY"]]
df = pd.DataFrame(data, columns = ["name", "age", "city"])
df.set_index("name", inplace=True)
client.write(backend="kv", table="mytable", dfs=df, condition="age>14")

read Method

Reads data from a table or stream in a platform data container to a DataFrame, according to the configured backend.

Reads data from a backend.

Syntax
read(backend='', table='', query='', columns=None, filter='', group_by='',
    limit=0, data_format='', row_layout=False, max_in_message=0, marker='',
    iterator=False, **kw)

Note: The limit, data_format, row_layout, and marker parameters aren't supported in the current release.

Common read Parameters

All Frames backends that support the read method support the following common parameters:

  • iterator — (Optional) (default: False) — boolTrue to return a DataFrames iterator; False to return a single DataFrame.

  • filter (Optional) — str — A query filter. For example, filter="col1=='my_value'".
    This parameter cannot be used concurrently with the query parameter of the tsdb backend.

  • columns[]str — A list of attributes (columns) to return.
    This parameter cannot be used concurrently with the query parameter of the tsdb backend.

kv Backend read Parameters

The following read parameters are specific to the kv backend; for more information about these parameters, see the platform's NoSQL documentation:

  • max_in_messageint — The maximum number of rows per message.

The following parameters are passed as keyword arguments via the kw parameter:

  • reset_indexbool — Determines whether to reset the index index column of the returned DataFrame: True — reset the index column by setting it to the auto-generated pandas range-index column; False (default) — set the index column to the table's primary-key attribute.

  • sharding_keys[]string — A list of specific sharding keys to query, for range-scan formatted tables only.

tsdb Backend read Parameters

The following read parameters are specific to the tsdb backend; for more information about these parameters, see the V3IO TSDB documentation:

  • group_by (Optional) — str — A group-by query string.
    This parameter cannot be used concurrently with the query parameter.

  • query (Optional) — str — A query string in SQL format.

    Note:

    • When setting the query parameter, you must provide the path to the TSDB table as part of the FROM caluse in the query string and not in the read method's table parameter.
    • This parameter cannot be set concurrently with the following parameters: aggregators, columns, filter, or group_by parameters.

The following parameters are passed as keyword arguments via the kw parameter:

  • startstr — Start (minimum) time for the read operation, as a string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and 'd' = days), or 0 for the earliest time. For example: "2016-01-02T15:34:26Z"; "1451748866"; "now-90m"; "0".
    The default start time is <end time> - 1h.

  • endstr — End (maximum) time for the read operation, as a string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and 'd' = days), or 0 for the earliest time. For example: "2018-09-26T14:10:20Z"; "1537971006000"; "now-3h"; "now-7d".
    The default end time is "now".

  • step (Optional) — str — The query step (interval), which determines the points over the query's time range at which to perform aggregations (for an aggregation query) or downsample the data (for a query without aggregators). The default step is the query's time range, which can be configured via the start and end parameters.

  • aggregators (Optional) — str — Aggregation information to return, as a comma-separated list of supported aggregation functions ("aggregators"). The following aggregation functions are supported for over-time aggregation (across each unique label set); for cross-series aggregation (across all metric labels), add "_all" to the end of the function name:
    avg | count | last | max | min | rate | stddev | stdvar | sum
    This parameter cannot be used concurrently with the query parameter.

  • aggregationWindow (Optional) — str — Aggregation interval for applying over-time aggregation functions, if set in the aggregators or query parameters, as a string of the format "[0-9]+[mhd]" where 'm' = minutes, 'h' = hours, and 'd' = days. The default aggregation window is the query's aggregation step. When using the default aggregation window, the aggregation window starts at the aggregation step; when the aggregationWindow parameter is set, the aggregation window ends at the aggregation step.

  • multi_index (Optional) — boolTrue to receive the read results as multi-index DataFrames where the labels are used as index columns in addition to the metric sample-time primary-key attribute; False (default) only the timestamp will function as the index.

stream Backend read Parameters

The following read parameters are specific to the stream backend and are passed as keyword arguments via the kw parameter; for more information about these parameters, see the platform's streams documentation:

  • seekstr (Required) — Seek type. Valid values: "time" | "seq" | "sequence" | "latest" | "earliest".
    When the "seq" or "sequence" seek type is set, you must set the sequence parameter to the desired record sequence number.
    When the time seek type is set, you must set the start parameter to the desired seek start time.

  • shard_idstr (Required) The ID of the stream shard from which to read. Valid values: "0" ... "<stream shard count> - 1".

  • sequenceint64 (Required when seek = "sequence") — The sequence number of the record from which to start reading.

  • startstr (Required when seek = "time") — The earliest record ingestion time from which to start reading, as a string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and 'd' = days), or 0 for the earliest time. For example: "2016-01-02T15:34:26Z"; "1451748866"; "now-90m"; "0".

Return Value
  • When the value of the iterator parameter is False (default) — returns a single DataFrame.
  • When the value of the iterator parameter is True — returns a DataFrames iterator.

read Examples

kv Backend
df = client.read(backend="kv", table="mytable", filter="col1>666")

tsdb Backend
df = client.read(backend="tsdb", query="select avg(cpu) as cpu, avg(diskio), avg(network) from mytable where os='win'", start="now-1d", end="now", step="2h")

stream Backend
df = client.read(backend="stream", table="mytable", seek="latest", shard_id="5")

delete Method

Deletes a table or stream or specific table items from a platform data container, according to the specified backend type.

Syntax
delete(backend, table, filter='', start='', end='', if_missing=FAIL

kv Backend delete Parameters
  • filter — A platform filter expression that identifies specific items to delete. For detailed information about platform filter expressions, see the platform documentation.

    Note: When the filter parameter isn't set, the entire table and its schema file (.#schema) are deleted.

    • Type: str
    • Requirement: Optional
    • Default Value: ""

tsdb Backend delete Parameters

The following delete parameters are specific to the tsdb backend; for more information about these parameters, see the V3IO TSDB documentation:

  • start — Start (minimum) time for the delete operation — i.e., delete only items whose data sample time is at or after (>=) the specified start time.

    • Type: str
    • Requirement: Optional
    • Valid Values: A string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and 'd' = days), or 0 for the earliest time. For example: "2016-01-02T15:34:26Z"; "1451748866"; "now-90m"; "0".
    • Default Value: "" when neither start nor end are set — to delete the entire table and its schema file (.schema) — and 0 when end is set
  • endstr — End (maximum) time for the delete operation — i.e., delete only items whose data sample time is before or at (<=) the specified end time.

    • Type: str
    • Requirement: Optional
    • Valid Values: A string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and 'd' = days), or 0 for the earliest time. For example: "2018-09-26T14:10:20Z"; "1537971006000"; "now-3h"; "now-7d".
    • Default Value: "" when neither start nor end are set — to delete the entire table and its schema file (.schema) — and 0 when start is set

Note:

  • When neither the start nor end parameters are set, the entire TSDB table and its schema file are deleted.
  • Only full table partitions within the specified time frame (as determined by the start and end parameters) are deleted. Items within the specified time frames that reside within partitions that begin before the delete start time or end after the delete end time aren't deleted. The partition interval is calculated automatically based on the table's ingestion rate and is stored in the TSDB's partitionerInterval schema field (see the .schema file).

delete Examples

kv Backend
client.delete(backend="kv", table="mytable", filter="age > 40")

tsdb Backend
client.delete(backend="tsdb", table="mytable", start="now-1d", end="now-5h")

stream Backend
client.delete(backend="stream", table="mystream")

execute Method

Extends the basic CRUD functionality of the other client methods via backend-specific commands.

Note: Currently, no execute commands are available for the tsdb backend.

Syntax
execute(backend, table, command="", args=None)

Common execute Parameters

All Frames backends that support the execute method support the following common parameters:

  • args — A dictionary of <argument name>: <value> pairs for passing command-specific parameters (arguments).

    • Type: dict
    • Requirement: Optional
    • Default Value: None

kv Backend execute Commands
  • infer | inferschema — Infers the data schema of a given NoSQL table and creates a schema file for the table.

    Example:

    client.execute(backend="kv", table="mytable", command="infer")
    

stream Backend execute Commands
  • put — Adds records to a stream.

    Example:

    client.execute(backend="stream", table="mystream", command="put", args={"data": "this a record", "clientinfo": "some_info", "partition": "partition_key"})
    

Contributing

To contribute to V3IO Frames, you need to be aware of the following:

Components

The following components are required for building Frames code:

  • Go server with support for both the gRPC and HTTP protocols
  • Go client
  • Python client

Development

The core is written in Go. The development is done on the development branch and then released to the master branch.

Before submitting changes, test the code:

  • To execute the Go tests, run make test.
  • To execute the Python tests, run make test-python.

Adding and Changing Dependencies
  • If you add Go dependencies, run make update-go-deps.
  • If you add Python dependencies, update clients/py/Pipfile and run make update-py-deps.

Travis CI

Integration tests are run on Travis CI. See .travis.yml for details.

The following environment variables are defined in the Travis settings:

  • Docker Container Registry (Quay.io)
    • DOCKER_PASSWORD — a password for pushing images to Quay.io.
    • DOCKER_USERNAME — a username for pushing images to Quay.io.
  • Python Package Index (PyPI)
    • V3IO_PYPI_PASSWORD — a password for pushing a new release to PyPi.
    • V3IO_PYPI_USER — a username for pushing a new release to PyPi.
  • Iguazio Data Science Platform
    • V3IO_SESSION — a JSON encoded map with session information for running tests. For example:

      '{"url":"45.39.128.5:8081","container":"mitzi","user":"daffy","password":"rabbit season"}'
      

      Note: Make sure to embed the JSON object within single quotes ('{...}').

Docker Image

Building the Image

Use the following command to build the Docker image:

make build-docker

Running the Image

Use the following command to run the Docker image:

docker run \
	-v /path/to/config.yaml:/etc/framesd.yaml \
	quay.io/v3io/frames:unstable

LICENSE

Apache 2

Documentation

Overview

Package frames provides an efficient way of moving data from various sources.

The package is composed os a HTTP web server that can serve data from various sources and from clients in Go and in Python.

Index

Constants

View Source
const (
	IgnoreError = pb.ErrorOptions_IGNORE
	FailOnError = pb.ErrorOptions_FAIL
)

Shortcut for fail/ignore

Variables

View Source
var (
	BoolType   = DType(pb.DType_BOOLEAN)
	FloatType  = DType(pb.DType_FLOAT)
	IntType    = DType(pb.DType_INTEGER)
	StringType = DType(pb.DType_STRING)
	TimeType   = DType(pb.DType_TIME)
)

Possible data types

View Source
var (
	// DefaultLogLevel is the default log verbosity
	DefaultLogLevel string
)
View Source
var ZeroTime time.Time

ZeroTime is zero value for time

Functions

func InitBackendDefaults added in v0.3.0

func InitBackendDefaults(cfg *BackendConfig, framesConfig *Config)

InitBackendDefaults initializes default configuration for backend

func MarshalFrame

func MarshalFrame(frame Frame) ([]byte, error)

MarshalFrame serializes a frame to []byte

func NewLogger

func NewLogger(verbose string) (logger.Logger, error)

NewLogger returns a new logger

func SessionFromEnv

func SessionFromEnv() (*pb.Session, error)

SessionFromEnv return a session from V3IO_SESSION environment variable (JSON encoded)

Types

type BackendConfig

type BackendConfig struct {
	Type    string `json:"type"` // v3io, csv, ...
	Name    string `json:"name"`
	Workers int    `json:"workers"`
	// backend specific options
	Options map[string]interface{} `json:"options"`

	// CSV backend
	RootDir string `json:"rootdir,omitempty"`
}

BackendConfig is default backend configuration

type Client

type Client interface {
	// Read reads data from server
	Read(request *pb.ReadRequest) (FrameIterator, error)
	// Write writes data to server
	Write(request *WriteRequest) (FrameAppender, error)
	// Create creates a table
	Create(request *pb.CreateRequest) error
	// Delete deletes data or table
	Delete(request *pb.DeleteRequest) error
	// Exec executes a command on the backend
	Exec(request *pb.ExecRequest) (Frame, error)
}

Client interface

type Column

type Column interface {
	Len() int                                 // Number of elements
	Name() string                             // Column name
	DType() DType                             // Data type (e.g. IntType, FloatType ...)
	Ints() ([]int64, error)                   // Data as []int64
	IntAt(i int) (int64, error)               // Int value at index i
	Floats() ([]float64, error)               // Data as []float64
	FloatAt(i int) (float64, error)           // Float value at index i
	Strings() []string                        // Data as []string
	StringAt(i int) (string, error)           // String value at index i
	Times() ([]time.Time, error)              // Data as []time.Time
	TimeAt(i int) (time.Time, error)          // time.Time value at index i
	Bools() ([]bool, error)                   // Data as []bool
	BoolAt(i int) (bool, error)               // bool value at index i
	Slice(start int, end int) (Column, error) // Slice of data
	CopyWithName(newName string) Column       // Create a copy of the current column
}

Column is a data column

func NewLabelColumn

func NewLabelColumn(name string, value interface{}, size int) (Column, error)

NewLabelColumn returns a new slabel column

func NewSliceColumn

func NewSliceColumn(name string, data interface{}) (Column, error)

NewSliceColumn returns a new slice column

type ColumnBuilder

type ColumnBuilder interface {
	Append(value interface{}) error
	At(index int) (interface{}, error)
	Set(index int, value interface{}) error
	Delete(index int) error
	Finish() Column
}

ColumnBuilder is interface for building columns

func NewLabelColumnBuilder

func NewLabelColumnBuilder(name string, dtype DType, size int) ColumnBuilder

NewLabelColumnBuilder return a builder for LabelColumn

func NewSliceColumnBuilder

func NewSliceColumnBuilder(name string, dtype DType, size int) ColumnBuilder

NewSliceColumnBuilder return a builder for SliceColumn

type Config

type Config struct {
	Log            LogConfig `json:"log"`
	DefaultLimit   int       `json:"limit,omitempty"`
	DefaultTimeout int       `json:"timeout,omitempty"`

	// default V3IO connection details
	WebAPIEndpoint string `json:"webApiEndpoint"`
	Container      string `json:"container"`
	Username       string `json:"username,omitempty"`
	Password       string `json:"password,omitempty"`
	SessionKey     string `json:"sessionKey,omitempty"`

	// Number of parallel V3IO worker routines
	Workers int `json:"workers"`

	QuerierCacheSize int `json:"querierCacheSize"`

	Backends []*BackendConfig `json:"backends,omitempty"`

	DisableProfiling bool `json:"disableProfiling,omitempty"`
}

Config is server configuration

func (*Config) InitDefaults

func (c *Config) InitDefaults() error

InitDefaults initializes the defaults for configuration

func (*Config) Validate

func (c *Config) Validate() error

Validate validates the configuration

type CreateRequest

type CreateRequest struct {
	Proto    *pb.CreateRequest
	Password SecretString
	Token    SecretString
}

CreateRequest is a table creation request

type DType

type DType pb.DType

DType is data type

type DataBackend

type DataBackend interface {
	// TODO: Expose name, type, config ... ?
	Read(request *ReadRequest) (FrameIterator, error)
	Write(request *WriteRequest) (FrameAppender, error) // TODO: use Appender for write streaming
	Create(request *CreateRequest) error
	Delete(request *DeleteRequest) error
	Exec(request *ExecRequest) (Frame, error)
}

DataBackend is an interface for read/write on backend

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder is message decoder

func NewDecoder

func NewDecoder(r io.Reader) *Decoder

NewDecoder returns a new Decoder

func (*Decoder) Decode

func (d *Decoder) Decode(msg proto.Message) error

Decode decodes message from d.r

type DeleteRequest

type DeleteRequest struct {
	Proto    *pb.DeleteRequest
	Password SecretString
	Token    SecretString
}

DeleteRequest is a deletion request

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder is message encoder

func NewEncoder

func NewEncoder(w io.Writer) *Encoder

NewEncoder returns new Encoder

func (*Encoder) Encode

func (e *Encoder) Encode(msg proto.Message) error

Encode encoders the message to e.w

type ExecRequest

type ExecRequest struct {
	Proto    *pb.ExecRequest
	Password SecretString
	Token    SecretString
}

ExecRequest is execution request

type Frame

type Frame interface {
	Labels() map[string]interface{}          // Label set
	Names() []string                         // Column names
	Indices() []Column                       // Index columns
	Len() int                                // Number of rows
	Column(name string) (Column, error)      // Column by name
	Slice(start int, end int) (Frame, error) // Slice of Frame
	IterRows(includeIndex bool) RowIterator  // Iterate over rows
	IsNull(index int, colName string) bool
}

Frame is a collection of columns

func NewFrame

func NewFrame(columns []Column, indices []Column, labels map[string]interface{}) (Frame, error)

NewFrame returns a new Frame

func NewFrameFromMap

func NewFrameFromMap(columns map[string]interface{}, indices map[string]interface{}) (Frame, error)

NewFrameFromMap returns a new MapFrame from a map

func NewFrameFromProto

func NewFrameFromProto(msg *pb.Frame) Frame

NewFrameFromProto return a new frame from protobuf message

func NewFrameFromRows

func NewFrameFromRows(rows []map[string]interface{}, indices []string, labels map[string]interface{}) (Frame, error)

NewFrameFromRows creates a new frame from rows

func NewFrameWithNullValues

func NewFrameWithNullValues(columns []Column, indices []Column, labels map[string]interface{}, nullValues []*pb.NullValuesMap) (Frame, error)

func UnmarshalFrame

func UnmarshalFrame(data []byte) (Frame, error)

UnmarshalFrame de-serialize a frame from []byte

type FrameAppender

type FrameAppender interface {
	Add(frame Frame) error
	WaitForComplete(timeout time.Duration) error
}

FrameAppender appends frames

type FrameIterator

type FrameIterator interface {
	Next() bool
	Err() error
	At() Frame
}

FrameIterator iterates over frames

type JoinStruct

type JoinStruct = pb.JoinStruct

JoinStruct is a join structure

type LogConfig

type LogConfig struct {
	Level string `json:"level,omitempty"`
}

LogConfig is the logging configuration

type Query

type Query struct {
	Table   string
	Columns []string
	Filter  string
	GroupBy string
}

Query is query structure

func ParseSQL

func ParseSQL(sql string) (*Query, error)

ParseSQL parsers SQL query to a Query struct

type ReadRequest

type ReadRequest struct {
	Proto    *pb.ReadRequest
	Password SecretString
	Token    SecretString
}

ReadRequest is a read/query request

type RowIterator

type RowIterator interface {
	Next() bool                      // Advance to next row
	Row() map[string]interface{}     // Row as map of name->value
	RowNum() int                     // Current row number
	Indices() map[string]interface{} // MultiIndex as name->value
	Err() error                      // Iteration error
}

RowIterator is an iterator over frame rows

type SaveMode

type SaveMode int
const (
	ErrorIfTableExists SaveMode = iota
	OverwriteTable
	UpdateItem
	OverwriteItem
	CreateNewItemsOnly
)

func SaveModeFromString

func SaveModeFromString(mode string) (SaveMode, error)

func (SaveMode) GetNginxModeName

func (mode SaveMode) GetNginxModeName() string

func (SaveMode) String

func (mode SaveMode) String() string

type SchemaField

type SchemaField = pb.SchemaField

SchemaField represents a schema field for Avro record.

type SchemaKey

type SchemaKey = pb.SchemaKey

SchemaKey is a schema key

type SecretString

type SecretString struct {
	// contains filtered or unexported fields
}

Hides a string such as a password from both plain and json logs.

func InitSecretString

func InitSecretString(s string) SecretString

func (SecretString) Get

func (s SecretString) Get() string

type Server

type Server interface {
	Start() error
	State() ServerState
	Err() error
}

Server is frames server interface

type ServerBase

type ServerBase struct {
	// contains filtered or unexported fields
}

ServerBase have common functionality for server

func NewServerBase

func NewServerBase() *ServerBase

NewServerBase returns a new server base

func (*ServerBase) Err

func (s *ServerBase) Err() error

Err returns the server error

func (*ServerBase) SetError

func (s *ServerBase) SetError(err error)

SetError sets current error and will change state to ErrorState

func (*ServerBase) SetState

func (s *ServerBase) SetState(state ServerState)

SetState sets the server state

func (*ServerBase) State

func (s *ServerBase) State() ServerState

State return the server state

type ServerState

type ServerState string

ServerState is state of server

const (
	ReadyState   ServerState = "ready"
	RunningState ServerState = "running"
	ErrorState   ServerState = "error"
)

Possible server states

type Session

type Session = pb.Session

Session information

func InitSessionDefaults

func InitSessionDefaults(session *Session, framesConfig *Config) *Session

InitSessionDefaults initializes session defaults

func NewSession

func NewSession(url, container, path, user, password, token, id string) (*Session, error)

NewSession will create a new session. It will populate missing values from the V3IO_SESSION environment variable (JSON encoded)

type TableSchema

type TableSchema = pb.TableSchema

TableSchema is a table schema

type WriteRequest

type WriteRequest struct {
	Session  *Session
	Password SecretString
	Token    SecretString
	Backend  string // backend name
	Table    string // Table name (path)
	// Data message sent with the write request (in case of a stream multiple messages can follow)
	ImmidiateData Frame
	// Expression template, for update expressions generated from combining columns data with expression
	Expression string
	// Condition template, for update conditions generated from combining columns data with expression
	Condition string
	// Will we get more message chunks (in a stream), if not we can complete
	HaveMore bool
	SaveMode SaveMode
}

WriteRequest is request for writing data TODO: Unite with probouf (currenly the protobuf message combines both this and a frame message)

Directories

Path Synopsis
csv
kv
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL