Documentation ¶
Overview ¶
Package arrow provides an implementation of Apache Arrow.
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and inter-process communication.
Basics ¶
The fundamental data structure in Arrow is an Array, which holds a sequence of values of the same type. An array consists of memory holding the data and an additional validity bitmap that indicates if the corresponding entry in the array is valid (not null). If the array has no null entries, it is possible to omit this bitmap.
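As a rough illustration of the validity bitmap described above, the following standard-library-only sketch tests individual bits in an LSB-first packed bitmap. The helper name isValid is hypothetical and not part of the arrow package; treating a nil bitmap as "all valid" is an assumption that mirrors the statement that the bitmap may be omitted when there are no nulls.

```go
package main

import "fmt"

// isValid is a hypothetical helper (not part of the arrow package).
// It reports whether element i is valid (not null) according to an
// LSB-first packed validity bitmap. A nil bitmap is treated as "no nulls".
func isValid(bitmap []byte, i int) bool {
	if bitmap == nil {
		return true
	}
	return bitmap[i/8]&(1<<uint(i%8)) != 0
}

func main() {
	// 0x77 is 0b01110111, so bits 3 and 7 are clear: elements 3 and 7 are null.
	bitmap := []byte{0x77}
	for i := 0; i < 8; i++ {
		fmt.Printf("element %d valid: %t\n", i, isValid(bitmap, i))
	}
}
```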
Example (FromMemory) ¶
This example demonstrates creating an array, sourcing the values and null bitmaps directly from byte slices. The null count is set to UnknownNullCount, instructing the array to calculate the null count from the bitmap when NullN is called.
    package main

    import (
        "fmt"

        "github.com/apache/arrow/go/arrow/array"
        "github.com/apache/arrow/go/arrow/memory"
    )

    func main() {
        // create LSB packed bits with the following pattern:
        // 01010011 11000101
        data := memory.NewBufferBytes([]byte{0xca, 0xa3})

        // create LSB packed validity (null) bitmap, where every 4th element is null:
        // 11101110 11101110
        nullBitmap := memory.NewBufferBytes([]byte{0x77, 0x77})

        // Create a boolean array and lazily determine NullN using UnknownNullCount
        bools := array.NewBoolean(16, data, nullBitmap, array.UnknownNullCount)

        // Show the null count
        fmt.Printf("NullN() = %d\n", bools.NullN())

        // Enumerate the values.
        n := bools.Len()
        for i := 0; i < n; i++ {
            fmt.Printf("bools[%d] = ", i)
            if bools.IsNull(i) {
                fmt.Println("(null)")
            } else {
                fmt.Printf("%t\n", bools.Value(i))
            }
        }
    }
Output:

    NullN() = 4
    bools[0] = false
    bools[1] = true
    bools[2] = false
    bools[3] = (null)
    bools[4] = false
    bools[5] = false
    bools[6] = true
    bools[7] = (null)
    bools[8] = true
    bools[9] = true
    bools[10] = false
    bools[11] = (null)
    bools[12] = false
    bools[13] = true
    bools[14] = false
    bools[15] = (null)
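To make the lazy null count in the FromMemory example more concrete, here is a minimal sketch of how a null count could be derived from a validity bitmap using only the standard library. The function countNulls is hypothetical; it is one plausible way to do the calculation and is not claimed to be how the array package implements NullN.

```go
package main

import (
	"fmt"
	"math/bits"
)

// countNulls is a hypothetical helper (not the library's implementation).
// It returns the number of null entries among the first n elements
// described by an LSB-first packed validity bitmap.
func countNulls(bitmap []byte, n int) int {
	valid := 0
	full := n / 8
	// Count set (valid) bits in every fully used byte.
	for _, b := range bitmap[:full] {
		valid += bits.OnesCount8(b)
	}
	// Mask off the unused high bits of a trailing partial byte.
	if rem := n % 8; rem > 0 {
		mask := byte(1<<uint(rem) - 1)
		valid += bits.OnesCount8(bitmap[full] & mask)
	}
	return n - valid
}

func main() {
	// Same validity bitmap as in the FromMemory example: 16 elements.
	fmt.Println(countNulls([]byte{0x77, 0x77}, 16)) // prints 4
}
```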
Example (Minimal) ¶
This example demonstrates how to build an array of int64 values using a builder and Append, an approach that is convenient for small arrays.
    package main

    import (
        "fmt"

        "github.com/apache/arrow/go/arrow/array"
        "github.com/apache/arrow/go/arrow/memory"
    )

    func main() {
        // Create an allocator.
        pool := memory.NewGoAllocator()

        // Create an int64 array builder.
        builder := array.NewInt64Builder(pool)

        builder.Append(1)
        builder.Append(2)
        builder.Append(3)
        builder.AppendNull()
        builder.Append(5)
        builder.Append(6)
        builder.Append(7)
        builder.Append(8)

        // Finish building the int64 array and reset the builder.
        ints := builder.NewInt64Array()

        // Enumerate the values.
        for i, v := range ints.Int64Values() {
            fmt.Printf("ints[%d] = ", i)
            if ints.IsNull(i) {
                fmt.Println("(null)")
            } else {
                fmt.Println(v)
            }
        }
    }
Output:

    ints[0] = 1
    ints[1] = 2
    ints[2] = 3
    ints[3] = (null)
    ints[4] = 5
    ints[5] = 6
    ints[6] = 7
    ints[7] = 8
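For intuition about what a builder has to track, the sketch below keeps a values slice and an LSB-first validity bitmap side by side. The type int64Builder and its methods are hypothetical and heavily simplified; this is not the array package's builder and omits allocators, capacity management, and array construction.

```go
package main

import "fmt"

// int64Builder is a hypothetical, simplified stand-in for a builder.
// It is NOT the arrow/array implementation.
type int64Builder struct {
	values []int64
	valid  []byte // LSB-first packed validity bitmap
}

// grow adds a bitmap byte whenever the next element starts a new byte.
func (b *int64Builder) grow() {
	if len(b.values)%8 == 0 {
		b.valid = append(b.valid, 0)
	}
}

// Append stores v and marks its slot as valid.
func (b *int64Builder) Append(v int64) {
	b.grow()
	i := len(b.values)
	b.valid[i/8] |= 1 << uint(i%8)
	b.values = append(b.values, v)
}

// AppendNull reserves a slot and leaves its validity bit cleared.
func (b *int64Builder) AppendNull() {
	b.grow()
	b.values = append(b.values, 0)
}

// IsNull reports whether slot i has its validity bit cleared.
func (b *int64Builder) IsNull(i int) bool {
	return b.valid[i/8]&(1<<uint(i%8)) == 0
}

func main() {
	var b int64Builder
	for _, v := range []int64{1, 2, 3} {
		b.Append(v)
	}
	b.AppendNull()
	for _, v := range []int64{5, 6, 7, 8} {
		b.Append(v)
	}
	fmt.Printf("values=%v bitmap=%08b\n", b.values, b.valid[0])
}
```

The same sequence of appends as in the Minimal example leaves bit 3 of the first bitmap byte cleared, which is what makes index 3 read back as null.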
Index ¶
- Constants
- Variables
- type BinaryDataType
- type BinaryType
- type BooleanType
- type DataType
- type FixedWidthDataType
- type Float32Type
- type Float64Type
- type Int16Type
- type Int32Type
- type Int64Type
- type Int8Type
- type StringType
- type TimeUnit
- type Timestamp
- type TimestampType
- type Type
- type Uint16Type
- type Uint32Type
- type Uint64Type
- type Uint8Type
Constants ¶
    const (
        // Float32SizeBytes specifies the number of bytes required to store a single float32 in memory
        Float32SizeBytes = int(unsafe.Sizeof(float32(0)))
    )

    const (
        // Float64SizeBytes specifies the number of bytes required to store a single float64 in memory
        Float64SizeBytes = int(unsafe.Sizeof(float64(0)))
    )

    const (
        // Int16SizeBytes specifies the number of bytes required to store a single int16 in memory
        Int16SizeBytes = int(unsafe.Sizeof(int16(0)))
    )

    const (
        // Int32SizeBytes specifies the number of bytes required to store a single int32 in memory
        Int32SizeBytes = int(unsafe.Sizeof(int32(0)))
    )

    const (
        // Int64SizeBytes specifies the number of bytes required to store a single int64 in memory
        Int64SizeBytes = int(unsafe.Sizeof(int64(0)))
    )

    const (
        // Int8SizeBytes specifies the number of bytes required to store a single int8 in memory
        Int8SizeBytes = int(unsafe.Sizeof(int8(0)))
    )

    const (
        // TimestampSizeBytes specifies the number of bytes required to store a single Timestamp in memory
        TimestampSizeBytes = int(unsafe.Sizeof(Timestamp(0)))
    )

    const (
        // Uint16SizeBytes specifies the number of bytes required to store a single uint16 in memory
        Uint16SizeBytes = int(unsafe.Sizeof(uint16(0)))
    )

    const (
        // Uint32SizeBytes specifies the number of bytes required to store a single uint32 in memory
        Uint32SizeBytes = int(unsafe.Sizeof(uint32(0)))
    )

    const (
        // Uint64SizeBytes specifies the number of bytes required to store a single uint64 in memory
        Uint64SizeBytes = int(unsafe.Sizeof(uint64(0)))
    )

    const (
        // Uint8SizeBytes specifies the number of bytes required to store a single uint8 in memory
        Uint8SizeBytes = int(unsafe.Sizeof(uint8(0)))
    )
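One likely use for these size constants is computing how many bytes a buffer needs for a given number of elements. The sketch below redeclares two of the constants locally (in lowercase, so it compiles without the arrow package) and uses a hypothetical helper bytesFor; neither the helper nor this usage is taken from the package itself.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Local copies of two size constants, declared the same way as in the
// package so that this sketch compiles on its own.
const (
	int64SizeBytes   = int(unsafe.Sizeof(int64(0)))
	float32SizeBytes = int(unsafe.Sizeof(float32(0)))
)

// bytesFor is a hypothetical helper: the number of bytes needed to store
// n elements that each occupy sizeBytes bytes.
func bytesFor(n, sizeBytes int) int {
	return n * sizeBytes
}

func main() {
	fmt.Println(bytesFor(8, int64SizeBytes))   // prints 64
	fmt.Println(bytesFor(8, float32SizeBytes)) // prints 32
}
```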
Variables ¶
    var (
        Int64Traits     int64Traits
        Uint64Traits    uint64Traits
        Float64Traits   float64Traits
        Int32Traits     int32Traits
        Uint32Traits    uint32Traits
        Float32Traits   float32Traits
        Int16Traits     int16Traits
        Uint16Traits    uint16Traits
        Int8Traits      int8Traits
        Uint8Traits     uint8Traits
        TimestampTraits timestampTraits
    )

    var (
        BinaryTypes = struct {
            Binary BinaryDataType
            String BinaryDataType
        }{
            Binary: &BinaryType{},
            String: &StringType{},
        }
    )

    var BooleanTraits booleanTraits

    var (
        FixedWidthTypes = struct {
            Boolean FixedWidthDataType
        }{
            Boolean: &BooleanType{},
        }
    )

    var (
        PrimitiveTypes = struct {
            Int8    DataType
            Int16   DataType
            Int32   DataType
            Int64   DataType
            Uint8   DataType
            Uint16  DataType
            Uint32  DataType
            Uint64  DataType
            Float32 DataType
            Float64 DataType
        }{
            Int8:    &Int8Type{},
            Int16:   &Int16Type{},
            Int32:   &Int32Type{},
            Int64:   &Int64Type{},
            Uint8:   &Uint8Type{},
            Uint16:  &Uint16Type{},
            Uint32:  &Uint32Type{},
            Uint64:  &Uint64Type{},
            Float32: &Float32Type{},
            Float64: &Float64Type{},
        }
    )
Functions ¶
This section is empty.
Types ¶
type BinaryDataType ¶
    type BinaryDataType interface {
        DataType
        // contains filtered or unexported methods
    }
type BinaryType ¶
type BinaryType struct{}
func (*BinaryType) ID ¶
func (t *BinaryType) ID() Type
func (*BinaryType) Name ¶
func (t *BinaryType) Name() string
type BooleanType ¶
type BooleanType struct{}
func (*BooleanType) BitWidth ¶
func (t *BooleanType) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*BooleanType) ID ¶
func (t *BooleanType) ID() Type
func (*BooleanType) Name ¶
func (t *BooleanType) Name() string
type FixedWidthDataType ¶
    type FixedWidthDataType interface {
        DataType
        // BitWidth returns the number of bits required to store a single element of this data type in memory.
        BitWidth() int
    }
FixedWidthDataType is the representation of an Arrow type that requires a fixed number of bits in memory for each element.
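A fixed bit width makes it straightforward to work out how much memory a run of elements occupies. The sketch below assumes a local interface fixedWidth that mirrors only the BitWidth method, plus two hypothetical implementing types; it does not use the real DataType interface and ignores any padding or alignment that actual Arrow buffers may require.

```go
package main

import "fmt"

// fixedWidth is a local, hypothetical interface that mirrors only the
// BitWidth method of FixedWidthDataType.
type fixedWidth interface {
	BitWidth() int
}

// boolType and int32Type are hypothetical stand-ins for illustration.
type boolType struct{}

func (boolType) BitWidth() int { return 1 }

type int32Type struct{}

func (int32Type) BitWidth() int { return 32 }

// bufferBytes returns the minimum number of whole bytes needed to hold
// n bit-packed elements of type t (no padding or alignment applied).
func bufferBytes(t fixedWidth, n int) int {
	return (n*t.BitWidth() + 7) / 8
}

func main() {
	fmt.Println(bufferBytes(boolType{}, 16))  // prints 2
	fmt.Println(bufferBytes(boolType{}, 10))  // prints 2
	fmt.Println(bufferBytes(int32Type{}, 10)) // prints 40
}
```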
type Float32Type ¶
type Float32Type struct{}
func (*Float32Type) ID ¶
func (t *Float32Type) ID() Type
func (*Float32Type) Name ¶
func (t *Float32Type) Name() string
type Float64Type ¶
type Float64Type struct{}
func (*Float64Type) ID ¶
func (t *Float64Type) ID() Type
func (*Float64Type) Name ¶
func (t *Float64Type) Name() string
type StringType ¶
type StringType struct{}
func (*StringType) ID ¶
func (t *StringType) ID() Type
func (*StringType) Name ¶
func (t *StringType) Name() string
type TimestampType ¶
TimestampType is encoded as a 64-bit signed integer since the UNIX epoch (1970-01-01T00:00:00Z). The zero value has a unit of nanoseconds and is time zone neutral. Time zone neutral can be considered UTC without having "UTC" as a time zone.
func (*TimestampType) BitWidth ¶
func (*TimestampType) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*TimestampType) ID ¶
func (*TimestampType) ID() Type
func (*TimestampType) Name ¶
func (*TimestampType) Name() string
type Type ¶
type Type int
Type is a logical type. Logical types can be expressed as either a primitive physical type (bytes or bits of some fixed size), a nested type consisting of other data types, or another data type (e.g. a timestamp encoded as an int64).
    const (
        // NULL type having no physical storage
        NULL Type = iota

        // BOOL is a 1 bit, LSB bit-packed ordering
        BOOL

        // UINT8 is an Unsigned 8-bit little-endian integer
        UINT8

        // INT8 is a Signed 8-bit little-endian integer
        INT8

        // UINT16 is an Unsigned 16-bit little-endian integer
        UINT16

        // INT16 is a Signed 16-bit little-endian integer
        INT16

        // UINT32 is an Unsigned 32-bit little-endian integer
        UINT32

        // INT32 is a Signed 32-bit little-endian integer
        INT32

        // UINT64 is an Unsigned 64-bit little-endian integer
        UINT64

        // INT64 is a Signed 64-bit little-endian integer
        INT64

        // HALF_FLOAT is a 2-byte floating point value
        HALF_FLOAT

        // FLOAT32 is a 4-byte floating point value
        FLOAT32

        // FLOAT64 is an 8-byte floating point value
        FLOAT64

        // STRING is a UTF8 variable-length string
        STRING

        // BINARY is a Variable-length byte type (no guarantee of UTF8-ness)
        BINARY

        // FIXED_SIZE_BINARY is a binary where each value occupies the same number of bytes
        FIXED_SIZE_BINARY

        // DATE32 is int32 days since the UNIX epoch
        DATE32

        // DATE64 is int64 milliseconds since the UNIX epoch
        DATE64

        // TIMESTAMP is an exact timestamp encoded with int64 since UNIX epoch
        // Default unit millisecond
        TIMESTAMP

        // TIME32 is a signed 32-bit integer, representing either seconds or
        // milliseconds since midnight
        TIME32

        // TIME64 is a signed 64-bit integer, representing either microseconds or
        // nanoseconds since midnight
        TIME64

        // INTERVAL is YEAR_MONTH or DAY_TIME interval in SQL style
        INTERVAL

        // DECIMAL is a precision- and scale-based decimal type. Storage type depends on the
        // parameters.
        DECIMAL

        // LIST is a list of some logical data type
        LIST

        // STRUCT of logical types
        STRUCT

        // UNION of logical types
        UNION

        // DICTIONARY aka Category type
        DICTIONARY

        // MAP is a repeated struct logical type
        MAP
    )
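Because the constants are declared with iota, each type ID is simply its position in the list, which makes the IDs convenient keys for a switch. The sketch below reproduces only the first four IDs under a local, hypothetical typeID type and maps them to bit widths taken from the comments above; the function bitWidth is an illustration and not part of the package.

```go
package main

import "fmt"

// typeID is a local, hypothetical copy of the first few Type constants.
type typeID int

const (
	null    typeID = iota // 0: no physical storage
	boolean               // 1: 1 bit, LSB bit-packed
	uint8ID               // 2: unsigned 8-bit integer
	int8ID                // 3: signed 8-bit integer
)

// bitWidth is a hypothetical helper that dispatches on the type ID.
// It returns -1 for IDs this sketch does not cover.
func bitWidth(t typeID) int {
	switch t {
	case null:
		return 0
	case boolean:
		return 1
	case uint8ID, int8ID:
		return 8
	default:
		return -1
	}
}

func main() {
	for _, t := range []typeID{null, boolean, uint8ID, int8ID} {
		fmt.Printf("id=%d bits=%d\n", int(t), bitWidth(t))
	}
}
```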
type Uint16Type ¶
type Uint16Type struct{}
func (*Uint16Type) ID ¶
func (t *Uint16Type) ID() Type
func (*Uint16Type) Name ¶
func (t *Uint16Type) Name() string
type Uint32Type ¶
type Uint32Type struct{}
func (*Uint32Type) ID ¶
func (t *Uint32Type) ID() Type
func (*Uint32Type) Name ¶
func (t *Uint32Type) Name() string
type Uint64Type ¶
type Uint64Type struct{}
func (*Uint64Type) ID ¶
func (t *Uint64Type) ID() Type
func (*Uint64Type) Name ¶
func (t *Uint64Type) Name() string
Directories ¶
Path | Synopsis |
---|---|
_examples | |
_tools | |
array | Package array provides implementations of various Arrow array types. |
internal | |
internal/cpu | Package cpu implements processor feature detection used by the Go standard library. |
internal/debug | Package debug provides APIs for conditional runtime assertions and debug logging. |
math | Package math provides optimized mathematical functions for processing Arrow arrays. |
memory | Package memory provides support for allocating and manipulating memory at a low level. |