Documentation ¶
Overview ¶
Package arrow provides an implementation of Apache Arrow.
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and inter-process communication.
Basics ¶
The fundamental data structure in Arrow is an Array, which holds a sequence of values of the same type. An array consists of memory holding the data and an additional validity bitmap that indicates if the corresponding entry in the array is valid (not null). If the array has no null entries, it is possible to omit this bitmap.
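The layout described above can be sketched in plain Go. This is a minimal illustration rather than the package's implementation: the `int64Array` type and its `IsValid` method are hypothetical names, and the LSB-first bit order is assumed from the FromMemory example.

```go
package main

import "fmt"

// int64Array is a hypothetical stand-in for an Arrow array: a values
// buffer plus an optional validity bitmap (nil means "no nulls").
type int64Array struct {
	values   []int64
	validity []byte // bit i set => entry i is valid; LSB-first (assumed)
}

// IsValid reports whether entry i holds a value rather than null.
func (a int64Array) IsValid(i int) bool {
	if a.validity == nil {
		return true // bitmap omitted: every entry is valid
	}
	return a.validity[i/8]&(1<<uint(i%8)) != 0
}

func main() {
	// [1, 2, 3, (null), 5]: 0x17 is 00010111, so bit 3 is cleared.
	withNulls := int64Array{values: []int64{1, 2, 3, 0, 5}, validity: []byte{0x17}}
	noNulls := int64Array{values: []int64{1, 2, 3}}

	for i := range withNulls.values {
		fmt.Println(i, withNulls.IsValid(i))
	}
	fmt.Println(noNulls.IsValid(2))
}
```

The value stored under a null entry (0 here) is a placeholder; readers are expected to consult the bitmap before using it.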
Example (FixedSizeListArray) ¶
This example shows how to create a FixedSizeList array. The resulting array should be:
[[0, 1, 2], (null), [3, 4, 5], [6, 7, 8], (null)]
    pool := memory.NewGoAllocator()

    lb := array.NewFixedSizeListBuilder(pool, 3, arrow.PrimitiveTypes.Int64)
    defer lb.Release()

    vb := lb.ValueBuilder().(*array.Int64Builder)
    defer vb.Release()

    vb.Reserve(10)

    lb.Append(true)
    vb.Append(0)
    vb.Append(1)
    vb.Append(2)

    lb.AppendNull()
    vb.AppendValues([]int64{-1, -1, -1}, nil)

    lb.Append(true)
    vb.Append(3)
    vb.Append(4)
    vb.Append(5)

    lb.Append(true)
    vb.Append(6)
    vb.Append(7)
    vb.Append(8)

    lb.AppendNull()

    arr := lb.NewArray().(*array.FixedSizeList)
    defer arr.Release()

    fmt.Printf("NullN() = %d\n", arr.NullN())
    fmt.Printf("Len() = %d\n", arr.Len())
    fmt.Printf("Type() = %v\n", arr.DataType())
    fmt.Printf("List = %v\n", arr)

Output:

    NullN() = 2
    Len() = 5
    Type() = fixed_size_list<item: int64>[3]
    List = [[0 1 2] (null) [3 4 5] [6 7 8] (null)]
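In Arrow's fixed-size list layout, entry i of a list of size n maps to the child values in the range [i*n, (i+1)*n), so a null entry in the middle of the array still occupies n child slots. That is presumably why the example appends three placeholder values after the first AppendNull. The sketch below illustrates the indexing in plain Go for the first four entries only; `entry` is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// entry returns the child values backing entry i of a fixed-size list
// whose entries each hold n values. Hypothetical helper for illustration.
func entry(values []int64, n, i int) []int64 {
	return values[i*n : (i+1)*n]
}

func main() {
	// Child values as appended in the example; -1 pads the null entry.
	values := []int64{0, 1, 2, -1, -1, -1, 3, 4, 5, 6, 7, 8}
	valid := []bool{true, false, true, true}

	for i, ok := range valid {
		if !ok {
			fmt.Printf("list[%d] = (null)\n", i)
			continue
		}
		fmt.Printf("list[%d] = %v\n", i, entry(values, 3, i))
	}
}
```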
Example (Float64Slice) ¶
This example shows how one can slice an array. The initial (float64) array is:
[1, 2, 3, (null), 4, 5]
and the sub-slice is:
[3, (null), 4]
    package main

    import (
        "fmt"

        "git.sr.ht/~sbinet/go-arrow/array"
        "git.sr.ht/~sbinet/go-arrow/memory"
    )

    func main() {
        pool := memory.NewGoAllocator()

        b := array.NewFloat64Builder(pool)
        defer b.Release()

        b.AppendValues(
            []float64{1, 2, 3, -1, 4, 5},
            []bool{true, true, true, false, true, true},
        )

        arr := b.NewFloat64Array()
        defer arr.Release()

        fmt.Printf("array = %v\n", arr)

        sli := array.NewSlice(arr, 2, 5).(*array.Float64)
        defer sli.Release()

        fmt.Printf("slice = %v\n", sli)
    }

Output:

    array = [1 2 3 (null) 4 5]
    slice = [3 (null) 4]
Example (Float64Tensor2x5) ¶
    package main

    import (
        "fmt"

        "git.sr.ht/~sbinet/go-arrow/array"
        "git.sr.ht/~sbinet/go-arrow/memory"
        "git.sr.ht/~sbinet/go-arrow/tensor"
    )

    func main() {
        pool := memory.NewGoAllocator()

        b := array.NewFloat64Builder(pool)
        defer b.Release()

        raw := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
        b.AppendValues(raw, nil)

        arr := b.NewFloat64Array()
        defer arr.Release()

        f64 := tensor.NewFloat64(arr.Data(), []int64{2, 5}, nil, []string{"x", "y"})
        defer f64.Release()

        for _, i := range [][]int64{
            []int64{0, 0},
            []int64{0, 1},
            []int64{0, 2},
            []int64{0, 3},
            []int64{0, 4},
            []int64{1, 0},
            []int64{1, 1},
            []int64{1, 2},
            []int64{1, 3},
            []int64{1, 4},
        } {
            fmt.Printf("arr%v = %v\n", i, f64.Value(i))
        }
    }

Output:

    arr[0 0] = 1
    arr[0 1] = 2
    arr[0 2] = 3
    arr[0 3] = 4
    arr[0 4] = 5
    arr[1 0] = 6
    arr[1 1] = 7
    arr[1 2] = 8
    arr[1 3] = 9
    arr[1 4] = 10
Example (Float64Tensor2x5ColMajor) ¶
    package main

    import (
        "fmt"

        "git.sr.ht/~sbinet/go-arrow/array"
        "git.sr.ht/~sbinet/go-arrow/memory"
        "git.sr.ht/~sbinet/go-arrow/tensor"
    )

    func main() {
        pool := memory.NewGoAllocator()

        b := array.NewFloat64Builder(pool)
        defer b.Release()

        raw := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
        b.AppendValues(raw, nil)

        arr := b.NewFloat64Array()
        defer arr.Release()

        f64 := tensor.NewFloat64(arr.Data(), []int64{2, 5}, []int64{8, 16}, []string{"x", "y"})
        defer f64.Release()

        for _, i := range [][]int64{
            []int64{0, 0},
            []int64{0, 1},
            []int64{0, 2},
            []int64{0, 3},
            []int64{0, 4},
            []int64{1, 0},
            []int64{1, 1},
            []int64{1, 2},
            []int64{1, 3},
            []int64{1, 4},
        } {
            fmt.Printf("arr%v = %v\n", i, f64.Value(i))
        }
    }

Output:

    arr[0 0] = 1
    arr[0 1] = 3
    arr[0 2] = 5
    arr[0 3] = 7
    arr[0 4] = 9
    arr[1 0] = 2
    arr[1 1] = 4
    arr[1 2] = 6
    arr[1 3] = 8
    arr[1 4] = 10
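The difference between the two tensor examples comes down to strides. The sketch below is a plain-Go illustration of strided indexing, assuming strides are expressed in bytes and that passing nil strides selects a row-major layout (which is what the first example's output suggests). The helper `at` is hypothetical and is not the tensor package's code.

```go
package main

import "fmt"

// at returns the element of raw addressed by idx, given per-dimension
// strides in bytes and an element size of 8 bytes (float64).
// Hypothetical helper for illustration only.
func at(raw []float64, strides, idx []int64) float64 {
	var off int64
	for k := range idx {
		off += idx[k] * strides[k]
	}
	return raw[off/8]
}

func main() {
	raw := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	rowMajor := []int64{40, 8} // assumed default for shape {2, 5}
	colMajor := []int64{8, 16} // strides passed in the column-major example

	fmt.Println(at(raw, rowMajor, []int64{1, 0})) // 6
	fmt.Println(at(raw, colMajor, []int64{1, 0})) // 2
	fmt.Println(at(raw, colMajor, []int64{0, 1})) // 3
}
```

With strides {8, 16}, moving one step along the first axis advances one float64 and one step along the second axis advances two, which reproduces the interleaved values in the column-major output.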
Example (FromMemory) ¶
This example demonstrates creating an array, sourcing the values and null bitmaps directly from byte slices. The null count is set to UnknownNullCount, instructing the array to calculate the null count from the bitmap when NullN is called.
    package main

    import (
        "fmt"

        "git.sr.ht/~sbinet/go-arrow/array"
        "git.sr.ht/~sbinet/go-arrow/memory"
    )

    func main() {
        // create LSB packed bits with the following pattern:
        // 01010011 11000101
        data := memory.NewBufferBytes([]byte{0xca, 0xa3})

        // create LSB packed validity (null) bitmap, where every 4th element is null:
        // 11101110 11101110
        nullBitmap := memory.NewBufferBytes([]byte{0x77, 0x77})

        // Create a boolean array and lazily determine NullN using UnknownNullCount
        bools := array.NewBoolean(16, data, nullBitmap, array.UnknownNullCount)
        defer bools.Release()

        // Show the null count
        fmt.Printf("NullN() = %d\n", bools.NullN())

        // Enumerate the values.
        n := bools.Len()
        for i := 0; i < n; i++ {
            fmt.Printf("bools[%d] = ", i)
            if bools.IsNull(i) {
                fmt.Println("(null)")
            } else {
                fmt.Printf("%t\n", bools.Value(i))
            }
        }
    }

Output:

    NullN() = 4
    bools[0] = false
    bools[1] = true
    bools[2] = false
    bools[3] = (null)
    bools[4] = false
    bools[5] = false
    bools[6] = true
    bools[7] = (null)
    bools[8] = true
    bools[9] = true
    bools[10] = false
    bools[11] = (null)
    bools[12] = false
    bools[13] = true
    bools[14] = false
    bools[15] = (null)
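Deriving the null count from the bitmap amounts to counting cleared bits among the first n positions. The following is a rough sketch of that calculation using only the standard library; `countNulls` is a hypothetical helper, and the package may well compute this differently.

```go
package main

import (
	"fmt"
	"math/bits"
)

// countNulls derives the null count of an n-entry array from its
// LSB-packed validity bitmap by counting the bits that are not set.
func countNulls(validity []byte, n int) int {
	valid := 0
	for i := 0; i < n; i += 8 {
		b := validity[i/8]
		if rem := n - i; rem < 8 {
			b &= byte(1)<<uint(rem) - 1 // mask off padding bits past entry n-1
		}
		valid += bits.OnesCount8(b)
	}
	return n - valid
}

func main() {
	// Same validity bitmap as the example: every 4th entry is null.
	validity := []byte{0x77, 0x77}
	fmt.Println(countNulls(validity, 16)) // 4
}
```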
Example (ListArray) ¶
This example shows how to create a List array. The resulting array should be:
[[0, 1, 2], (null), [3], [4, 5], [6, 7, 8], (null), [9]]
    pool := memory.NewGoAllocator()

    lb := array.NewListBuilder(pool, arrow.PrimitiveTypes.Int64)
    defer lb.Release()

    vb := lb.ValueBuilder().(*array.Int64Builder)
    defer vb.Release()

    vb.Reserve(10)

    lb.Append(true)
    vb.Append(0)
    vb.Append(1)
    vb.Append(2)

    lb.AppendNull()

    lb.Append(true)
    vb.Append(3)

    lb.Append(true)
    vb.Append(4)
    vb.Append(5)

    lb.Append(true)
    vb.Append(6)
    vb.Append(7)
    vb.Append(8)

    lb.AppendNull()

    lb.Append(true)
    vb.Append(9)

    arr := lb.NewArray().(*array.List)
    defer arr.Release()

    fmt.Printf("NullN() = %d\n", arr.NullN())
    fmt.Printf("Len() = %d\n", arr.Len())
    fmt.Printf("Offsets() = %v\n", arr.Offsets())

    offsets := arr.Offsets()[1:]

    varr := arr.ListValues().(*array.Int64)

    pos := 0
    for i := 0; i < arr.Len(); i++ {
        if !arr.IsValid(i) {
            fmt.Printf("List[%d] = (null)\n", i)
            continue
        }
        fmt.Printf("List[%d] = [", i)
        for j := pos; j < int(offsets[i]); j++ {
            if j != pos {
                fmt.Printf(", ")
            }
            fmt.Printf("%v", varr.Value(j))
        }
        pos = int(offsets[i])
        fmt.Printf("]\n")
    }
    fmt.Printf("List = %v\n", arr)

Output:

    NullN() = 2
    Len() = 7
    Offsets() = [0 3 3 4 6 9 9 10]
    List[0] = [0, 1, 2]
    List[1] = (null)
    List[2] = [3]
    List[3] = [4, 5]
    List[4] = [6, 7, 8]
    List[5] = (null)
    List[6] = [9]
    List = [[0 1 2] (null) [3] [4 5] [6 7 8] (null) [9]]
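The offsets printed above fully describe where each list lives in the flat child array: entry i spans the child values from offsets[i] up to, but not including, offsets[i+1]. A small stand-alone sketch of that lookup follows; `listAt` is a hypothetical helper written against plain slices, not an API of the array package.

```go
package main

import "fmt"

// listAt returns the values of list entry i, or false when the entry is null.
func listAt(values []int64, offsets []int32, valid []bool, i int) ([]int64, bool) {
	if !valid[i] {
		return nil, false
	}
	return values[offsets[i]:offsets[i+1]], true
}

func main() {
	values := []int64{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	offsets := []int32{0, 3, 3, 4, 6, 9, 9, 10} // as printed by Offsets()
	valid := []bool{true, false, true, true, true, false, true}

	for i := range valid {
		if vs, ok := listAt(values, offsets, valid, i); ok {
			fmt.Printf("List[%d] = %v\n", i, vs)
		} else {
			fmt.Printf("List[%d] = (null)\n", i)
		}
	}
}
```

A null entry and an empty list both have equal start and end offsets; only the validity information tells them apart.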
Example (Minimal) ¶
This example demonstrates how to build an array of int64 values using a builder and Append. Append is convenient for small arrays; for larger ones, AppendValues (used in the other examples) appends a whole slice in one call.
    package main

    import (
        "fmt"

        "git.sr.ht/~sbinet/go-arrow/array"
        "git.sr.ht/~sbinet/go-arrow/memory"
    )

    func main() {
        // Create an allocator.
        pool := memory.NewGoAllocator()

        // Create an int64 array builder.
        builder := array.NewInt64Builder(pool)
        defer builder.Release()

        builder.Append(1)
        builder.Append(2)
        builder.Append(3)
        builder.AppendNull()
        builder.Append(5)
        builder.Append(6)
        builder.Append(7)
        builder.Append(8)

        // Finish building the int64 array and reset the builder.
        ints := builder.NewInt64Array()
        defer ints.Release()

        // Enumerate the values.
        for i, v := range ints.Int64Values() {
            fmt.Printf("ints[%d] = ", i)
            if ints.IsNull(i) {
                fmt.Println("(null)")
            } else {
                fmt.Println(v)
            }
        }
        fmt.Printf("ints = %v\n", ints)
    }

Output:

    ints[0] = 1
    ints[1] = 2
    ints[2] = 3
    ints[3] = (null)
    ints[4] = 5
    ints[5] = 6
    ints[6] = 7
    ints[7] = 8
    ints = [1 2 3 (null) 5 6 7 8]
Example (Record) ¶
    pool := memory.NewGoAllocator()

    schema := arrow.NewSchema(
        []arrow.Field{
            arrow.Field{Name: "f1-i32", Type: arrow.PrimitiveTypes.Int32},
            arrow.Field{Name: "f2-f64", Type: arrow.PrimitiveTypes.Float64},
        },
        nil,
    )

    b := array.NewRecordBuilder(pool, schema)
    defer b.Release()

    b.Field(0).(*array.Int32Builder).AppendValues([]int32{1, 2, 3, 4, 5, 6}, nil)
    b.Field(0).(*array.Int32Builder).AppendValues([]int32{7, 8, 9, 10}, []bool{true, true, false, true})
    b.Field(1).(*array.Float64Builder).AppendValues([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)

    rec := b.NewRecord()
    defer rec.Release()

    for i, col := range rec.Columns() {
        fmt.Printf("column[%d] %q: %v\n", i, rec.ColumnName(i), col)
    }

Output:

    column[0] "f1-i32": [1 2 3 4 5 6 7 8 (null) 10]
    column[1] "f2-f64": [1 2 3 4 5 6 7 8 9 10]
Example (RecordReader) ¶
    pool := memory.NewGoAllocator()

    schema := arrow.NewSchema(
        []arrow.Field{
            arrow.Field{Name: "f1-i32", Type: arrow.PrimitiveTypes.Int32},
            arrow.Field{Name: "f2-f64", Type: arrow.PrimitiveTypes.Float64},
        },
        nil,
    )

    b := array.NewRecordBuilder(pool, schema)
    defer b.Release()

    b.Field(0).(*array.Int32Builder).AppendValues([]int32{1, 2, 3, 4, 5, 6}, nil)
    b.Field(0).(*array.Int32Builder).AppendValues([]int32{7, 8, 9, 10}, []bool{true, true, false, true})
    b.Field(1).(*array.Float64Builder).AppendValues([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)

    rec1 := b.NewRecord()
    defer rec1.Release()

    b.Field(0).(*array.Int32Builder).AppendValues([]int32{11, 12, 13, 14, 15, 16, 17, 18, 19, 20}, nil)
    b.Field(1).(*array.Float64Builder).AppendValues([]float64{11, 12, 13, 14, 15, 16, 17, 18, 19, 20}, nil)

    rec2 := b.NewRecord()
    defer rec2.Release()

    itr, err := array.NewRecordReader(schema, []array.Record{rec1, rec2})
    if err != nil {
        log.Fatal(err)
    }
    defer itr.Release()

    n := 0
    for itr.Next() {
        rec := itr.Record()
        for i, col := range rec.Columns() {
            fmt.Printf("rec[%d][%q]: %v\n", n, rec.ColumnName(i), col)
        }
        n++
    }

Output:

    rec[0]["f1-i32"]: [1 2 3 4 5 6 7 8 (null) 10]
    rec[0]["f2-f64"]: [1 2 3 4 5 6 7 8 9 10]
    rec[1]["f1-i32"]: [11 12 13 14 15 16 17 18 19 20]
    rec[1]["f2-f64"]: [11 12 13 14 15 16 17 18 19 20]
Example (StructArray) ¶
This example shows how to create a Struct array. The resulting array should be:
[{'joe', 1}, {null, 2}, null, {'mark', 4}]
    pool := memory.NewGoAllocator()
    dtype := arrow.StructOf([]arrow.Field{
        {Name: "f1", Type: arrow.ListOf(arrow.PrimitiveTypes.Uint8)},
        {Name: "f2", Type: arrow.PrimitiveTypes.Int32},
    }...)

    sb := array.NewStructBuilder(pool, dtype)
    defer sb.Release()

    f1b := sb.FieldBuilder(0).(*array.ListBuilder)
    defer f1b.Release()
    f1vb := f1b.ValueBuilder().(*array.Uint8Builder)
    defer f1vb.Release()

    f2b := sb.FieldBuilder(1).(*array.Int32Builder)
    defer f2b.Release()

    sb.Reserve(4)
    f1vb.Reserve(7)
    f2b.Reserve(3)

    sb.Append(true)
    f1b.Append(true)
    f1vb.AppendValues([]byte("joe"), nil)
    f2b.Append(1)

    sb.Append(true)
    f1b.AppendNull()
    f2b.Append(2)

    sb.AppendNull()

    sb.Append(true)
    f1b.Append(true)
    f1vb.AppendValues([]byte("mark"), nil)
    f2b.Append(4)

    arr := sb.NewArray().(*array.Struct)
    defer arr.Release()

    fmt.Printf("NullN() = %d\n", arr.NullN())
    fmt.Printf("Len() = %d\n", arr.Len())

    list := arr.Field(0).(*array.List)
    defer list.Release()

    offsets := list.Offsets()

    varr := list.ListValues().(*array.Uint8)
    defer varr.Release()

    ints := arr.Field(1).(*array.Int32)
    defer ints.Release()

    for i := 0; i < arr.Len(); i++ {
        if !arr.IsValid(i) {
            fmt.Printf("Struct[%d] = (null)\n", i)
            continue
        }
        fmt.Printf("Struct[%d] = [", i)
        pos := int(offsets[i])
        switch {
        case list.IsValid(pos):
            fmt.Printf("[")
            for j := offsets[i]; j < offsets[i+1]; j++ {
                if j != offsets[i] {
                    fmt.Printf(", ")
                }
                fmt.Printf("%v", string(varr.Value(int(j))))
            }
            fmt.Printf("], ")
        default:
            fmt.Printf("(null), ")
        }
        fmt.Printf("%d]\n", ints.Value(i))
    }

Output:

    NullN() = 1
    Len() = 4
    Struct[0] = [[j, o, e], 1]
    Struct[1] = [[], 2]
    Struct[2] = (null)
    Struct[3] = [[m, a, r, k], 4]
Example (Table) ¶
    pool := memory.NewGoAllocator()

    schema := arrow.NewSchema(
        []arrow.Field{
            arrow.Field{Name: "f1-i32", Type: arrow.PrimitiveTypes.Int32},
            arrow.Field{Name: "f2-f64", Type: arrow.PrimitiveTypes.Float64},
        },
        nil,
    )

    b := array.NewRecordBuilder(pool, schema)
    defer b.Release()

    b.Field(0).(*array.Int32Builder).AppendValues([]int32{1, 2, 3, 4, 5, 6}, nil)
    b.Field(0).(*array.Int32Builder).AppendValues([]int32{7, 8, 9, 10}, []bool{true, true, false, true})
    b.Field(1).(*array.Float64Builder).AppendValues([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)

    rec1 := b.NewRecord()
    defer rec1.Release()

    b.Field(0).(*array.Int32Builder).AppendValues([]int32{11, 12, 13, 14, 15, 16, 17, 18, 19, 20}, nil)
    b.Field(1).(*array.Float64Builder).AppendValues([]float64{11, 12, 13, 14, 15, 16, 17, 18, 19, 20}, nil)

    rec2 := b.NewRecord()
    defer rec2.Release()

    tbl := array.NewTableFromRecords(schema, []array.Record{rec1, rec2})
    defer tbl.Release()

    tr := array.NewTableReader(tbl, 5)
    defer tr.Release()

    n := 0
    for tr.Next() {
        rec := tr.Record()
        for i, col := range rec.Columns() {
            fmt.Printf("rec[%d][%q]: %v\n", n, rec.ColumnName(i), col)
        }
        n++
    }

Output:

    rec[0]["f1-i32"]: [1 2 3 4 5]
    rec[0]["f2-f64"]: [1 2 3 4 5]
    rec[1]["f1-i32"]: [6 7 8 (null) 10]
    rec[1]["f2-f64"]: [6 7 8 9 10]
    rec[2]["f1-i32"]: [11 12 13 14 15]
    rec[2]["f2-f64"]: [11 12 13 14 15]
    rec[3]["f1-i32"]: [16 17 18 19 20]
    rec[3]["f2-f64"]: [16 17 18 19 20]
Index ¶
- Constants
- Variables
- func TypeEqual(left, right DataType, opts ...TypeEqualOption) bool
- type BinaryDataType
- type BinaryType
- type BooleanType
- type DataType
- type Date32
- type Date32Type
- type Date64
- type Date64Type
- type DayTimeInterval
- type DayTimeIntervalType
- type Decimal128Type
- type Duration
- type DurationType
- type Field
- type FixedSizeBinaryType
- type FixedSizeListType
- type FixedWidthDataType
- type Float16Type
- type Float32Type
- type Float64Type
- type Int16Type
- type Int32Type
- type Int64Type
- type Int8Type
- type ListType
- type Metadata
- type MonthInterval
- type MonthIntervalType
- type NullType
- type Schema
- func (sc *Schema) Equal(o *Schema) bool
- func (sc *Schema) Field(i int) Field
- func (sc *Schema) FieldIndices(n string) []int
- func (sc *Schema) Fields() []Field
- func (sc *Schema) FieldsByName(n string) ([]Field, bool)
- func (sc *Schema) HasField(n string) bool
- func (sc *Schema) HasMetadata() bool
- func (sc *Schema) Metadata() Metadata
- func (s *Schema) String() string
- type StringType
- type StructType
- type Time32
- type Time32Type
- type Time64
- type Time64Type
- type TimeUnit
- type Timestamp
- type TimestampType
- type Type
- type TypeEqualOption
- type Uint16Type
- type Uint32Type
- type Uint64Type
- type Uint8Type
Constants ¶
    const (
        // Date32SizeBytes specifies the number of bytes required to store a single Date32 in memory
        Date32SizeBytes = int(unsafe.Sizeof(Date32(0)))
    )

    const (
        // Date64SizeBytes specifies the number of bytes required to store a single Date64 in memory
        Date64SizeBytes = int(unsafe.Sizeof(Date64(0)))
    )

    const (
        // DayTimeIntervalSizeBytes specifies the number of bytes required to store a single DayTimeInterval in memory
        DayTimeIntervalSizeBytes = int(unsafe.Sizeof(DayTimeInterval{}))
    )

    const (
        // Decimal128SizeBytes specifies the number of bytes required to store a single decimal128 in memory
        Decimal128SizeBytes = int(unsafe.Sizeof(decimal128.Num{}))
    )

    const (
        // DurationSizeBytes specifies the number of bytes required to store a single Duration in memory
        DurationSizeBytes = int(unsafe.Sizeof(Duration(0)))
    )

    const (
        // Float16SizeBytes specifies the number of bytes required to store a single float16 in memory
        Float16SizeBytes = int(unsafe.Sizeof(uint16(0)))
    )

    const (
        // Float32SizeBytes specifies the number of bytes required to store a single float32 in memory
        Float32SizeBytes = int(unsafe.Sizeof(float32(0)))
    )

    const (
        // Float64SizeBytes specifies the number of bytes required to store a single float64 in memory
        Float64SizeBytes = int(unsafe.Sizeof(float64(0)))
    )

    const (
        // Int16SizeBytes specifies the number of bytes required to store a single int16 in memory
        Int16SizeBytes = int(unsafe.Sizeof(int16(0)))
    )

    const (
        // Int32SizeBytes specifies the number of bytes required to store a single int32 in memory
        Int32SizeBytes = int(unsafe.Sizeof(int32(0)))
    )

    const (
        // Int64SizeBytes specifies the number of bytes required to store a single int64 in memory
        Int64SizeBytes = int(unsafe.Sizeof(int64(0)))
    )

    const (
        // Int8SizeBytes specifies the number of bytes required to store a single int8 in memory
        Int8SizeBytes = int(unsafe.Sizeof(int8(0)))
    )

    const (
        // MonthIntervalSizeBytes specifies the number of bytes required to store a single MonthInterval in memory
        MonthIntervalSizeBytes = int(unsafe.Sizeof(MonthInterval(0)))
    )

    const (
        // Time32SizeBytes specifies the number of bytes required to store a single Time32 in memory
        Time32SizeBytes = int(unsafe.Sizeof(Time32(0)))
    )

    const (
        // Time64SizeBytes specifies the number of bytes required to store a single Time64 in memory
        Time64SizeBytes = int(unsafe.Sizeof(Time64(0)))
    )

    const (
        // TimestampSizeBytes specifies the number of bytes required to store a single Timestamp in memory
        TimestampSizeBytes = int(unsafe.Sizeof(Timestamp(0)))
    )

    const (
        // Uint16SizeBytes specifies the number of bytes required to store a single uint16 in memory
        Uint16SizeBytes = int(unsafe.Sizeof(uint16(0)))
    )

    const (
        // Uint32SizeBytes specifies the number of bytes required to store a single uint32 in memory
        Uint32SizeBytes = int(unsafe.Sizeof(uint32(0)))
    )

    const (
        // Uint64SizeBytes specifies the number of bytes required to store a single uint64 in memory
        Uint64SizeBytes = int(unsafe.Sizeof(uint64(0)))
    )

    const (
        // Uint8SizeBytes specifies the number of bytes required to store a single uint8 in memory
        Uint8SizeBytes = int(unsafe.Sizeof(uint8(0)))
    )
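One plausible use of these size constants is computing how many bytes a values buffer needs for a given number of elements. The sketch below redeclares two of them locally (in lower case) so that it runs without the package; `bufferBytes` is a hypothetical helper and the calculation ignores any alignment or padding an allocator might add.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Local copies of two of the constants above, declared the same way.
const (
	int64SizeBytes   = int(unsafe.Sizeof(int64(0)))
	float16SizeBytes = int(unsafe.Sizeof(uint16(0)))
)

// bufferBytes returns the bytes needed for n fixed-size values.
func bufferBytes(n, sizeBytes int) int { return n * sizeBytes }

func main() {
	fmt.Println(bufferBytes(10, int64SizeBytes))   // 80
	fmt.Println(bufferBytes(10, float16SizeBytes)) // 20
}
```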
Variables ¶
    var (
        MonthIntervalTraits   monthTraits
        DayTimeIntervalTraits daytimeTraits
    )

    var (
        Int64Traits     int64Traits
        Uint64Traits    uint64Traits
        Float64Traits   float64Traits
        Int32Traits     int32Traits
        Uint32Traits    uint32Traits
        Float32Traits   float32Traits
        Int16Traits     int16Traits
        Uint16Traits    uint16Traits
        Int8Traits      int8Traits
        Uint8Traits     uint8Traits
        TimestampTraits timestampTraits
        Time32Traits    time32Traits
        Time64Traits    time64Traits
        Date32Traits    date32Traits
        Date64Traits    date64Traits
        DurationTraits  durationTraits
    )

    var (
        BinaryTypes = struct {
            Binary BinaryDataType
            String BinaryDataType
        }{
            Binary: &BinaryType{},
            String: &StringType{},
        }
    )
var BooleanTraits booleanTraits
var Decimal128Traits decimal128Traits
Decimal128 traits
    var (
        FixedWidthTypes = struct {
            Boolean         FixedWidthDataType
            Date32          FixedWidthDataType
            Date64          FixedWidthDataType
            DayTimeInterval FixedWidthDataType
            Duration_s      FixedWidthDataType
            Duration_ms     FixedWidthDataType
            Duration_us     FixedWidthDataType
            Duration_ns     FixedWidthDataType
            Float16         FixedWidthDataType
            MonthInterval   FixedWidthDataType
            Time32s         FixedWidthDataType
            Time32ms        FixedWidthDataType
            Time64us        FixedWidthDataType
            Time64ns        FixedWidthDataType
            Timestamp_s     FixedWidthDataType
            Timestamp_ms    FixedWidthDataType
            Timestamp_us    FixedWidthDataType
            Timestamp_ns    FixedWidthDataType
        }{
            Boolean:         &BooleanType{},
            Date32:          &Date32Type{},
            Date64:          &Date64Type{},
            DayTimeInterval: &DayTimeIntervalType{},
            Duration_s:      &DurationType{Unit: Second},
            Duration_ms:     &DurationType{Unit: Millisecond},
            Duration_us:     &DurationType{Unit: Microsecond},
            Duration_ns:     &DurationType{Unit: Nanosecond},
            Float16:         &Float16Type{},
            MonthInterval:   &MonthIntervalType{},
            Time32s:         &Time32Type{Unit: Second},
            Time32ms:        &Time32Type{Unit: Millisecond},
            Time64us:        &Time64Type{Unit: Microsecond},
            Time64ns:        &Time64Type{Unit: Nanosecond},
            Timestamp_s:     &TimestampType{Unit: Second, TimeZone: "UTC"},
            Timestamp_ms:    &TimestampType{Unit: Millisecond, TimeZone: "UTC"},
            Timestamp_us:    &TimestampType{Unit: Microsecond, TimeZone: "UTC"},
            Timestamp_ns:    &TimestampType{Unit: Nanosecond, TimeZone: "UTC"},
        }
    )
var Float16Traits float16Traits
Float16 traits
    var (
        PrimitiveTypes = struct {
            Int8    DataType
            Int16   DataType
            Int32   DataType
            Int64   DataType
            Uint8   DataType
            Uint16  DataType
            Uint32  DataType
            Uint64  DataType
            Float32 DataType
            Float64 DataType
            Date32  DataType
            Date64  DataType
        }{
            Int8:    &Int8Type{},
            Int16:   &Int16Type{},
            Int32:   &Int32Type{},
            Int64:   &Int64Type{},
            Uint8:   &Uint8Type{},
            Uint16:  &Uint16Type{},
            Uint32:  &Uint32Type{},
            Uint64:  &Uint64Type{},
            Float32: &Float32Type{},
            Float64: &Float64Type{},
            Date32:  &Date32Type{},
            Date64:  &Date64Type{},
        }
    )
Functions ¶
func TypeEqual ¶
func TypeEqual(left, right DataType, opts ...TypeEqualOption) bool
TypeEqual reports whether two DataType values describe the same type, optionally also checking metadata equality for STRUCT types.
Types ¶
type BinaryDataType ¶
    type BinaryDataType interface {
        DataType
        // contains filtered or unexported methods
    }
type BinaryType ¶
type BinaryType struct{}
func (*BinaryType) ID ¶
func (t *BinaryType) ID() Type
func (*BinaryType) Name ¶
func (t *BinaryType) Name() string
func (*BinaryType) String ¶
func (t *BinaryType) String() string
type BooleanType ¶
type BooleanType struct{}
func (*BooleanType) BitWidth ¶
func (t *BooleanType) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*BooleanType) ID ¶
func (t *BooleanType) ID() Type
func (*BooleanType) Name ¶
func (t *BooleanType) Name() string
func (*BooleanType) String ¶
func (t *BooleanType) String() string
type Date32Type ¶
type Date32Type struct{}
func (*Date32Type) BitWidth ¶
func (t *Date32Type) BitWidth() int
func (*Date32Type) ID ¶
func (t *Date32Type) ID() Type
func (*Date32Type) Name ¶
func (t *Date32Type) Name() string
func (*Date32Type) String ¶
func (t *Date32Type) String() string
type Date64Type ¶
type Date64Type struct{}
func (*Date64Type) BitWidth ¶
func (t *Date64Type) BitWidth() int
func (*Date64Type) ID ¶
func (t *Date64Type) ID() Type
func (*Date64Type) Name ¶
func (t *Date64Type) Name() string
func (*Date64Type) String ¶
func (t *Date64Type) String() string
type DayTimeInterval ¶
DayTimeInterval represents a number of days and milliseconds (fraction of day).
type DayTimeIntervalType ¶
type DayTimeIntervalType struct{}
DayTimeIntervalType is encoded as a pair of 32-bit signed integers, representing a number of days and milliseconds (fraction of day).
func (*DayTimeIntervalType) BitWidth ¶
func (t *DayTimeIntervalType) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*DayTimeIntervalType) ID ¶
func (*DayTimeIntervalType) ID() Type
func (*DayTimeIntervalType) Name ¶
func (*DayTimeIntervalType) Name() string
func (*DayTimeIntervalType) String ¶
func (*DayTimeIntervalType) String() string
type Decimal128Type ¶
Decimal128Type represents a fixed-size 128-bit decimal type.
func (*Decimal128Type) BitWidth ¶
func (*Decimal128Type) BitWidth() int
func (*Decimal128Type) ID ¶
func (*Decimal128Type) ID() Type
func (*Decimal128Type) Name ¶
func (*Decimal128Type) Name() string
func (*Decimal128Type) String ¶
func (t *Decimal128Type) String() string
type DurationType ¶
type DurationType struct {
Unit TimeUnit
}
DurationType is encoded as a 64-bit signed integer, representing an amount of elapsed time without any relation to a calendar artifact.
func (*DurationType) BitWidth ¶
func (*DurationType) BitWidth() int
func (*DurationType) ID ¶
func (*DurationType) ID() Type
func (*DurationType) Name ¶
func (*DurationType) Name() string
func (*DurationType) String ¶
func (t *DurationType) String() string
type Field ¶
    type Field struct {
        Name     string   // Field name
        Type     DataType // The field's data type
        Nullable bool     // Fields can be nullable
        Metadata Metadata // The field's metadata, if any
    }
func (Field) HasMetadata ¶
type FixedSizeBinaryType ¶
type FixedSizeBinaryType struct {
ByteWidth int
}
func (*FixedSizeBinaryType) BitWidth ¶
func (t *FixedSizeBinaryType) BitWidth() int
func (*FixedSizeBinaryType) ID ¶
func (*FixedSizeBinaryType) ID() Type
func (*FixedSizeBinaryType) Name ¶
func (*FixedSizeBinaryType) Name() string
func (*FixedSizeBinaryType) String ¶
func (t *FixedSizeBinaryType) String() string
type FixedSizeListType ¶
type FixedSizeListType struct {
// contains filtered or unexported fields
}
FixedSizeListType describes a nested type in which each array slot contains a fixed-size sequence of values, all having the same relative type.
func FixedSizeListOf ¶
func FixedSizeListOf(n int32, t DataType) *FixedSizeListType
FixedSizeListOf returns the list type with element type t. For example, if t represents int32, FixedSizeListOf(10, t) represents [10]int32.
FixedSizeListOf panics if t is nil or invalid. FixedSizeListOf panics if n is <= 0.
func (*FixedSizeListType) Elem ¶
func (t *FixedSizeListType) Elem() DataType
Elem returns the FixedSizeListType's element type.
func (*FixedSizeListType) ID ¶
func (*FixedSizeListType) ID() Type
func (*FixedSizeListType) Len ¶
func (t *FixedSizeListType) Len() int32
Len returns the FixedSizeListType's size.
func (*FixedSizeListType) Name ¶
func (*FixedSizeListType) Name() string
func (*FixedSizeListType) String ¶
func (t *FixedSizeListType) String() string
type FixedWidthDataType ¶
    type FixedWidthDataType interface {
        DataType
        // BitWidth returns the number of bits required to store a single element of this data type in memory.
        BitWidth() int
    }
FixedWidthDataType is the representation of an Arrow type that requires a fixed number of bits in memory for each element.
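Because the width is expressed in bits, sizing a buffer for n elements requires rounding up to whole bytes. The sketch below shows that arithmetic against a cut-down interface; `fixedWidth`, `boolType`, `int32Type` and `bytesFor` are hypothetical stand-ins defined only for this illustration, and alignment padding is ignored.

```go
package main

import "fmt"

// fixedWidth mirrors only the BitWidth method of FixedWidthDataType.
type fixedWidth interface{ BitWidth() int }

type boolType struct{}
type int32Type struct{}

func (boolType) BitWidth() int  { return 1 } // booleans are bit-packed
func (int32Type) BitWidth() int { return 32 }

// bytesFor returns the bytes needed to store n elements of type t,
// rounded up to a whole number of bytes.
func bytesFor(t fixedWidth, n int) int { return (n*t.BitWidth() + 7) / 8 }

func main() {
	fmt.Println(bytesFor(boolType{}, 10))  // 2
	fmt.Println(bytesFor(int32Type{}, 10)) // 40
}
```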
type Float16Type ¶
type Float16Type struct{}
Float16Type represents a floating point value encoded with a 16-bit precision.
func (*Float16Type) BitWidth ¶
func (t *Float16Type) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*Float16Type) ID ¶
func (t *Float16Type) ID() Type
func (*Float16Type) Name ¶
func (t *Float16Type) Name() string
func (*Float16Type) String ¶
func (t *Float16Type) String() string
type Float32Type ¶
type Float32Type struct{}
func (*Float32Type) BitWidth ¶
func (t *Float32Type) BitWidth() int
func (*Float32Type) ID ¶
func (t *Float32Type) ID() Type
func (*Float32Type) Name ¶
func (t *Float32Type) Name() string
func (*Float32Type) String ¶
func (t *Float32Type) String() string
type Float64Type ¶
type Float64Type struct{}
func (*Float64Type) BitWidth ¶
func (t *Float64Type) BitWidth() int
func (*Float64Type) ID ¶
func (t *Float64Type) ID() Type
func (*Float64Type) Name ¶
func (t *Float64Type) Name() string
func (*Float64Type) String ¶
func (t *Float64Type) String() string
type ListType ¶
type ListType struct {
// contains filtered or unexported fields
}
ListType describes a nested type in which each array slot contains a variable-size sequence of values, all having the same relative type.
func ListOf ¶
ListOf returns the list type with element type t. For example, if t represents int32, ListOf(t) represents []int32.
ListOf panics if t is nil or invalid.
type Metadata ¶
type Metadata struct {
// contains filtered or unexported fields
}
func MetadataFrom ¶
func NewMetadata ¶
type MonthIntervalType ¶
type MonthIntervalType struct{}
MonthIntervalType is encoded as a 32-bit signed integer, representing a number of months.
func (*MonthIntervalType) BitWidth ¶
func (t *MonthIntervalType) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*MonthIntervalType) ID ¶
func (*MonthIntervalType) ID() Type
func (*MonthIntervalType) Name ¶
func (*MonthIntervalType) Name() string
func (*MonthIntervalType) String ¶
func (*MonthIntervalType) String() string
type NullType ¶
type NullType struct{}
NullType describes a degenerate array, with zero physical storage.
var (
Null *NullType
)
type Schema ¶
type Schema struct {
// contains filtered or unexported fields
}
Schema is a sequence of Field values, describing the columns of a table or a record batch.
func NewSchema ¶
NewSchema returns a new Schema value from the slice of fields and metadata.
NewSchema panics if there is a field with an invalid DataType.
func (*Schema) Equal ¶
Equal reports whether two schemas are equal. Equal does not compare the metadata.
func (*Schema) FieldIndices ¶
FieldIndices returns the indices of the named field or nil.
func (*Schema) HasMetadata ¶
type StringType ¶
type StringType struct{}
func (*StringType) ID ¶
func (t *StringType) ID() Type
func (*StringType) Name ¶
func (t *StringType) Name() string
func (*StringType) String ¶
func (t *StringType) String() string
type StructType ¶
type StructType struct {
// contains filtered or unexported fields
}
StructType describes a nested type parameterized by an ordered sequence of relative types, called its fields.
func StructOf ¶
func StructOf(fs ...Field) *StructType
StructOf returns the struct type with fields fs.
StructOf panics if there are duplicated fields. StructOf panics if there is a field with an invalid DataType.
func (*StructType) Field ¶
func (t *StructType) Field(i int) Field
func (*StructType) FieldByName ¶
func (t *StructType) FieldByName(name string) (Field, bool)
func (*StructType) Fields ¶
func (t *StructType) Fields() []Field
func (*StructType) ID ¶
func (*StructType) ID() Type
func (*StructType) Name ¶
func (*StructType) Name() string
func (*StructType) String ¶
func (t *StructType) String() string
type Time32Type ¶
type Time32Type struct {
Unit TimeUnit
}
Time32Type is encoded as a 32-bit signed integer, representing either seconds or milliseconds since midnight.
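To make the encoding concrete, the sketch below turns a seconds-since-midnight value into a wall-clock string; a millisecond-unit value would first be divided by 1000. The `clock` helper is hypothetical and assumes a non-negative input smaller than one day.

```go
package main

import "fmt"

// clock formats a seconds-since-midnight value as hh:mm:ss.
func clock(secs int32) string {
	return fmt.Sprintf("%02d:%02d:%02d", secs/3600, secs%3600/60, secs%60)
}

func main() {
	fmt.Println(clock(3661))  // 01:01:01
	fmt.Println(clock(86399)) // 23:59:59
}
```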
func (*Time32Type) BitWidth ¶
func (*Time32Type) BitWidth() int
func (*Time32Type) ID ¶
func (*Time32Type) ID() Type
func (*Time32Type) Name ¶
func (*Time32Type) Name() string
func (*Time32Type) String ¶
func (t *Time32Type) String() string
type Time64Type ¶
type Time64Type struct {
Unit TimeUnit
}
Time64Type is encoded as a 64-bit signed integer, representing either microseconds or nanoseconds since midnight.
func (*Time64Type) BitWidth ¶
func (*Time64Type) BitWidth() int
func (*Time64Type) ID ¶
func (*Time64Type) ID() Type
func (*Time64Type) Name ¶
func (*Time64Type) Name() string
func (*Time64Type) String ¶
func (t *Time64Type) String() string
type TimestampType ¶
TimestampType is encoded as a 64-bit signed integer counting units of time since the UNIX epoch (1970-01-01T00:00:00Z). The zero value has nanosecond unit and is time zone neutral. Time zone neutral can be considered UTC without having "UTC" as a time zone.
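A raw timestamp value only becomes meaningful once its unit is known. The sketch below converts such a value to a standard library time.Time, measuring from the UNIX epoch of 1970-01-01T00:00:00Z. The `unit` type, its constants and `toTime` are hypothetical stand-ins for this illustration and do not reproduce the package's TimeUnit.

```go
package main

import (
	"fmt"
	"time"
)

// unit is a hypothetical stand-in for a time unit.
type unit int

const (
	nanosecond unit = iota
	microsecond
	millisecond
	second
)

// toTime converts a raw timestamp in the given unit to a time.Time in UTC.
func toTime(v int64, u unit) time.Time {
	switch u {
	case second:
		return time.Unix(v, 0).UTC()
	case millisecond:
		return time.Unix(v/1000, (v%1000)*1000000).UTC()
	case microsecond:
		return time.Unix(v/1000000, (v%1000000)*1000).UTC()
	default:
		return time.Unix(0, v).UTC()
	}
}

func main() {
	fmt.Println(toTime(0, second).Format(time.RFC3339))             // 1970-01-01T00:00:00Z
	fmt.Println(toTime(1500, millisecond).Format(time.RFC3339Nano)) // 1970-01-01T00:00:01.5Z
}
```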
func (*TimestampType) BitWidth ¶
func (*TimestampType) BitWidth() int
BitWidth returns the number of bits required to store a single element of this data type in memory.
func (*TimestampType) ID ¶
func (*TimestampType) ID() Type
func (*TimestampType) Name ¶
func (*TimestampType) Name() string
func (*TimestampType) String ¶
func (t *TimestampType) String() string
type Type ¶
type Type int
Type identifies a logical type. A logical type can be expressed as either a primitive physical type (bytes or bits of some fixed size), a nested type consisting of other data types, or another data type (e.g. a timestamp encoded as an int64).

    const (
        // NULL type having no physical storage
        NULL Type = iota

        // BOOL is a 1 bit, LSB bit-packed ordering
        BOOL

        // UINT8 is an Unsigned 8-bit little-endian integer
        UINT8

        // INT8 is a Signed 8-bit little-endian integer
        INT8

        // UINT16 is an Unsigned 16-bit little-endian integer
        UINT16

        // INT16 is a Signed 16-bit little-endian integer
        INT16

        // UINT32 is an Unsigned 32-bit little-endian integer
        UINT32

        // INT32 is a Signed 32-bit little-endian integer
        INT32

        // UINT64 is an Unsigned 64-bit little-endian integer
        UINT64

        // INT64 is a Signed 64-bit little-endian integer
        INT64

        // FLOAT16 is a 2-byte floating point value
        FLOAT16

        // FLOAT32 is a 4-byte floating point value
        FLOAT32

        // FLOAT64 is an 8-byte floating point value
        FLOAT64

        // STRING is a UTF8 variable-length string
        STRING

        // BINARY is a Variable-length byte type (no guarantee of UTF8-ness)
        BINARY

        // FIXED_SIZE_BINARY is a binary where each value occupies the same number of bytes
        FIXED_SIZE_BINARY

        // DATE32 is int32 days since the UNIX epoch
        DATE32

        // DATE64 is int64 milliseconds since the UNIX epoch
        DATE64

        // TIMESTAMP is an exact timestamp encoded with int64 since UNIX epoch
        // Default unit millisecond
        TIMESTAMP

        // TIME32 is a signed 32-bit integer, representing either seconds or
        // milliseconds since midnight
        TIME32

        // TIME64 is a signed 64-bit integer, representing either microseconds or
        // nanoseconds since midnight
        TIME64

        // INTERVAL is YEAR_MONTH or DAY_TIME interval in SQL style
        INTERVAL

        // DECIMAL is a precision- and scale-based decimal type. Storage type depends on the
        // parameters.
        DECIMAL

        // LIST is a list of some logical data type
        LIST

        // STRUCT of logical types
        STRUCT

        // UNION of logical types
        UNION

        // DICTIONARY aka Category type
        DICTIONARY

        // MAP is a repeated struct logical type
        MAP

        // Custom data type, implemented by user
        EXTENSION

        // Fixed size list of some logical type
        FIXED_SIZE_LIST

        // Measure of elapsed time in either seconds, milliseconds, microseconds
        // or nanoseconds.
        DURATION
    )
type TypeEqualOption ¶
type TypeEqualOption func(*typeEqualsConfig)
TypeEqualOption is a functional option type used for configuring type equality checks.
func CheckMetadata ¶
func CheckMetadata() TypeEqualOption
CheckMetadata is an option for TypeEqual that enables checking metadata equality in addition to type equality. It is only meaningful for the STRUCT type.
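The functional-option pattern behind TypeEqualOption can be illustrated with a small, self-contained sketch. Everything below is a local stand-in: the contents of typeEqualsConfig are not shown in this documentation, so its single metadata field is an assumption, and field and structEqual are hypothetical names used only for illustration.

```go
package main

import "fmt"

// typeEqualsConfig collects the settings that options may change.
// ASSUMPTION: the real struct's fields are not documented here.
type typeEqualsConfig struct {
	metadata bool
}

// TypeEqualOption has the same shape as the documented option type.
type TypeEqualOption func(*typeEqualsConfig)

// CheckMetadata returns an option that switches metadata comparison on.
func CheckMetadata() TypeEqualOption {
	return func(cfg *typeEqualsConfig) { cfg.metadata = true }
}

// field is a toy stand-in for one field of a struct type.
type field struct {
	name     string
	typeName string
	metadata map[string]string
}

// sameMetadata reports whether two maps hold identical key/value pairs.
func sameMetadata(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// structEqual is a hypothetical comparison: names and types must always
// match, while metadata is compared only if CheckMetadata was passed.
func structEqual(left, right []field, opts ...TypeEqualOption) bool {
	var cfg typeEqualsConfig
	for _, opt := range opts {
		opt(&cfg)
	}
	if len(left) != len(right) {
		return false
	}
	for i := range left {
		if left[i].name != right[i].name || left[i].typeName != right[i].typeName {
			return false
		}
		if cfg.metadata && !sameMetadata(left[i].metadata, right[i].metadata) {
			return false
		}
	}
	return true
}

func main() {
	a := []field{{name: "x", typeName: "int64", metadata: map[string]string{"unit": "m"}}}
	b := []field{{name: "x", typeName: "int64", metadata: map[string]string{"unit": "s"}}}
	fmt.Println(structEqual(a, b))                  // metadata ignored
	fmt.Println(structEqual(a, b, CheckMetadata())) // metadata compared
}
```

The design keeps the default call short while letting callers opt in to stricter checks without changing the function signature.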
type Uint16Type ¶
type Uint16Type struct{}
func (*Uint16Type) BitWidth ¶
func (t *Uint16Type) BitWidth() int
func (*Uint16Type) ID ¶
func (t *Uint16Type) ID() Type
func (*Uint16Type) Name ¶
func (t *Uint16Type) Name() string
func (*Uint16Type) String ¶
func (t *Uint16Type) String() string
type Uint32Type ¶
type Uint32Type struct{}
func (*Uint32Type) BitWidth ¶
func (t *Uint32Type) BitWidth() int
func (*Uint32Type) ID ¶
func (t *Uint32Type) ID() Type
func (*Uint32Type) Name ¶
func (t *Uint32Type) Name() string
func (*Uint32Type) String ¶
func (t *Uint32Type) String() string
type Uint64Type ¶
type Uint64Type struct{}
func (*Uint64Type) BitWidth ¶
func (t *Uint64Type) BitWidth() int
func (*Uint64Type) ID ¶
func (t *Uint64Type) ID() Type
func (*Uint64Type) Name ¶
func (t *Uint64Type) Name() string
func (*Uint64Type) String ¶
func (t *Uint64Type) String() string
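The three unsigned integer types above expose the same method set, so a caller can treat them uniformly through an interface. The sketch below is self-contained and uses local stand-in types; the returned names and widths are assumptions inferred from the type names, and fixedWidth and bufferBytes are hypothetical. The ID method is left out to keep the sketch independent of the Type enum.

```go
package main

import "fmt"

// fixedWidth captures two of the methods shared by the Uint*Type
// declarations above. It is a hypothetical interface for this sketch.
type fixedWidth interface {
	Name() string
	BitWidth() int
}

// Local stand-ins for the documented types, with pointer receivers as in
// the signatures above. Return values are assumptions.
type uint16Type struct{}
type uint32Type struct{}
type uint64Type struct{}

func (t *uint16Type) Name() string  { return "uint16" }
func (t *uint16Type) BitWidth() int { return 16 }
func (t *uint32Type) Name() string  { return "uint32" }
func (t *uint32Type) BitWidth() int { return 32 }
func (t *uint64Type) Name() string  { return "uint64" }
func (t *uint64Type) BitWidth() int { return 64 }

// bufferBytes returns the number of bytes needed to store n values of
// type t, rounding up to a whole byte.
func bufferBytes(t fixedWidth, n int) int {
	return (n*t.BitWidth() + 7) / 8
}

func main() {
	types := []fixedWidth{&uint16Type{}, &uint32Type{}, &uint64Type{}}
	for _, t := range types {
		fmt.Printf("%s: 10 values need %d bytes\n", t.Name(), bufferBytes(t, 10))
	}
}
```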
Directories ¶
Path | Synopsis |
---|---|
_examples | |
_tools | |
array | Package array provides implementations of various Arrow array types. |
arrio | Package arrio exposes functions to manipulate records, exposing and using interfaces not unlike the ones defined in the stdlib io package. |
csv | Package csv reads CSV files and presents the extracted data as records; it also writes records into CSV files. |
internal | |
arrdata | Package arrdata exports arrays and records data ready to be used for tests. |
arrjson | Package arrjson provides types and functions to encode and decode ARROW types and data to and from JSON files. |
cpu | Package cpu implements processor feature detection used by the Go standard library. |
debug | Package debug provides APIs for conditional runtime assertions and debug logging. |
cmd/arrow-cat | Command arrow-cat displays the content of an Arrow stream or file. |
cmd/arrow-ls | Command arrow-ls displays the listing of an Arrow file. |
math | Package math provides optimized mathematical functions for processing Arrow arrays. |
memory | Package memory provides support for allocating and manipulating memory at a low level. |
tensor | Package tensor provides types that implement n-dimensional arrays. |