cdata

package

v0.0.1 Latest Latest Go to latest Published: Aug 10, 2024 License: Apache-2.0, BSD-2-Clause, BSD-3-Clause, + 7 more Imports: 23 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/joe-at-startupmedia/go-arrow

Links

Open Source Insights

Documentation ¶

Index ¶

func ExportArrowArray(arr arrow.Array, out *CArrowArray, outSchema *CArrowSchema)
func ExportArrowRecordBatch(rb arrow.Record, out *CArrowArray, outSchema *CArrowSchema)
func ExportArrowSchema(schema *arrow.Schema, out *CArrowSchema)
func ExportRecordReader(reader array.RecordReader, out *CArrowArrayStream)
func ImportCArray(arr *CArrowArray, schema *CArrowSchema) (arrow.Field, arrow.Array, error)
func ImportCArrayStream(stream *CArrowArrayStream, schema *arrow.Schema) arrio.Readerdeprecated
func ImportCArrayWithType(arr *CArrowArray, dt arrow.DataType) (arrow.Array, error)
func ImportCArrowField(out *CArrowSchema) (arrow.Field, error)
func ImportCArrowSchema(out *CArrowSchema) (*arrow.Schema, error)
func ImportCRecordBatch(arr *CArrowArray, sc *CArrowSchema) (arrow.Record, error)
func ImportCRecordBatchWithSchema(arr *CArrowArray, sc *arrow.Schema) (arrow.Record, error)
func ImportCRecordReader(stream *CArrowArrayStream, schema *arrow.Schema) (arrio.Reader, error)
func ReleaseCArrowArray(arr *CArrowArray)
func ReleaseCArrowSchema(schema *CArrowSchema)
type CArrowArray
- func ArrayFromPtr(ptr uintptr) *CArrowArray
type CArrowArrayStream
type CArrowSchema
- func SchemaFromPtr(ptr uintptr) *CArrowSchema

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ExportArrowArray ¶

func ExportArrowArray(arr arrow.Array, out *CArrowArray, outSchema *CArrowSchema)

ExportArrowArray populates the CArrowArray that is passed in with the pointers to the memory being used by the arrow.Array passed in, in order to share with zero-copy across the C Data Interface. See the documentation for ExportArrowRecordBatch for details on how to ensure you do not leak memory and prevent unwanted, undefined or strange behaviors.

WARNING: the output ArrowArray MUST BE ZERO INITIALIZED, or the Go garbage collector may error at runtime, due to CGO rules ("the current implementation may sometimes cause a runtime error if the contents of the C memory appear to be a Go pointer"). You have been warned!

func ExportArrowRecordBatch ¶

func ExportArrowRecordBatch(rb arrow.Record, out *CArrowArray, outSchema *CArrowSchema)

ExportArrowRecordBatch populates the passed in CArrowArray (and optionally the schema too) by sharing the memory used for the buffers of each column's arrays. It does not copy the data, and will internally increment the reference counters so that releasing the record will not free the memory prematurely.

When using CGO, memory passed to C is pinned so that the Go garbage collector won't move where it is allocated out from under the C pointer locations, ensuring the C pointers stay valid. This is only true until the CGO call returns, at which point the garbage collector is free to move things around again. As a result, if the function you're calling is going to hold onto the pointers or otherwise continue to reference the memory *after* the call returns, you should use the CgoArrowAllocator rather than the GoAllocator (or DefaultAllocator) so that the memory which is allocated for the record batch in the first place is allocated in C, not by the Go runtime and is therefore not subject to the Garbage collection.

The release function on the populated CArrowArray will properly decrease the reference counts, and release the memory if the record has already been released. But since this must be explicitly done, make sure it is released so that you do not create a memory leak.

WARNING: the output ArrowArray MUST BE ZERO INITIALIZED, or the Go garbage collector may error at runtime, due to CGO rules ("the current implementation may sometimes cause a runtime error if the contents of the C memory appear to be a Go pointer"). You have been warned!

func ExportArrowSchema ¶

func ExportArrowSchema(schema *arrow.Schema, out *CArrowSchema)

ExportArrowSchema populates the passed in CArrowSchema with the schema passed in so that it can be passed to some consumer of the C Data Interface. The `release` function is tied to a callback in order to properly release any memory that was allocated during the populating of the struct. Any memory allocated will be allocated using malloc which means that it is invisible to the Go Garbage Collector and must be freed manually using the callback on the CArrowSchema object.

WARNING: the output ArrowSchema MUST BE ZERO INITIALIZED, or the Go garbage collector may error at runtime, due to CGO rules ("the current implementation may sometimes cause a runtime error if the contents of the C memory appear to be a Go pointer"). You have been warned!

func ExportRecordReader ¶

func ExportRecordReader(reader array.RecordReader, out *CArrowArrayStream)

ExportRecordReader populates the CArrowArrayStream that is passed in with the appropriate callbacks to be a working ArrowArrayStream utilizing the passed in RecordReader. The CArrowArrayStream takes ownership of the RecordReader until the consumer calls the release callback, as such it is unnecessary to call Release on the passed in reader unless it has previously been retained.

WARNING: the output ArrowArrayStream MUST BE ZERO INITIALIZED, or the Go garbage collector may error at runtime, due to CGO rules ("the current implementation may sometimes cause a runtime error if the contents of the C memory appear to be a Go pointer"). You have been warned!

func ImportCArray ¶

func ImportCArray(arr *CArrowArray, schema *CArrowSchema) (arrow.Field, arrow.Array, error)

ImportCArray takes a pointer to both a C Data ArrowArray and C Data ArrowSchema in order to import them into usable Go Objects. If err is not nil, then ArrowArrayRelease must still be called on arr to release the memory. The ArrowSchemaRelease will be called on the passed in schema regardless of whether there is an error or not.

The Schema will be copied with the information used to populate the returned Field, complete with metadata. The array will reference the same memory that is referred to by the ArrowArray object and take ownership of it as per ImportCArrayWithType. The returned arrow.Array will own the C memory and call ArrowArrayRelease when the array.Data object is cleaned up.

NOTE: The array takes ownership of the underlying memory buffers via ArrowArrayMove, it does not take ownership of the actual arr object itself.

func ImportCArrayStream deprecated

func ImportCArrayStream(stream *CArrowArrayStream, schema *arrow.Schema) arrio.Reader

ImportCArrayStream creates an arrio.Reader from an ArrowArrayStream taking ownership of the underlying stream object via ArrowArrayStreamMove.

The records returned by this reader must be released manually after they are returned. The reader itself will release the stream via SetFinalizer when it is garbage collected. It will return (nil, io.EOF) from the Read function when there are no more records to return.

NOTE: The reader takes ownership of the underlying memory buffers via ArrowArrayStreamMove, it does not take ownership of the actual stream object itself.

Deprecated: This will panic if importing the schema fails (which is possible). Prefer ImportCRecordReader instead.

func ImportCArrayWithType ¶

func ImportCArrayWithType(arr *CArrowArray, dt arrow.DataType) (arrow.Array, error)

ImportCArrayWithType takes a pointer to a C Data ArrowArray and interprets the values as an array with the given datatype. If err is not nil, then ArrowArrayRelease must still be called on arr to release the memory.

The underlying buffers will not be copied, but will instead be referenced directly by the resulting array interface object. The passed in ArrowArray will have it's ownership transferred to the resulting arrow.Array via ArrowArrayMove. The underlying array.Data object that is owned by the Array will now be the owner of the memory pointer and will call ArrowArrayRelease when it is released and garbage collected via runtime.SetFinalizer.

NOTE: The array takes ownership of the underlying memory buffers via ArrowArrayMove, it does not take ownership of the actual arr object itself.

func ImportCArrowField ¶

func ImportCArrowField(out *CArrowSchema) (arrow.Field, error)

ImportCArrowField takes in an ArrowSchema from the C Data interface, it will copy the metadata and type definitions rather than keep direct references to them. It is safe to call C.ArrowSchemaRelease after receiving the field from this function.

func ImportCArrowSchema ¶

func ImportCArrowSchema(out *CArrowSchema) (*arrow.Schema, error)

ImportCArrowSchema takes in the ArrowSchema from the C Data Interface, it will copy the metadata and schema definitions over from the C object rather than keep direct references to them. This function will call ArrowSchemaRelease on the passed in schema regardless of whether or not there is an error returned.

This version is intended to take in a schema for a record batch, which means that the top level of the schema should be a struct of the schema fields. If importing a single array's schema, then use ImportCArrowField instead.

func ImportCRecordBatch ¶

func ImportCRecordBatch(arr *CArrowArray, sc *CArrowSchema) (arrow.Record, error)

ImportCRecordBatch imports an ArrowArray from C as a record batch. If err is not nil, then ArrowArrayRelease must still be called to release the memory.

A record batch is represented in the C Data Interface as a Struct Array whose fields are the columns of the record batch. Thus after importing the schema passed in here, if it is not a Struct type, this will return an error. As with ImportCArray, the columns in the record batch will take ownership of the CArrowArray memory if successful. Since ArrowArrayMove is used, it's still safe to call ArrowArrayRelease on the source regardless. But if there is an error, it *MUST* be called to ensure there is no memory leak.

NOTE: The array takes ownership of the underlying memory buffers via ArrowArrayMove, it does not take ownership of the actual arr object itself.

func ImportCRecordBatchWithSchema ¶

func ImportCRecordBatchWithSchema(arr *CArrowArray, sc *arrow.Schema) (arrow.Record, error)

ImportCRecordBatchWithSchema is used for importing a Record Batch array when the schema is already known such as when receiving record batches through a stream.

All of the semantics regarding memory ownership are the same as when calling ImportCRecordBatch directly with a schema.

NOTE: The array takes ownership of the underlying memory buffers via ArrowArrayMove, it does not take ownership of the actual arr object itself.

func ImportCRecordReader ¶

func ImportCRecordReader(stream *CArrowArrayStream, schema *arrow.Schema) (arrio.Reader, error)

ImportCStreamReader creates an arrio.Reader from an ArrowArrayStream taking ownership of the underlying stream object via ArrowArrayStreamMove.

The records returned by this reader must be released manually after they are returned. The reader itself will release the stream via SetFinalizer when it is garbage collected. It will return (nil, io.EOF) from the Read function when there are no more records to return.

NOTE: The reader takes ownership of the underlying memory buffers via ArrowArrayStreamMove, it does not take ownership of the actual stream object itself.

func ReleaseCArrowArray ¶

func ReleaseCArrowArray(arr *CArrowArray)

ReleaseCArrowArray calls ArrowArrayRelease on the passed in cdata array

func ReleaseCArrowSchema ¶

func ReleaseCArrowSchema(schema *CArrowSchema)

ReleaseCArrowSchema calls ArrowSchemaRelease on the passed in cdata schema

Types ¶

type CArrowArray ¶

type CArrowArray = C.struct_ArrowArray

CArrowArray is the C Data Interface object for Arrow Arrays as defined in abi.h

func ArrayFromPtr ¶

func ArrayFromPtr(ptr uintptr) *CArrowArray

ArrayFromPtr is a simple helper function to cast a uintptr to a *CArrowArray

type CArrowArrayStream ¶

type CArrowArrayStream = C.struct_ArrowArrayStream

CArrowArrayStream is the C Stream Interface object for handling streams of record batches.

type CArrowSchema ¶

type CArrowSchema = C.struct_ArrowSchema

CArrowSchema is the C Data Interface for ArrowSchemas defined in abi.h

func SchemaFromPtr ¶

func SchemaFromPtr(ptr uintptr) *CArrowSchema

SchemaFromPtr is a simple helper function to cast a uintptr to a *CArrowSchema

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL