arrowpb

package module
v1.6.0
Published: Feb 2, 2025 License: MIT Imports: 22 Imported by: 0

README

arrowpb

Convert Apache Arrow records to Protocol Buffers

Examples

For examples, see the arrowpb-example repository.

Author

arrowpb is developed by @TFMV.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ArrowReaderToProtos added in v1.1.0

func ArrowReaderToProtos(ctx context.Context, reader array.RecordReader, msgDesc protoreflect.MessageDescriptor, cfg *ConvertConfig) ([][]byte, error)

func ArrowSchemaToFileDescriptorProto added in v1.1.0

func ArrowSchemaToFileDescriptorProto(schema *arrow.Schema, packageName, messagePrefix string, cfg *ConvertConfig) (*descriptorpb.FileDescriptorProto, error)

func CompileFileDescriptorProto added in v1.1.0

func CompileFileDescriptorProto(fdp *descriptorpb.FileDescriptorProto) (protoreflect.FileDescriptor, error)

CompileFileDescriptorProto merges your generated FileDescriptorProto with the well-known type (WKT) descriptors so that references to google.protobuf.Timestamp and similar types can be resolved.

func CompileFileDescriptorProtoWithRetry added in v1.1.0

func CompileFileDescriptorProtoWithRetry(fdp *descriptorpb.FileDescriptorProto) (protoreflect.FileDescriptor, error)

CompileFileDescriptorProtoWithRetry wraps CompileFileDescriptorProto in an exponential backoff.
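
The retry count and delays used by this function are not documented here, so the following is only a minimal sketch of the general exponential backoff pattern, using the standard library alone. The helper retryWithBackoff, the error errTransient, and the attempt and delay values are all hypothetical and are not part of arrowpb.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical error standing in for a failed compile step.
var errTransient = errors.New("transient failure")

// retryWithBackoff is a hypothetical helper, not part of arrowpb.
// It calls fn up to maxAttempts times and doubles the delay after each failure.
func retryWithBackoff(maxAttempts int, initial time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		return errors.New("maxAttempts must be at least 1")
	}
	delay := initial
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		// Sleep only when another attempt will follow.
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Stand-in for a compile step that fails twice before succeeding.
	err := retryWithBackoff(5, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Wrapping the last error with %w keeps the underlying cause available to errors.Is and errors.As after the retries are exhausted.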

func ConvertInParallel added in v1.1.0

func ConvertInParallel(ctx context.Context, record arrow.Record, msgDesc protoreflect.MessageDescriptor, concurrency int, cfg *ConvertConfig) ([][]byte, error)

ConvertInParallel converts a single Arrow Record in parallel by splitting its rows into chunked ranges.
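
How the library sizes its chunks is not stated here. As an illustration of the general idea, the sketch below splits row indices into contiguous ranges and processes each range in its own goroutine. The names rowRange, splitRows, and convertRows are hypothetical, and the encode callback stands in for a per-row conversion; none of them are arrowpb APIs.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// rowRange is a half-open interval [Start, End) of row indices.
type rowRange struct{ Start, End int }

// splitRows divides numRows into at most `workers` contiguous ranges.
func splitRows(numRows, workers int) []rowRange {
	if numRows <= 0 {
		return nil
	}
	if workers < 1 {
		workers = 1
	}
	if workers > numRows {
		workers = numRows
	}
	chunk := (numRows + workers - 1) / workers // ceiling division
	ranges := make([]rowRange, 0, workers)
	for start := 0; start < numRows; start += chunk {
		end := start + chunk
		if end > numRows {
			end = numRows
		}
		ranges = append(ranges, rowRange{start, end})
	}
	return ranges
}

// convertRows is a hypothetical stand-in for a parallel conversion. Results
// go into a preallocated slice, so output order matches row order.
func convertRows(ctx context.Context, numRows, workers int, encode func(row int) ([]byte, error)) ([][]byte, error) {
	out := make([][]byte, numRows)
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()
	for _, r := range splitRows(numRows, workers) {
		wg.Add(1)
		go func(r rowRange) {
			defer wg.Done()
			for i := r.Start; i < r.End; i++ {
				if err := ctx.Err(); err != nil {
					once.Do(func() { firstErr = err })
					return
				}
				b, err := encode(i)
				if err != nil {
					// Record the first failure and stop the other workers.
					once.Do(func() { firstErr = err; cancel() })
					return
				}
				out[i] = b // each goroutine writes only its own indices
			}
		}(r)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return out, nil
}

func main() {
	rows, err := convertRows(context.Background(), 10, 4, func(row int) ([]byte, error) {
		return []byte(fmt.Sprintf("row-%d", row)), nil
	})
	fmt.Println(len(rows), err)
}
```

Writing into disjoint indices of a shared slice avoids a mutex on the result and preserves row order without a final sort.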

func CreateArrowRecord added in v0.2.0

func CreateArrowRecord() (array.RecordReader, error)

func ExtractArrowValue added in v1.1.0

func ExtractArrowValue(col arrow.Array, rowIndex int) interface{}

func FormatArrowJSON

func FormatArrowJSON(reader array.RecordReader, output io.Writer) error

func GenerateUniqueMessageName added in v1.1.0

func GenerateUniqueMessageName(prefix string) string

func GetTopLevelMessageDescriptor added in v1.1.0

func GetTopLevelMessageDescriptor(fd protoreflect.FileDescriptor) (protoreflect.MessageDescriptor, error)

GetTopLevelMessageDescriptor fetches the first message from a compiled FileDescriptor.

func RecordToDynamicProtos added in v1.1.0

func RecordToDynamicProtos(rec arrow.Record, msgDesc protoreflect.MessageDescriptor, cfg *ConvertConfig) ([][]byte, error)

func RowToDynamicProto added in v1.1.0

func RowToDynamicProto(record arrow.Record, msgDesc protoreflect.MessageDescriptor, rowIndex int, cfg *ConvertConfig) (*dynamicpb.Message, error)

Types

type ConvertConfig added in v1.1.0

type ConvertConfig struct {
	UseWellKnownTimestamps bool // arrow.TIMESTAMP => google.protobuf.Timestamp
	UseProto2Syntax        bool // changes FileDescriptorProto syntax to "proto2"
	UseWrapperTypes        bool // arrow scalars => google.protobuf.*Value
	MapDictionariesToEnums bool // dictionary => enum

	// DescriptorCache caches descriptors for repeated schemas so the same descriptor can be reused
	DescriptorCache sync.Map
}

ConvertConfig allows fine-grained control over Arrow => Protobuf schema generation.
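
The documentation does not say how DescriptorCache is keyed or populated. The sketch below shows one common way to use a sync.Map as a build-once cache, assuming a string key derived from the schema. The types descriptor and config and the function descriptorFor are hypothetical stand-ins, not arrowpb APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// descriptor is a hypothetical stand-in for a compiled message descriptor.
type descriptor struct{ Name string }

// config mirrors only the cache field of ConvertConfig.
type config struct {
	DescriptorCache sync.Map // assumed mapping: schema key -> *descriptor
}

// descriptorFor returns the cached descriptor for key, building it on a miss.
// build may run more than once under contention, but every caller receives
// the same stored value because LoadOrStore keeps the first one.
func descriptorFor(cfg *config, key string, build func() *descriptor) *descriptor {
	if v, ok := cfg.DescriptorCache.Load(key); ok {
		return v.(*descriptor)
	}
	v, _ := cfg.DescriptorCache.LoadOrStore(key, build())
	return v.(*descriptor)
}

func main() {
	cfg := &config{} // shared by pointer: a sync.Map must not be copied after first use
	builds := 0
	build := func() *descriptor {
		builds++
		return &descriptor{Name: "Row"}
	}
	a := descriptorFor(cfg, "id:int64,name:utf8", build)
	b := descriptorFor(cfg, "id:int64,name:utf8", build)
	fmt.Println(a == b, builds)
}
```

Because the struct holds a sync.Map by value, a config of this shape is best created once and shared by pointer, which matches the *ConvertConfig parameters in the function signatures above.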
