arrow_schemagen

v0.1.0
Published: Jul 21, 2023 License: Apache-2.0 Imports: 7 Imported by: 0

README

arrow_schemagen

Generate an Apache Arrow schema from an Avro schema or an arbitrary map. Use with the Apache Arrow Go package v13.

How to use:

Pass in an Avro schema JSON and receive a *arrow.Schema

package main

import (
	"fmt"

	asg "github.com/loicalleyne/arrow_schemagen"
)

// avroSchemaJSON is an illustrative placeholder; substitute your own Avro schema.
const avroSchemaJSON = `{"type":"record","name":"example","fields":[{"name":"id","type":"long"}]}`

func main() {
	// ArrowSchemaFromAvro returns a new Arrow schema from an Avro schema JSON.
	// Per the documented signature it takes the raw JSON bytes.
	// If the top level is a record type, includeTopLevel determines whether
	// its fields become top-level fields in the resulting schema or are
	// nested in a single field.
	// The result is a *arrow.Schema from github.com/apache/arrow/go/v13/arrow.
	schema, err := asg.ArrowSchemaFromAvro([]byte(avroSchemaJSON), false)
	if err != nil {
		// deal with error
		return
	}
	fmt.Printf("%v\n", schema.String())
}

Pass in a map[string]interface{} and receive a *arrow.Schema

package main

import (
	"fmt"

	asg "github.com/loicalleyne/arrow_schemagen"
)

func main() {
	var sentReq = map[string]interface{}{
		"request": map[string]interface{}{
			"datetime":    "2021-07-27 02:59:59",
			"ip":          "34.67.160.53",
			"host":        "domain.com",
			"uri":         "/api/v1/get_stuff/xml",
			"request_uri": "/api/v1/get_stuff/xml",
			"referer":     "",
			"useragent":   "",
		},
		"resource": map[string]interface{}{
			"id": 86233,
			"ids": []interface{}{
				132, 453535, 13412341,
			},
			"external_id": "string:215426709",
			"width":       1080,
			"height":      1920,
		},
	}
	schema, err := asg.ArrowSchemaFromMap(sentReq)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%v\n\n", schema.String())
}
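
Example output (note that struct field order differs from the map literal; Go map iteration order is unspecified):

schema: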
  fields: 2
    - request: type=struct<request_uri: utf8, referer: utf8, useragent: utf8, datetime: utf8, ip: utf8, host: utf8, uri: utf8>
    - resource: type=struct<ids: list<item: int64, nullable>, external_id: utf8, width: int64, height: int64, id: int64>

Documentation

Overview

Package arrow_schemagen generates an Apache Arrow schema from an Avro schema or from a map[string]interface{}.

Functions

func ArrowSchemaFromAvro

func ArrowSchemaFromAvro(avroSchema []byte, includeTopLevel bool) (*arrow.Schema, error)

ArrowSchemaFromAvro returns a new Arrow schema from an Avro schema JSON. If the top level is a record type, includeTopLevel determines whether its fields become top-level fields in the resulting schema or are nested in a single field.
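
Because includeTopLevel only matters when the top level of the Avro schema is a record, it can help to check that before calling ArrowSchemaFromAvro. The following is a minimal sketch using only the standard library; topLevelIsRecord is a hypothetical helper written for illustration and is not part of arrow_schemagen.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// topLevelIsRecord is a hypothetical helper (not part of arrow_schemagen).
// It reports whether the Avro schema JSON declares a record at its top level.
func topLevelIsRecord(avroSchemaJSON []byte) (bool, error) {
	var v interface{}
	if err := json.Unmarshal(avroSchemaJSON, &v); err != nil {
		return false, err
	}
	// An Avro schema may be a JSON string, an array (union), or an object;
	// only an object can carry "type": "record".
	obj, ok := v.(map[string]interface{})
	if !ok {
		return false, nil
	}
	t, _ := obj["type"].(string)
	return t == "record", nil
}

func main() {
	// Illustrative schema, not taken from the package's documentation.
	schema := []byte(`{"type":"record","name":"request","fields":[{"name":"id","type":"long"}]}`)
	isRecord, err := topLevelIsRecord(schema)
	if err != nil {
		panic(err)
	}
	fmt.Println(isRecord) // prints "true"
}
```

The same []byte value can then be passed to ArrowSchemaFromAvro along with the chosen includeTopLevel setting.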

func ArrowSchemaFromMap

func ArrowSchemaFromMap(m map[string]interface{}) (*arrow.Schema, error)

ArrowSchemaFromMap returns a new Arrow schema from an arbitrary map[string]interface{}.
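
To give a feel for what inferring a schema from a nested map involves, here is a rough standard-library sketch that walks a map[string]interface{} and prints Arrow-style type names. The describe function is hypothetical and is not the library's implementation; the type names it returns, the choice to inspect only the first list element, and the binary fallback for empty lists are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// describe is a hypothetical helper (not the library's implementation).
// It returns an Arrow-style type name for a Go value found in a map.
func describe(v interface{}) string {
	switch t := v.(type) {
	case map[string]interface{}:
		keys := make([]string, 0, len(t))
		for k := range t {
			keys = append(keys, k)
		}
		sort.Strings(keys) // sort keys so the output is deterministic
		fields := make([]string, 0, len(keys))
		for _, k := range keys {
			fields = append(fields, k+": "+describe(t[k]))
		}
		return "struct<" + strings.Join(fields, ", ") + ">"
	case []interface{}:
		if len(t) == 0 {
			return "list<item: binary>" // assumption: nothing to inspect, use the catchall
		}
		return "list<item: " + describe(t[0]) + ">" // assumption: first element decides
	case string:
		return "utf8"
	case bool:
		return "bool"
	case int, int64:
		return "int64"
	case float64:
		return "float64"
	default:
		return "binary" // catchall, including nil
	}
}

func main() {
	m := map[string]interface{}{
		"resource": map[string]interface{}{
			"id":          86233,
			"ids":         []interface{}{132, 453535},
			"external_id": "string:215426709",
		},
	}
	fmt.Println(describe(m))
	// prints: struct<resource: struct<external_id: utf8, id: int64, ids: list<item: int64>>>
}
```

Sorting the keys is a choice made here for reproducible output; the example output in the README suggests the library itself does not sort them.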

func AvroPrimitiveToArrowType

func AvroPrimitiveToArrowType(avroFieldType string) arrow.DataType

AvroPrimitiveToArrowType returns the Arrow DataType equivalent to an Avro primitive type.

NOTE: Arrow Binary type is used as a catchall to avoid potential data loss.
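
As an illustration of the catchall idea, the sketch below maps the eight Avro primitive type names to Arrow type names as plain strings and falls back to binary for anything else. arrowNameForAvro is a hypothetical stand-in; the exact mapping used by AvroPrimitiveToArrowType is an assumption here and may differ.

```go
package main

import "fmt"

// arrowNameForAvro is a hypothetical helper (not part of arrow_schemagen).
// It returns an Arrow type name for an Avro primitive type name and uses
// "binary" as the catchall for anything it does not recognize.
func arrowNameForAvro(avroType string) string {
	switch avroType {
	case "null":
		return "null"
	case "boolean":
		return "bool"
	case "int":
		return "int32"
	case "long":
		return "int64"
	case "float":
		return "float32"
	case "double":
		return "float64"
	case "bytes":
		return "binary"
	case "string":
		return "utf8"
	default:
		return "binary" // catchall: keep the raw bytes rather than drop data
	}
}

func main() {
	for _, t := range []string{"long", "string", "something-else"} {
		fmt.Printf("%s -> %s\n", t, arrowNameForAvro(t))
	}
}
```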

func GoPrimitiveToArrowType

func GoPrimitiveToArrowType(goType string) arrow.DataType

GoPrimitiveToArrowType returns the Arrow DataType equivalent to a Go primitive type.

NOTE: The intended use case is generating an Arrow schema from arbitrary JSON unmarshaled into a map[string]interface{}. The same schema would then be reused for other JSON messages with the same structure. Because a field that is nil in the map used as a schema template could be populated in subsequent messages, the Go nil type is mapped to the Arrow Binary type as a catchall to avoid losing data.
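
The sketch below shows which dynamic Go types encoding/json produces when a message is unmarshaled into a map[string]interface{}, including the nil case described above. goTypeName is a hypothetical helper for printing those types; the exact strings that GoPrimitiveToArrowType expects for goType are not documented here, so treat the names printed as illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// goTypeName is a hypothetical helper (not part of arrow_schemagen).
// It returns the dynamic Go type of v as a string, or "nil" for a nil value.
func goTypeName(v interface{}) string {
	if v == nil {
		return "nil"
	}
	return fmt.Sprintf("%T", v)
}

func main() {
	// Illustrative message with a null field, as a schema template might have.
	msg := []byte(`{"id": 86233, "host": "domain.com", "referer": null}`)

	var m map[string]interface{}
	if err := json.Unmarshal(msg, &m); err != nil {
		panic(err)
	}
	for _, k := range []string{"id", "host", "referer"} {
		fmt.Printf("%s: %s\n", k, goTypeName(m[k]))
	}
}
```

Note that encoding/json decodes every JSON number into float64 and every JSON null into nil, so a map built from unmarshaled JSON carries different Go types than a map literal written with integer constants.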
