Documentation ¶
Overview ¶
Package avro is an AVRO encoder and decoder aimed principally at decoding AVRO output from Google's BigQuery. It encodes directly from Go structs and decodes directly into Go structs, and uses json tags as naming hints.
The primary decoding interface is ReadFile. This reads an AVRO file, combining the schema in the file with type information from the struct passed via the out parameter to decode the records. It then passes a pointer to a struct of the type of out to the callback cb for each record in the file.
Use Encoder to encode a file ¶
You can implement custom decoders for your own types and register them via the Register function. github.com/phil/avro/null provides example custom decoders for the types defined in github.com/unravelin/null.
Index ¶
- Variables
- func ReadFile(r Reader, out interface{}, cb func(val unsafe.Pointer, rb *ResourceBank) error) error
- func Register(typ reflect.Type, f CodecBuildFunc)
- func RegisterSchema(typ reflect.Type, s Schema)
- type BoolCodec
- type BytesCodec
- type Codec
- type CodecBuildFunc
- type Compression
- type DoubleCodec
- type Encoder
- type FileHeader
- type FileWriter
- type Float32DoubleCodec
- type FloatCodec
- type Int16Codec
- type Int32Codec
- type Int64Codec
- type IntCodec
- type MapCodec
- type PointerCodec
- type ReadBuf
- func (d *ReadBuf) Alloc(rtyp reflect.Type) unsafe.Pointer
- func (d *ReadBuf) ExtractResourceBank() *ResourceBank
- func (d *ReadBuf) Len() int
- func (d *ReadBuf) Next(l int) ([]byte, error)
- func (d *ReadBuf) NextAsString(l int) (string, error)
- func (d *ReadBuf) ReadByte() (byte, error)
- func (d *ReadBuf) Reset(data []byte)
- func (d *ReadBuf) Varint() (int64, error)
- type Reader
- type ResourceBank
- type Schema
- type SchemaObject
- type SchemaRecordField
- type StringCodec
- type WriteBuf
Constants ¶
This section is empty.
Variables ¶
var FileMagic = [4]byte{'O', 'b', 'j', 1}
Functions ¶
func ReadFile ¶
func ReadFile(r Reader, out interface{}, cb func(val unsafe.Pointer, rb *ResourceBank) error) error

ReadFile reads from an AVRO file. The records in the file are decoded into structs of the type indicated by out. These are fed back to the application via the cb callback. ReadFile calls cb with a pointer to the struct. The pointer is converted to an unsafe.Pointer. The pointer should not be retained by the application past the return of cb.

var records []myrecord
if err := ReadFile(f, myrecord{}, func(val unsafe.Pointer, rb *ResourceBank) error {
	records = append(records, *(*myrecord)(val))
	return nil
}); err != nil {
	return err
}
func Register ¶
func Register(typ reflect.Type, f CodecBuildFunc)
Register is used to set a custom codec builder for a type
func RegisterSchema ¶ added in v0.0.20
func RegisterSchema(typ reflect.Type, s Schema)

Call RegisterSchema to indicate which schema should be used for a given type. Use this to register the schema for a type for which you have written a custom codec.
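As a sketch of how RegisterSchema might be used (assuming the package is imported as avro; the Celsius type is hypothetical):

```go
package mytypes

import (
	"reflect"

	"github.com/phil/avro" // import path assumed from the null subpackage mentioned above
)

// Celsius is a hypothetical custom type with its own codec.
type Celsius float64

func init() {
	// Tell the encoder to describe Celsius as an AVRO double in generated schemas.
	avro.RegisterSchema(reflect.TypeOf(Celsius(0)), avro.Schema{Type: "double"})
}
```

Registering in an init function ensures the schema is known before any encoding takes place.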
Types ¶
type BytesCodec ¶
type BytesCodec struct {
// contains filtered or unexported fields
}
func (BytesCodec) Skip ¶
func (BytesCodec) Skip(r *ReadBuf) error
type Codec ¶
type Codec interface {
	// Read reads the wire format bytes for the current field from r and sets up
	// the value that p points to. The codec can assume that the memory for an
	// instance of the type for which the codec is registered is present behind
	// p.
	Read(r *ReadBuf, p unsafe.Pointer) error
	// Skip advances the reader over the bytes for the current field.
	Skip(r *ReadBuf) error
	// New creates a pointer to the type for which the codec is registered. It is
	// used if the enclosing record has a field that is a pointer to this type.
	New(r *ReadBuf) unsafe.Pointer
	// Omit returns true if the value that p points to should be omitted from the
	// output. This is used for optional fields in records.
	Omit(p unsafe.Pointer) bool
	// Write writes the wire format bytes for the value that p points to to w.
	Write(w *WriteBuf, p unsafe.Pointer)
}
Codec defines an encoder / decoder for a type. You can write custom Codecs for types. See Register and CodecBuildFunc
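A sketch of what a custom Codec for the decode path might look like, assuming the package is imported as avro. The Timestamp representation (an AVRO long of microseconds since the epoch) is an illustrative assumption, not the package's implementation, and the write side is omitted because WriteBuf's methods are not documented on this page:

```go
package mycodecs

import (
	"reflect"
	"time"
	"unsafe"

	"github.com/phil/avro" // import path assumed from the null subpackage above
)

// timestampCodec decodes an AVRO long holding microseconds since the epoch
// into a time.Time. Purely illustrative.
type timestampCodec struct{}

func (timestampCodec) Read(r *avro.ReadBuf, p unsafe.Pointer) error {
	v, err := r.Varint()
	if err != nil {
		return err
	}
	*(*time.Time)(p) = time.UnixMicro(v)
	return nil
}

func (timestampCodec) Skip(r *avro.ReadBuf) error {
	// A long is a single varint on the wire, so skipping is just reading it.
	_, err := r.Varint()
	return err
}

func (timestampCodec) New(r *avro.ReadBuf) unsafe.Pointer {
	// Allocate via the ReadBuf so the memory lives in its ResourceBank.
	return r.Alloc(reflect.TypeOf(time.Time{}))
}

func (timestampCodec) Omit(p unsafe.Pointer) bool {
	return (*time.Time)(p).IsZero()
}

func (timestampCodec) Write(w *avro.WriteBuf, p unsafe.Pointer) {
	// WriteBuf's methods are not documented on this page, so the encoding
	// side is left out of this sketch.
}
```

Such a codec would be returned from a CodecBuildFunc registered for time.Time via Register.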
type CodecBuildFunc ¶
CodecBuildFunc is the function signature for a codec builder. If you want to customise AVRO decoding for a type, register a CodecBuildFunc via the Register call. schema is the AVRO schema for the type to build. typ should match the type the function was registered under.
type Compression ¶ added in v0.0.18
type Compression string
const (
	CompressionNull    Compression = "null"
	CompressionDeflate Compression = "deflate"
	CompressionSnappy  Compression = "snappy"
)
type DoubleCodec ¶
type DoubleCodec = floatCodec[float64]
type Encoder ¶ added in v0.0.20
type Encoder[T any] struct {
	// contains filtered or unexported fields
}
func NewEncoderFor ¶ added in v0.0.20
func NewEncoderFor[T any](w io.Writer, compression Compression, approxBlockSize int) (*Encoder[T], error)
NewEncoderFor returns a new Encoder. Data will be written to w in AVRO format, including a schema header. The data will be compressed using the specified compression algorithm. Data is written in blocks of at least approxBlockSize bytes. A block is written when it reaches that size, or when Flush is called.
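A usage sketch, assuming the package is imported as avro and a hypothetical myrecord struct. Only NewEncoderFor and Flush appear on this page, so the Encode method name used for appending rows is an assumption:

```go
func writeRecords(w io.Writer, records []myrecord) error {
	enc, err := avro.NewEncoderFor[myrecord](w, avro.CompressionSnappy, 64*1024)
	if err != nil {
		return err
	}
	for i := range records {
		// Encode is an assumed method name for appending one row.
		if err := enc.Encode(&records[i]); err != nil {
			return err
		}
	}
	// Flush writes any buffered partial block to w.
	return enc.Flush()
}
```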
type FileHeader ¶
type FileHeader struct {
	Magic [4]byte           `json:"magic"`
	Meta  map[string][]byte `json:"meta"`
	Sync  [16]byte          `json:"sync"`
}
FileHeader represents an AVRO file header
type FileWriter ¶ added in v0.0.18
type FileWriter struct {
// contains filtered or unexported fields
}
FileWriter provides limited support for writing AVRO files. It allows you to write blocks of already encoded data. Actually encoding data as AVRO is not yet supported.
func NewFileWriter ¶ added in v0.0.18
func NewFileWriter(schema []byte, compression Compression) (*FileWriter, error)
NewFileWriter creates a new FileWriter. The schema is the JSON encoded schema. The compression parameter indicates the compression codec to use.
func (*FileWriter) AppendHeader ¶ added in v0.0.18
func (f *FileWriter) AppendHeader(buf []byte) []byte
AppendHeader appends the AVRO file header to the provided buffer.
func (*FileWriter) WriteBlock ¶ added in v0.0.18
WriteBlock writes a block of data to the writer. The block must be rowCount rows of AVRO encoded data.
func (*FileWriter) WriteHeader ¶ added in v0.0.18
func (f *FileWriter) WriteHeader(w io.Writer) error
WriteHeader writes the AVRO file header to the writer.
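Putting the FileWriter pieces together, a sketch assuming the package is imported as avro. WriteBlock's parameters are not shown on this page, so the call shape below is an assumption:

```go
func writeFile(w io.Writer, schemaJSON []byte, rowCount int, encodedRows []byte) error {
	fw, err := avro.NewFileWriter(schemaJSON, avro.CompressionDeflate)
	if err != nil {
		return err
	}
	// The header (magic, metadata including the schema, and sync marker)
	// must precede any data blocks.
	if err := fw.WriteHeader(w); err != nil {
		return err
	}
	// encodedRows must contain rowCount rows of already AVRO-encoded data.
	return fw.WriteBlock(w, rowCount, encodedRows) // assumed signature
}
```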
type Float32DoubleCodec ¶
type Float32DoubleCodec struct {
DoubleCodec
}
func (Float32DoubleCodec) Omit ¶ added in v0.0.20
func (rc Float32DoubleCodec) Omit(p unsafe.Pointer) bool
type FloatCodec ¶
type FloatCodec = floatCodec[float32]
type Int16Codec ¶
type Int32Codec ¶
type Int64Codec ¶
type IntCodec ¶ added in v0.0.13
IntCodec is an avro codec for int
type MapCodec ¶
type MapCodec struct {
// contains filtered or unexported fields
}
MapCodec is a decoder for map types. Map keys must always be strings
type PointerCodec ¶ added in v0.0.6
type PointerCodec struct {
Codec
}
type ReadBuf ¶ added in v0.0.20
type ReadBuf struct {
// contains filtered or unexported fields
}
ReadBuf is a very simple replacement for bytes.Reader that avoids data copies
func NewReadBuf ¶ added in v0.0.22
NewReadBuf returns a new ReadBuf.
func (*ReadBuf) Alloc ¶ added in v0.0.20
Alloc allocates a pointer to the type rtyp. The data is allocated in a ResourceBank
func (*ReadBuf) ExtractResourceBank ¶ added in v0.0.20
func (d *ReadBuf) ExtractResourceBank() *ResourceBank
ExtractResourceBank extracts the current ResourceBank from the buffer, and replaces it with a fresh one.
func (*ReadBuf) Next ¶ added in v0.0.20
Next returns the next l bytes from the buffer. It does so without copying, so if you hold onto the data you risk holding onto a lot of data. If l exceeds the remaining space Next returns io.EOF
func (*ReadBuf) NextAsString ¶ added in v0.0.20
NextAsString returns the next l bytes from the buffer as a string. The string data is held in a StringBank and will be valid only until someone calls Close on that bank. If l exceeds the remaining space NextAsString returns io.EOF
func (*ReadBuf) ReadByte ¶ added in v0.0.20
ReadByte returns the next byte from the buffer. If no bytes are left it returns io.EOF
type Reader ¶
type Reader interface {
	io.Reader
	io.ByteReader
}
Reader combines io.ByteReader and io.Reader. It's what we need to read an AVRO file
type ResourceBank ¶ added in v0.0.6
type ResourceBank struct {
// contains filtered or unexported fields
}
ResourceBank is used to allocate the memory used to create structs to decode AVRO into. The primary reason for having it is to allow the user to flag that the memory can be re-used, reducing strain on the GC.
We allocate using the required type, so the GC can still inspect within the memory.
func (*ResourceBank) Alloc ¶ added in v0.0.6
func (rb *ResourceBank) Alloc(rtyp reflect.Type) unsafe.Pointer
Alloc reserves some memory in the ResourceBank. Note that this memory may be re-used after Close is called.
func (*ResourceBank) Close ¶ added in v0.0.6
func (rb *ResourceBank) Close()
Close marks the resources in the ResourceBank as available for re-use
func (*ResourceBank) ToString ¶ added in v0.0.6
func (rb *ResourceBank) ToString(in []byte) string
ToString saves string data in the bank and returns a string. The string is valid until someone calls Close
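A sketch of the intended pattern, assuming the package is imported as avro and a hypothetical myrecord type. Whether each callback invocation receives a distinct bank is an assumption here; the point is that a bank must stay alive for as long as the decoded data it backs:

```go
var records []myrecord
var banks []*avro.ResourceBank
err := avro.ReadFile(f, myrecord{}, func(val unsafe.Pointer, rb *avro.ResourceBank) error {
	records = append(records, *(*myrecord)(val))
	// The copied struct may still reference memory owned by rb (e.g. string
	// data), so keep the bank alive while records is in use.
	banks = append(banks, rb)
	return nil
})
if err != nil {
	return err
}
use(records)
// Done with the decoded data: allow the memory to be re-used.
for _, rb := range banks {
	rb.Close()
}
```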
type Schema ¶
type Schema struct {
	Type   string
	Object *SchemaObject
	Union  []Schema
}
Schema is a representation of AVRO schema JSON. Primitive types populate Type only. Union types populate the Type and Union fields. All other types populate Type and a subset of the Object fields.
Note that the jsoniter fuzzy decoders (github.com/json-iterator/go/extra RegisterFuzzyDecoders) break decoding of Schema objects: toleration of arrays as structs breaks decoding of unions.
func FileSchema ¶ added in v0.0.11
FileSchema reads the Schema from an AVRO file.
func SchemaForType ¶ added in v0.0.20
SchemaForType returns a Schema for the given type. It aims to produce a Schema that's compatible with BigQuery.
func SchemaFromString ¶ added in v0.0.14
SchemaFromString decodes a JSON string into a Schema
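A sketch of parsing a schema, assuming the package is imported as avro; the (Schema, error) return shape is an assumption based on the description above:

```go
s, err := avro.SchemaFromString(`{
	"type": "record",
	"name": "weather",
	"fields": [
		{"name": "station", "type": "string"},
		{"name": "temp", "type": ["null", "long"]}
	]
}`)
if err != nil {
	return err
}
// s.Type is "record" and s.Object.Fields describes the two fields; the
// "temp" field's Schema has its Union slice populated with the two branches.
```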
type SchemaObject ¶
type SchemaObject struct {
	Type        string `json:"type"`
	LogicalType string `json:"logicalType,omitempty"`
	Name        string `json:"name,omitempty"`
	Namespace   string `json:"namespace,omitempty"`

	// Fields in a record
	Fields []SchemaRecordField `json:"fields,omitempty"`
	// The type of each item in an array
	Items Schema `json:"items,omitempty"`
	// The value types of a map (keys are strings)
	Values Schema `json:"values,omitempty"`
	// The size of a fixed type
	Size int `json:"size,omitempty"`
	// The values of an enum
	Symbols []string `json:"symbols,omitempty"`
}
SchemaObject contains all the fields of more complex schema types
type SchemaRecordField ¶
type SchemaRecordField struct {
	Name string `json:"name,omitempty"`
	Type Schema `json:"type,omitempty"`
}
SchemaRecordField represents one field of a Record schema
type StringCodec ¶
type StringCodec struct {
// contains filtered or unexported fields
}
StringCodec is a decoder for strings
func (StringCodec) Skip ¶
func (StringCodec) Skip(r *ReadBuf) error
type WriteBuf ¶ added in v0.0.20
type WriteBuf struct {
// contains filtered or unexported fields
}
WriteBuf is a simple, append only, replacement for bytes.Buffer.