Documentation ¶
Overview ¶
Package avro is an AVRO decoder aimed principally at decoding AVRO output from Google's BigQuery. It decodes directly into Go structs, and uses json tags as naming hints.
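The idea of using json tags as naming hints can be sketched with the standard reflect package. This is only an illustration, not the package's actual lookup code: the event type, the fieldNames helper, and the rule "use the json tag name if present, otherwise the Go field name" are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// event is a hypothetical record type used only for this illustration.
type event struct {
	UserID string `json:"user_id"`
	Amount int64  `json:"amount,omitempty"`
	Note   string // no tag: falls back to the Go field name
}

// fieldNames maps the name a schema field would be matched against to the Go
// field name. Assumed rule (not confirmed by the package docs): the json tag
// name wins if present, otherwise the Go field name is used.
func fieldNames(t reflect.Type) map[string]string {
	names := make(map[string]string)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		// Drop options such as ",omitempty" from the tag.
		name, _, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "" {
			name = f.Name
		}
		names[name] = f.Name
	}
	return names
}

// eventFieldNames applies fieldNames to the hypothetical event type.
func eventFieldNames() map[string]string {
	return fieldNames(reflect.TypeOf(event{}))
}

func main() {
	for schemaName, goName := range eventFieldNames() {
		fmt.Printf("%s -> %s\n", schemaName, goName)
	}
}
```

With a mapping like this, a BigQuery column called user_id could be decoded into a field named UserID without renaming either side.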
The primary interface to the package is ReadFile. This reads an AVRO file, combining the schema in the file with type information from the struct passed via the out parameter to decode the records. It then passes an instance of a struct of type out to the callback cb for each record in the file.
You can implement custom decoders for your own types and register them via the Register function. github.com/phil/avro/null is an example: it provides custom decoders for the types defined in github.com/unravelin/null.
Index ¶
- func ReadFile(r Reader, out interface{}, cb func(val unsafe.Pointer, rb *ResourceBank) error) error
- func Register(typ reflect.Type, f CodecBuildFunc)
- type BoolCodec
- type Buffer
- func (d *Buffer) Alloc(rtyp reflect.Type) unsafe.Pointer
- func (d *Buffer) ExtractResourceBank() *ResourceBank
- func (d *Buffer) Len() int
- func (d *Buffer) Next(l int) ([]byte, error)
- func (d *Buffer) NextAsString(l int) (string, error)
- func (d *Buffer) ReadByte() (byte, error)
- func (d *Buffer) Reset(data []byte)
- func (d *Buffer) Varint() (int64, error)
- type BytesCodec
- type Codec
- type CodecBuildFunc
- type DoubleCodec
- type FileHeader
- type Float32DoubleCodec
- type FloatCodec
- type Int16Codec
- type Int32Codec
- type Int64Codec
- type IntCodec
- type MapCodec
- type PointerCodec
- type Reader
- type ResourceBank
- type Schema
- type SchemaObject
- type SchemaRecordField
- type StringCodec
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ReadFile ¶
ReadFile reads from an AVRO file. The records in the file are decoded into structs of the type indicated by out. These are fed back to the application via the cb callback. ReadFile calls cb with a pointer to the struct. The pointer is converted to an unsafe.Pointer. The pointer should not be retained by the application past the return of cb.
var records []myrecord
if err := ReadFile(f, myrecord{}, func(val unsafe.Pointer, rb *ResourceBank) error {
	// Copy the record out: val must not be retained after the callback returns.
	records = append(records, *(*myrecord)(val))
	return nil
}); err != nil {
	return err
}
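The rule that the pointer must not outlive the callback can be demonstrated without the package itself. In the sketch below, feed is a hypothetical stand-in for the decoding loop (it is not ReadFile, and it omits the *ResourceBank argument listed in the index); it deliberately reuses one struct for every record, so only a value copy taken inside the callback stays correct.

```go
package main

import (
	"fmt"
	"unsafe"
)

// myrecord is a hypothetical record type for this illustration.
type myrecord struct {
	Name  string
	Count int64
}

// feed is a hypothetical stand-in for a decoding loop. It reuses a single
// myrecord for every callback, which is one way a retained pointer could end
// up pointing at stale data.
func feed(names []string, cb func(val unsafe.Pointer) error) error {
	var scratch myrecord
	for i, n := range names {
		scratch = myrecord{Name: n, Count: int64(i)}
		if err := cb(unsafe.Pointer(&scratch)); err != nil {
			return err
		}
	}
	return nil
}

// collect copies each record by value before the callback returns.
func collect(names []string) ([]myrecord, error) {
	var records []myrecord
	err := feed(names, func(val unsafe.Pointer) error {
		records = append(records, *(*myrecord)(val)) // value copy, safe to keep
		return nil
	})
	return records, err
}

func main() {
	records, err := collect([]string{"a", "b"})
	fmt.Println(records, err)
}
```

Note that a value copy only copies the struct itself. According to the NextAsString documentation, string data is valid only until Close is called on the bank that holds it, so string fields in a copied record may need the same care.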
func Register ¶
func Register(typ reflect.Type, f CodecBuildFunc)
Register sets a custom codec builder for a type.
Types ¶
type Buffer ¶ added in v0.0.3
type Buffer struct {
// contains filtered or unexported fields
}
Buffer is a very simple replacement for bytes.Reader that avoids data copies
func (*Buffer) Alloc ¶ added in v0.0.6
Alloc allocates a pointer to the type rtyp. The data is allocated in a ResourceBank
func (*Buffer) ExtractResourceBank ¶ added in v0.0.14
func (d *Buffer) ExtractResourceBank() *ResourceBank
ExtractResourceBank extracts the current ResourceBank from the buffer, and replaces it with a fresh one.
func (*Buffer) Next ¶ added in v0.0.3
Next returns the next l bytes from the buffer. It does so without copying, so if you hold onto the returned slice you risk keeping the whole underlying buffer alive. If l exceeds the remaining space, Next returns io.EOF.
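The zero-copy behaviour described for Next can be sketched as follows. sliceBuffer is a hypothetical, simplified stand-in written for this example, not the package's Buffer; it only shows that the returned slice aliases the underlying data and that an explicit copy is needed to detach from it.

```go
package main

import (
	"fmt"
	"io"
)

// sliceBuffer is a hypothetical, simplified stand-in for Buffer.
type sliceBuffer struct {
	data []byte
	pos  int
}

// Next returns the next l bytes as a sub-slice of the underlying data, without
// copying. It returns io.EOF if fewer than l bytes remain.
func (b *sliceBuffer) Next(l int) ([]byte, error) {
	if l < 0 || l > len(b.data)-b.pos {
		return nil, io.EOF
	}
	out := b.data[b.pos : b.pos+l]
	b.pos += l
	return out, nil
}

func main() {
	data := []byte("hello world")
	b := &sliceBuffer{data: data}

	chunk, _ := b.Next(5)
	kept := append([]byte(nil), chunk...) // private copy, detached from data

	data[0] = 'J' // mutate the backing array
	fmt.Printf("%s %s\n", chunk, kept) // prints "Jello hello"

	_, err := b.Next(100)
	fmt.Println(err == io.EOF) // prints "true"
}
```

Copying as in kept is what allows the large source buffer to be garbage collected when only a small piece of it is needed long term.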
func (*Buffer) NextAsString ¶ added in v0.0.6
NextAsString returns the next l bytes from the buffer as a string. The string data is held in a ResourceBank and will be valid only until someone calls Close on that bank. If l exceeds the remaining space, NextAsString returns io.EOF.
func (*Buffer) ReadByte ¶ added in v0.0.3
ReadByte returns the next byte from the buffer. If no bytes are left it returns io.EOF
type BytesCodec ¶
type BytesCodec struct{}
func (BytesCodec) Skip ¶
func (BytesCodec) Skip(r *Buffer) error
type Codec ¶
type Codec interface {
	// Read reads the wire format bytes for the current field from r and sets up
	// the value that p points to. The codec can assume that the memory for an
	// instance of the type for which the codec is registered is present behind
	// p
	Read(r *Buffer, p unsafe.Pointer) error
	// Skip advances the reader over the bytes for the current field.
	Skip(r *Buffer) error
	// New creates a pointer to the type for which the codec is registered. It is
	// used if the enclosing record has a field that is a pointer to this type
	New(r *Buffer) unsafe.Pointer
}
Codec defines a decoder for a type. It may eventually define an encoder too. You can write custom Codecs for types. See Register and CodecBuildFunc
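To make the three methods concrete, here is a minimal sketch of a codec for the AVRO boolean type, which the AVRO specification encodes as a single byte (0 or 1). The miniBuffer and codec types are local stand-ins that mirror the shapes documented above so the example compiles on its own; they are not the package's Buffer and Codec, and the New method here uses plain new rather than a ResourceBank.

```go
package main

import (
	"fmt"
	"io"
	"unsafe"
)

// miniBuffer is a hypothetical stand-in for Buffer with only ReadByte.
type miniBuffer struct {
	data []byte
	pos  int
}

func (b *miniBuffer) ReadByte() (byte, error) {
	if b.pos >= len(b.data) {
		return 0, io.EOF
	}
	c := b.data[b.pos]
	b.pos++
	return c, nil
}

// codec mirrors the documented Codec interface, using the stand-in buffer.
type codec interface {
	Read(r *miniBuffer, p unsafe.Pointer) error
	Skip(r *miniBuffer) error
	New(r *miniBuffer) unsafe.Pointer
}

// boolCodec decodes an AVRO boolean: one byte, zero meaning false.
type boolCodec struct{}

// Read assumes p points at memory for a bool, as the interface contract states.
func (boolCodec) Read(r *miniBuffer, p unsafe.Pointer) error {
	c, err := r.ReadByte()
	if err != nil {
		return err
	}
	*(*bool)(p) = c != 0
	return nil
}

// Skip advances past the single byte without storing it.
func (boolCodec) Skip(r *miniBuffer) error {
	_, err := r.ReadByte()
	return err
}

// New returns memory for a bool (simplified: ordinary heap allocation).
func (boolCodec) New(r *miniBuffer) unsafe.Pointer {
	return unsafe.Pointer(new(bool))
}

var _ codec = boolCodec{} // compile-time interface check

// decodeBool runs the codec over data and returns the decoded value.
func decodeBool(data []byte) (bool, error) {
	var c codec = boolCodec{}
	var v bool
	err := c.Read(&miniBuffer{data: data}, unsafe.Pointer(&v))
	return v, err
}

func main() {
	v, err := decodeBool([]byte{1})
	fmt.Println(v, err)
}
```

Skip matters when the file's schema contains a field that the target struct does not: the decoder still has to consume that field's bytes to reach the next one.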
type CodecBuildFunc ¶
CodecBuildFunc is the function signature for a codec builder. If you want to customise AVRO decoding for a type, register a CodecBuildFunc via the Register call. Schema is the AVRO schema for the type to build. typ should match the type the function was registered under.
type DoubleCodec ¶
type DoubleCodec = floatCodec[float64]
type FileHeader ¶
type FileHeader struct {
	Magic [4]byte           `json:"magic"`
	Meta  map[string][]byte `json:"meta"`
	Sync  [16]byte          `json:"sync"`
}
FileHeader represents an AVRO file header
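For context, the AVRO specification defines the four magic bytes of an object container file as the ASCII characters 'O', 'b', 'j' followed by the byte 1. The sketch below checks only that prefix; checkMagic and isAvro are hypothetical helpers written for this example and are not part of the package.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// avroMagic is the object container file magic from the AVRO specification.
var avroMagic = [4]byte{'O', 'b', 'j', 1}

// checkMagic reads the first four bytes from r and compares them with the
// AVRO magic. It does not parse the metadata map or the sync marker.
func checkMagic(r io.Reader) error {
	var magic [4]byte
	if _, err := io.ReadFull(r, magic[:]); err != nil {
		return fmt.Errorf("reading magic: %w", err)
	}
	if magic != avroMagic {
		return errors.New("not an AVRO object container file")
	}
	return nil
}

// isAvro reports whether data starts with the AVRO magic bytes.
func isAvro(data []byte) bool {
	return checkMagic(bytes.NewReader(data)) == nil
}

func main() {
	fmt.Println(isAvro([]byte("Obj\x01...")))
	fmt.Println(isAvro([]byte("nope")))
}
```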
type Float32DoubleCodec ¶
type Float32DoubleCodec struct {
DoubleCodec
}
type FloatCodec ¶
type FloatCodec = floatCodec[float32]
type Int16Codec ¶
type Int32Codec ¶
type Int64Codec ¶
type IntCodec ¶ added in v0.0.13
Int64Codec is an avro codec for int64
type MapCodec ¶
type MapCodec struct {
// contains filtered or unexported fields
}
MapCodec is a decoder for map types. The key must always be string
type PointerCodec ¶ added in v0.0.6
type PointerCodec struct {
Codec
}
type Reader ¶
type Reader interface {
	io.Reader
	io.ByteReader
}
Reader combines io.ByteReader and io.Reader. It is the interface ReadFile needs in order to read an AVRO file.
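Several standard library types already satisfy this combination of interfaces. The sketch below declares a local reader interface with the same two embedded interfaces (a stand-in, so the example compiles without the package) and shows that *bufio.Reader and *bytes.Reader both fit. firstByteThenRest and splitFirst are hypothetical helpers for the demonstration.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// reader is a local stand-in with the same method set as the documented Reader.
type reader interface {
	io.Reader
	io.ByteReader
}

// Compile-time checks that these standard types satisfy the interface.
var (
	_ reader = (*bufio.Reader)(nil)
	_ reader = (*bytes.Reader)(nil)
)

// firstByteThenRest uses both halves of the interface: ReadByte for a single
// byte, then Read (via io.ReadAll) for everything that follows.
func firstByteThenRest(r reader) (byte, []byte, error) {
	b, err := r.ReadByte()
	if err != nil {
		return 0, nil, err
	}
	rest, err := io.ReadAll(r)
	return b, rest, err
}

// splitFirst wraps a plain io.Reader in bufio.Reader to gain ReadByte.
func splitFirst(s string) (byte, string, error) {
	b, rest, err := firstByteThenRest(bufio.NewReader(strings.NewReader(s)))
	return b, string(rest), err
}

func main() {
	first, rest, err := splitFirst("Obj")
	fmt.Printf("%c %q %v\n", first, rest, err)
}
```

An *os.File has no ReadByte method, so a file would typically be wrapped with bufio.NewReader before being passed to something that requires both interfaces.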
type ResourceBank ¶ added in v0.0.6
type ResourceBank struct {
// contains filtered or unexported fields
}
ResourceBank allocates the memory used to create the structs that AVRO data is decoded into. The primary reason for having it is to let the user signal that the memory can be re-used, which reduces the strain on the GC.
Memory is allocated using the required Go type, so the GC can still inspect pointers within it.
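One way such a bank could work is sketched below. This is an assumption-laden illustration, not the package's implementation: the bank and typedPool types, the chunk size, and the choice to zero memory on re-use are all invented for the example. It allocates typed slices through reflect so the GC knows the layout of each element, and Close simply rewinds the counters so later allocations re-use the same memory.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

const chunkLen = 64 // elements per chunk (arbitrary choice for the sketch)

// typedPool holds chunks of one element type.
type typedPool struct {
	chunks []reflect.Value // each is a slice of chunkLen elements
	used   int             // elements handed out since the last Close
}

// bank is a hypothetical stand-in for ResourceBank.
type bank struct {
	pools map[reflect.Type]*typedPool
}

func newBank() *bank {
	return &bank{pools: make(map[reflect.Type]*typedPool)}
}

// Alloc returns a pointer to a zeroed value of type t taken from a typed
// slice, so the GC can still trace any pointers the value contains.
func (b *bank) Alloc(t reflect.Type) unsafe.Pointer {
	p := b.pools[t]
	if p == nil {
		p = &typedPool{}
		b.pools[t] = p
	}
	ci, ei := p.used/chunkLen, p.used%chunkLen
	if ci == len(p.chunks) {
		p.chunks = append(p.chunks, reflect.MakeSlice(reflect.SliceOf(t), chunkLen, chunkLen))
	}
	elem := p.chunks[ci].Index(ei)
	elem.Set(reflect.Zero(t)) // clear whatever was there before Close
	p.used++
	return elem.Addr().UnsafePointer()
}

// Close marks every element as available again; memory is kept, not freed.
func (b *bank) Close() {
	for _, p := range b.pools {
		p.used = 0
	}
}

// reuseDemo shows that memory handed out before Close is handed out again.
func reuseDemo() (distinct, reused, zeroed bool) {
	bk := newBank()
	typ := reflect.TypeOf(int64(0))
	p1 := bk.Alloc(typ)
	*(*int64)(p1) = 7
	p2 := bk.Alloc(typ)
	distinct = p1 != p2
	bk.Close()
	p3 := bk.Alloc(typ)
	reused = p3 == p1
	zeroed = *(*int64)(p3) == 0
	return
}

func main() {
	fmt.Println(reuseDemo())
}
```

The demo also shows why pointers obtained before Close must not be used afterwards: the same address is handed out again and overwritten.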
func (*ResourceBank) Alloc ¶ added in v0.0.6
func (rb *ResourceBank) Alloc(rtyp reflect.Type) unsafe.Pointer
Alloc reserves some memory in the ResourceBank. Note that this memory may be re-used after Close is called.
func (*ResourceBank) Close ¶ added in v0.0.6
func (rb *ResourceBank) Close()
Close marks the resources in the ResourceBank as available for re-use
func (*ResourceBank) ToString ¶ added in v0.0.6
func (rb *ResourceBank) ToString(in []byte) string
ToString saves string data in the bank and returns a string. The string is valid until someone calls Close
type Schema ¶
type Schema struct {
	Type   string
	Object *SchemaObject
	Union  []Schema
}
Schema is a representation of AVRO schema JSON. Primitive types populate Type only. Union types populate the Type and Union fields. All other types populate Type and a subset of the Object fields.
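The three cases correspond to the three JSON shapes an AVRO schema can take: a string for a primitive, an array for a union, and an object for everything else. The sketch below shows one way to dispatch on those shapes with encoding/json. The lower-case schema and schemaObject types are cut-down local stand-ins, and setting Type to "union" for arrays is an assumption made for the example, not something the documentation above confirms.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// schemaObject is a cut-down stand-in for SchemaObject.
type schemaObject struct {
	Type string `json:"type"`
	Name string `json:"name,omitempty"`
}

// schema is a cut-down stand-in for Schema.
type schema struct {
	Type   string
	Object *schemaObject
	Union  []schema
}

// UnmarshalJSON dispatches on the first byte of the JSON value.
func (s *schema) UnmarshalJSON(data []byte) error {
	data = bytes.TrimSpace(data)
	if len(data) == 0 {
		return fmt.Errorf("empty schema")
	}
	switch data[0] {
	case '"': // primitive type name, e.g. "long"
		return json.Unmarshal(data, &s.Type)
	case '[': // union: a list of schemas
		s.Type = "union" // assumed marker value for this sketch
		return json.Unmarshal(data, &s.Union)
	case '{': // record, array, map, fixed, enum, ...
		s.Object = &schemaObject{}
		if err := json.Unmarshal(data, s.Object); err != nil {
			return err
		}
		s.Type = s.Object.Type
		return nil
	}
	return fmt.Errorf("unexpected schema JSON: %s", data)
}

// parse decodes a schema from its JSON text.
func parse(src string) (schema, error) {
	var s schema
	err := json.Unmarshal([]byte(src), &s)
	return s, err
}

func main() {
	for _, src := range []string{`"long"`, `["null","string"]`, `{"type":"record","name":"r"}`} {
		s, err := parse(src)
		fmt.Printf("%+v %v\n", s, err)
	}
}
```

The ["null","string"] form is the usual way a nullable column appears in an AVRO schema, which is why unions matter for decoding into pointer or null-aware types.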
func FileSchema ¶ added in v0.0.11
FileSchema reads the Schema from an AVRO file.
func SchemaFromString ¶ added in v0.0.14
type SchemaObject ¶
type SchemaObject struct {
	Type        string `json:"type"`
	LogicalType string `json:"logicalType,omitempty"`
	Name        string `json:"name,omitempty"`
	Namespace   string `json:"namespace,omitempty"`
	// Fields in a record
	Fields []SchemaRecordField `json:"fields,omitempty"`
	// The type of each item in an array
	Items Schema `json:"items,omitempty"`
	// The value types of a map (keys are strings)
	Values Schema `json:"values,omitempty"`
	// The size of a fixed type
	Size int `json:"size,omitempty"`
	// The values of an enum
	Symbols []string `json:"symbols,omitempty"`
}
SchemaObject contains all the fields of more complex schema types
type SchemaRecordField ¶
type SchemaRecordField struct {
	Name string `json:"name,omitempty"`
	Type Schema `json:"type,omitempty"`
}
SchemaRecordField represents one field of a Record schema
type StringCodec ¶
type StringCodec struct{}
StringCodec is a decoder for strings
func (StringCodec) Skip ¶
func (StringCodec) Skip(r *Buffer) error