Documentation ¶
Overview ¶
Package goavro is a library that encodes and decodes of Avro data. It provides an interface to encode data directly to io.Writer streams, and to decode data from io.Reader streams. Goavro fully adheres to version 1.7.7 of the Avro specification and data encoding.
Index ¶
- Constants
- Variables
- func IsCompressionCodecSupported(someCodec string) bool
- type ByteWriter
- type Codec
- type CodecSetter
- type Datum
- type Decoder
- type Encoder
- type Enum
- type ErrCodecBuild
- type ErrDecoder
- type ErrEncoder
- type ErrInvalidName
- type ErrNoSuchField
- type ErrNotRecord
- type ErrReader
- type ErrReaderBlockCount
- type ErrReaderInit
- type ErrSchemaParse
- type ErrWriterInit
- type Fixed
- type Reader
- type ReaderSetter
- type Record
- func (r Record) Get(fieldName string) (interface{}, error)
- func (r Record) GetFieldSchema(fieldName string) (interface{}, error)
- func (r Record) GetQualified(qualifiedName string) (interface{}, error)
- func (r Record) Set(fieldName string, value interface{}) error
- func (r Record) SetQualified(qualifiedName string, value interface{}) error
- func (r Record) String() string
- type RecordCache
- type RecordSetter
- type StringWriter
- type Writer
- type WriterSetter
- func BlockSize(blockSize int64) WriterSetter
- func BlockTick(blockTick time.Duration) WriterSetter
- func BufferToWriter(w io.Writer) WriterSetter
- func Compression(someCompressionCodec string) WriterSetter
- func Sync(someSync []byte) WriterSetter
- func ToWriter(w io.Writer) WriterSetter
- func UseCodec(codec Codec) WriterSetter
- func WriterSchema(someSchema string) WriterSetter
Constants ¶
const ( CompressionNull = "null" CompressionDeflate = "deflate" CompressionSnappy = "snappy" )
Compression codecs that Reader and Writer instances can process.
const DefaultWriterBlockSize = 10
DefaultWriterBlockSize specifies the default number of datum items in a block when writing.
Variables ¶
var MaxDecodeSize = int64(math.MaxInt32)
MaxDecodeSize defines the maximum length a String or Bytes field. This is here because the way we decode Strings and Bytes fields is entirely stateless. This means we can't follow the example set by other encoders who look at how much data is left to be decoded and return an error if that amount is exceeded.
If you need to decode Avro data which contains strings or bytes fields longer than ~2.2GB, modify this value at your discretion.
On a 32bit platform this value should not exceed math.MaxInt32, as Go's make function is limited to only creating MaxInt number of objects at a time. On a 64bit platform the limitation is primarily your avaialble memory.
Example:
func init() { goavro.MaxDecodeSize = (1 << 40) // 1 TB of runes or bytes }
Functions ¶
func IsCompressionCodecSupported ¶ added in v1.0.0
IsCompressionCodecSupported returns true if and only if the specified codec string is supported by this library.
Types ¶
type ByteWriter ¶ added in v1.0.0
ByteWriter is the interface implemented by any object that bytes can be written to.
type Codec ¶
type Codec interface { Decoder Encoder Schema() string NewWriter(...WriterSetter) (*Writer, error) }
The Codec interface supports both Decode and Encode operations.
func NewCodec ¶
func NewCodec(someJSONSchema string, setters ...CodecSetter) (Codec, error)
NewCodec creates a new object that supports both the Decode and Encode methods. It requires an Avro schema, expressed as a JSON string.
codec, err := goavro.NewCodec(someJSONSchema) if err != nil { return nil, err } // Decoding data uses codec created above, and an io.Reader, // definition not shown: datum, err := codec.Decode(r) if err != nil { return nil, err } // Encoding data uses codec created above, an io.Writer, // definition not shown, and some data: err := codec.Encode(w, datum) if err != nil { return nil, err } // Encoding data using bufio.Writer to buffer the writes // during data encoding: func encodeWithBufferedWriter(c Codec, w io.Writer, datum interface{}) error { bw := bufio.NewWriter(w) err := c.Encode(bw, datum) if err != nil { return err } return bw.Flush() } err := encodeWithBufferedWriter(codec, w, datum) if err != nil { return nil, err }
type CodecSetter ¶ added in v1.0.0
CodecSetter functions are those those which are used to modify a newly instantiated Codec.
type Datum ¶ added in v1.0.0
type Datum struct { Value interface{} Err error }
Datum binds together a piece of data and any error resulting from either reading or writing that datum.
type Enum ¶ added in v1.0.0
type Enum struct {
Name, Value string
}
Enum is an abstract data type used to hold data corresponding to an Avro enum. Whenever an Avro schema specifies an enum, this library's Decode method will return an Enum initialized to the enum's name and value read from the io.Reader. Likewise, when using Encode to convert data to an Avro record, it is necessary to create and send an Enum instance to the Encode method.
type ErrCodecBuild ¶ added in v1.0.0
ErrCodecBuild is returned when the encoder encounters an error.
func (ErrCodecBuild) Error ¶ added in v1.0.0
func (e ErrCodecBuild) Error() string
type ErrDecoder ¶ added in v1.0.0
ErrDecoder is returned when the encoder encounters an error.
func (ErrDecoder) Error ¶ added in v1.0.0
func (e ErrDecoder) Error() string
type ErrEncoder ¶ added in v1.0.0
ErrEncoder is returned when the encoder encounters an error.
func (ErrEncoder) Error ¶ added in v1.0.0
func (e ErrEncoder) Error() string
type ErrInvalidName ¶
type ErrInvalidName struct {
Message string
}
ErrInvalidName is returned when a Codec cannot be created due to invalid name format.
func (ErrInvalidName) Error ¶
func (e ErrInvalidName) Error() string
type ErrNoSuchField ¶ added in v1.0.0
type ErrNoSuchField struct {
// contains filtered or unexported fields
}
ErrNoSuchField is returned when attempt to Get a field that does not exist in a Record.
func (ErrNoSuchField) Error ¶ added in v1.0.0
func (e ErrNoSuchField) Error() string
Error returns the string representation of an ErrNoSuchField error.
type ErrNotRecord ¶ added in v1.0.0
type ErrNotRecord struct {
// contains filtered or unexported fields
}
ErrNotRecord is returned when codepath expecs goavro.Record, but found something else.
func (ErrNotRecord) Error ¶ added in v1.0.0
func (e ErrNotRecord) Error() string
Error returns a string representation of the ErrNotRecord error.
type ErrReaderBlockCount ¶ added in v1.0.0
type ErrReaderBlockCount struct {
Err error
}
ErrReaderBlockCount is returned when a reader detects an error while attempting to read the block count and block size.
func (*ErrReaderBlockCount) Error ¶ added in v1.0.0
func (e *ErrReaderBlockCount) Error() string
type ErrReaderInit ¶ added in v1.0.0
ErrReaderInit is returned when the encoder encounters an error.
func (ErrReaderInit) Error ¶ added in v1.0.0
func (e ErrReaderInit) Error() string
type ErrSchemaParse ¶ added in v1.0.0
ErrSchemaParse is returned when a Codec cannot be created due to an error while reading or parsing the schema.
func (ErrSchemaParse) Error ¶ added in v1.0.0
func (e ErrSchemaParse) Error() string
type ErrWriterInit ¶ added in v1.0.0
ErrWriterInit is returned when an error is created during Writer initialization.
func (*ErrWriterInit) Error ¶ added in v1.0.0
func (e *ErrWriterInit) Error() string
Error converts the error instance to a string.
type Fixed ¶ added in v1.0.0
Fixed is an abstract data type used to hold data corresponding to an Avro 'Fixed' type. Whenever an Avro schema specifies a "Fixed" type, this library's Decode method will return a Fixed value initialized to the Fixed name, and value read from the io.Reader. Likewise, when using Encode to convert data to an Avro record, it is necessary to create and send a Fixed instance to the Encode method.
type Reader ¶ added in v1.0.0
type Reader struct { CompressionCodec string DataSchema string Sync []byte // contains filtered or unexported fields }
Reader structure contains data necessary to read Avro files.
func NewReader ¶ added in v1.0.0
func NewReader(setters ...ReaderSetter) (*Reader, error)
NewReader returns a object to read data from an io.Reader using the Avro Object Container Files format.
func main() { conn, err := net.Dial("tcp", "127.0.0.1:8080") if err != nil { log.Fatal(err) } fr, err := goavro.NewReader(goavro.FromReader(conn)) if err != nil { log.Fatal("cannot create Reader: ", err) } defer func() { if err := fr.Close(); err != nil { log.Fatal(err) } }() for fr.Scan() { datum, err := fr.Read() if err != nil { log.Println("cannot read datum: ", err) continue } fmt.Println("RECORD: ", datum) } }
type ReaderSetter ¶ added in v1.0.0
ReaderSetter functions are those those which are used to instantiate a new Reader.
func BufferFromReader ¶ added in v1.0.0
func BufferFromReader(r io.Reader) ReaderSetter
BufferFromReader wraps the specified `io.Reader` using a `bufio.Reader` to read from a file.
func FromReader ¶ added in v1.0.0
func FromReader(r io.Reader) ReaderSetter
FromReader specifies the `io.Reader` to use when reading a file.
type Record ¶ added in v1.0.0
type Record struct { Name string Fields []*recordField // contains filtered or unexported fields }
Record is an abstract data type used to hold data corresponding to an Avro record. Wherever an Avro schema specifies a record, this library's Decode method will return a Record initialized to the record's values read from the io.Reader. Likewise, when using Encode to convert data to an Avro record, it is necessary to create and send a Record instance to the Encode method.
func NewRecord ¶ added in v1.0.0
func NewRecord(setters ...RecordSetter) (*Record, error)
NewRecord will create a Record instance corresponding to the specified schema.
func recordExample(codec goavro.Codec, w io.Writer, recordSchema string) error { // To encode a Record, you need to instantiate a Record instance // that adheres to the schema the Encoder expect. someRecord, err := goavro.NewRecord(goavro.RecordSchema(recordSchema)) if err != nil { return err } // Once you have a Record, you can set the values of the various fields. someRecord.Set("username", "Aquaman") someRecord.Set("comment", "The Atlantic is oddly cold this morning!") // Feel free to fully qualify the field name if you'd like someRecord.Set("com.example.timestamp", int64(1082196484)) // Once the fields of the Record have the correct data, you can encode it err = codec.Encode(w, someRecord) return err }
func (Record) GetFieldSchema ¶ added in v1.0.0
GetFieldSchema returns the schema of the specified Record field.
func (Record) GetQualified ¶ added in v1.0.0
GetQualified returns the datum of the specified Record field, without attempting to qualify the name
func (Record) SetQualified ¶ added in v1.0.0
SetQualified updates the datum of the specified Record field, without attempting to qualify the name
type RecordCache ¶ added in v1.0.0
type RecordCache struct {
// contains filtered or unexported fields
}
RecordCache provides simplified way of getting a value from a nested field, while memoizing intermediate results in a cache for future lookups.
func NewRecordCache ¶ added in v1.0.0
func NewRecordCache(record *Record, delim byte) (*RecordCache, error)
NewRecordCache returns a new RecordCache structure used to get values from nested fields.
func example(codec Codec, someReader io.Reader) (string, error) { decoded, err := codec.Decode(someReader) record, ok := decoded.(*Record) if !ok { return "", ErrNotRecord{decoded} } rc, err := NewRecordCache(record, '/') if err != nil { return "", err } account, err := rc.Get("com.example.user/com.example.account") if err != nil { return "", err } s, ok := account.(string) if !ok { return "", fmt.Errorf("expected: string; actual: %T", account) } return s, nil }
func (*RecordCache) Get ¶ added in v1.0.0
func (rc *RecordCache) Get(name string) (interface{}, error)
Get splits the specified name by the stored delimiter, and attempts to retrieve the nested value corresponding to the nested fields.
type RecordSetter ¶ added in v1.0.0
RecordSetter functions are those those which are used to instantiate a new Record.
func RecordEnclosingNamespace ¶ added in v1.0.0
func RecordEnclosingNamespace(someNamespace string) RecordSetter
RecordEnclosingNamespace specifies the enclosing namespace of the record to create. For instance, if the enclosing namespace is `com.example`, and the record name is `Foo`, then the full record name will be `com.example.Foo`.
func RecordPedantic ¶ added in v1.0.0
func RecordPedantic() RecordSetter
RecordPedantic specifies pedantic handling, and will cause NewRecord to signal an error if various harmless schema violations occur.
func RecordSchema ¶ added in v1.0.0
func RecordSchema(recordSchemaJSON string) RecordSetter
RecordSchema specifies the schema of the record to create. Schema must be a JSON string.
type StringWriter ¶ added in v1.0.0
StringWriter is the interface implemented by any object that strings can be written to.
type Writer ¶ added in v1.0.0
type Writer struct { CompressionCodec string Sync []byte // contains filtered or unexported fields }
Writer structure contains data necessary to write Avro files.
func NewWriter ¶ added in v1.0.0
func NewWriter(setters ...WriterSetter) (*Writer, error)
NewWriter returns a object to write data to an io.Writer using the Avro Object Container Files format.
func serveClient(conn net.Conn, codec goavro.Codec) { fw, err := goavro.NewWriter( goavro.BlockSize(100), // flush data every 100 items goavro.BlockTick(10 * time.Second), // but at least every 10 seconds goavro.Compression(goavro.CompressionSnappy), goavro.ToWriter(conn), goavro.UseCodec(codec)) if err != nil { log.Fatal("cannot create Writer: ", err) } defer fw.Close() // create a record that matches the schema we want to encode someRecord, err := goavro.NewRecord(goavro.RecordSchema(recordSchema)) if err != nil { log.Fatal(err) } // identify field name to set datum for someRecord.Set("username", "Aquaman") someRecord.Set("comment", "The Atlantic is oddly cold this morning!") // you can fully qualify the field name someRecord.Set("com.example.timestamp", int64(1082196484)) fw.Write(someRecord) // create another record someRecord, err = goavro.NewRecord(goavro.RecordSchema(recordSchema)) if err != nil { log.Fatal(err) } someRecord.Set("username", "Batman") someRecord.Set("comment", "Who are all of these crazies?") someRecord.Set("com.example.timestamp", int64(1427383430)) fw.Write(someRecord) }
type WriterSetter ¶ added in v1.0.0
WriterSetter functions are those those which are used to instantiate a new Writer.
func BlockSize ¶ added in v1.0.0
func BlockSize(blockSize int64) WriterSetter
BlockSize specifies the default number of data items to be grouped in a block, compressed, and written to the stream.
It is a valid use case to set both BlockTick and BlockSize. For example, if BlockTick is set to time.Minute and BlockSize is set to 20, but only 13 items are written to the Writer in a minute, those 13 items will be grouped in a block, compressed, and written to the stream without waiting for the addition 7 items to complete the BlockSize.
By default, BlockSize is set to DefaultWriterBlockSize.
func BlockTick ¶ added in v1.0.0
func BlockTick(blockTick time.Duration) WriterSetter
BlockTick specifies the duration of time between when the Writer will flush the blocks to the stream.
It is a valid use case to set both BlockTick and BlockSize. For example, if BlockTick is set to time.Minute and BlockSize is set to 20, but only 13 items are written to the Writer in a minute, those 13 items will be grouped in a block, compressed, and written to the stream without waiting for the addition 7 items to complete the BlockSize.
By default, BlockTick is set to 0 and is ignored. This causes the blocker to fill up its internal queue of data to BlockSize items before flushing them to the stream.
func BufferToWriter ¶ added in v1.0.0
func BufferToWriter(w io.Writer) WriterSetter
BufferToWriter specifies which io.Writer is the target of the Writer stream, and creates a bufio.Writer around that io.Writer. It is invalid to specify both BufferToWriter and ToWriter. Exactly one of these must be called for a given Writer initialization.
func Compression ¶ added in v1.0.0
func Compression(someCompressionCodec string) WriterSetter
Compression is used to set the compression codec of a new Writer instance.
func Sync ¶ added in v1.0.0
func Sync(someSync []byte) WriterSetter
Sync is used to set the sync marker bytes of a new instance. It checks to ensure the byte slice is 16 bytes long, but does not check that it has been set to something other than the zero value. Usually you can elide the `Sync` call and allow it to create a random byte sequence.
func ToWriter ¶ added in v1.0.0
func ToWriter(w io.Writer) WriterSetter
ToWriter specifies which io.Writer is the target of the Writer stream. It is invalid to specify both BufferToWriter and ToWriter. Exactly one of these must be called for a given Writer initialization.
func UseCodec ¶ added in v1.0.0
func UseCodec(codec Codec) WriterSetter
UseCodec specifies that a Writer should reuse an existing Codec rather than creating a new one, needlessly recompling the same schema.
func WriterSchema ¶ added in v1.0.0
func WriterSchema(someSchema string) WriterSetter
WriterSchema is used to set the Avro schema of a new instance. If a codec has already been compiled for the schema, it is faster to use the UseCodec method instead of WriterSchema.