codec

package
v0.0.0-...-9121635

This package is not in the latest version of its module.
Published: Nov 12, 2014 License: MIT, BSD-3-Clause Imports: 14 Imported by: 0

Documentation

Overview

High Performance, Feature-Rich Idiomatic Go encoding library for msgpack and binc.

Supported Serialization formats are:

  • msgpack
  • binc: https://github.com/ugorji/binc

To install:

go get github.com/ugorji/go/codec

The idiomatic Go support is as seen in other encoding packages in the standard library (ie json, xml, gob, etc).

Rich Feature Set includes:

  • Simple but extremely powerful and feature-rich API
  • Very High Performance. Our extensive benchmarks show us outperforming Gob, Json and Bson by 2-4X. This was achieved by taking extreme care on:
      • managing allocation
      • function frame size (important due to Go's use of split stacks)
      • reflection use (and by-passing reflection for common types)
      • recursion implications
      • zero-copy mode (encoding/decoding to byte slice without using temp buffers)
  • Correct. Care was taken to precisely handle corner cases like: overflows, nil maps and slices, nil value in stream, etc.
  • Efficient zero-copy operation: no temporary byte buffers are used when encoding into or decoding from a byte slice.
  • Standard field renaming via tags
  • Encoding from any value (struct, slice, map, primitives, pointers, interface{}, etc)
  • Decoding into pointer to any non-nil typed value (struct, slice, map, int, float32, bool, string, reflect.Value, etc)
  • Supports extension functions to handle the encode/decode of custom types
  • Supports Go 1.2 encoding.BinaryMarshaler/BinaryUnmarshaler
  • Schema-less decoding (decode into a pointer to a nil interface{} as opposed to a typed non-nil value). Includes options to configure what specific map or slice type to use when decoding an encoded list or map into a nil interface{}
  • Provides an RPC Server and Client Codec for the net/rpc communication protocol.
  • Msgpack Specific:
      • Provides extension functions to handle spec-defined extensions (binary, timestamp)
      • Options to resolve ambiguities in handling raw bytes (as string or []byte) during schema-less decoding (decoding into a nil interface{})
      • RPC Server/Client Codec for msgpack-rpc protocol defined at: https://github.com/msgpack-rpc/msgpack-rpc/blob/master/spec.md
  • Fast Paths for some container types: For some container types, we circumvent reflection and its associated overhead and allocation costs, and encode/decode directly. These types are:
      • Slice of all builtin types and interface{}
      • Map of all builtin types and interface{} to string, interface{}, int, int64, uint64
      • Symmetrical maps of all builtin types and interface{}

Extension Support

Users can register a function to handle the encoding or decoding of their custom types.

There are no restrictions on what the custom type can be. Some examples:

type BitSet   []int
type BitSet64 uint64
type UUID     string
type MyStructWithUnexportedFields struct { a int; b bool; c []int; }
type GifImage struct { ... }

As an illustration, MyStructWithUnexportedFields would normally be encoded as an empty map because it has no exported fields, while UUID would be encoded as a string. However, with extension support, you can encode any of these however you like.

RPC

RPC Client and Server Codecs are implemented, so the codecs can be used with the standard net/rpc package.

Usage

Typical usage model:

// create and configure Handle
var (
  bh codec.BincHandle
  mh codec.MsgpackHandle
)

mh.MapType = reflect.TypeOf(map[string]interface{}(nil))

// configure extensions
// e.g. for msgpack, define functions and enable Time support for tag 1
// mh.AddExt(reflect.TypeOf(time.Time{}), 1, myMsgpackTimeEncodeExtFn, myMsgpackTimeDecodeExtFn)

// create and use decoder/encoder
var (
  r io.Reader
  w io.Writer
  b []byte
  h = &bh // or &mh to use msgpack
)

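var v interface{} // or any typed value

dec := codec.NewDecoder(r, h)     // decode from an io.Reader
dec = codec.NewDecoderBytes(b, h) // OR decode directly from a byte slice
err := dec.Decode(&v)

enc := codec.NewEncoder(w, h)      // encode to an io.Writer
enc = codec.NewEncoderBytes(&b, h) // OR encode directly into a byte slice
err = enc.Encode(v)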

// RPC Server
go func() {
    for {
        conn, err := listener.Accept()
        if err != nil {
            return // listener closed or failed; stop accepting
        }
        rpcCodec := codec.GoRpc.ServerCodec(conn, h)
        // OR rpcCodec := codec.MsgpackSpecRpc.ServerCodec(conn, h)
        rpc.ServeCodec(rpcCodec)
    }
}()

// RPC Communication (client side)
conn, err := net.Dial("tcp", "localhost:5555")
rpcCodec := codec.GoRpc.ClientCodec(conn, h)
// OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
client := rpc.NewClientWithCodec(rpcCodec)

Representative Benchmark Results

Run the benchmark suite using:

go test -bi -bench=. -benchmem

To run full benchmark suite (including against vmsgpack and bson), see notes in ext_dep_test.go

MSGPACK

The msgpack-c implementation powers the C, C++, Python, Ruby, etc. libraries. We need to maintain compatibility with it and how it encodes integer values without caring about the type.

For compatibility with behaviour of msgpack-c reference implementation:

  • Go intX (>0) and uintX IS ENCODED AS msgpack +ve fixnum, unsigned
  • Go intX (<0) IS ENCODED AS msgpack -ve fixnum, signed

Constants

const (
	// AsSymbolDefault is default.
	// Currently, this means only encode struct field names as symbols.
	// The default is subject to change.
	AsSymbolDefault AsSymbolFlag = iota

	// AsSymbolAll means encode anything which could be a symbol as a symbol.
	AsSymbolAll = 0xfe

	// AsSymbolNone means do not encode anything as a symbol.
	AsSymbolNone = 1 << iota

	// AsSymbolMapStringKeysFlag means encode keys in map[string]XXX as symbols.
	AsSymbolMapStringKeysFlag

	// AsSymbolStructFieldNameFlag means encode struct field names as symbols.
	AsSymbolStructFieldNameFlag
)
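
Because iota keeps counting across every line of the block, including the line with the explicit 0xfe, the shifted flags do not start at 1. The standalone copy below works through the resulting values; the has helper is hypothetical, and how the package itself tests these bits is an assumption.

```go
package main

import "fmt"

type AsSymbolFlag uint8

const (
	AsSymbolDefault AsSymbolFlag = iota // iota == 0, value 0

	AsSymbolAll = 0xfe // iota == 1, explicit value

	AsSymbolNone = 1 << iota // iota == 2, value 4

	AsSymbolMapStringKeysFlag // iota == 3, value 8

	AsSymbolStructFieldNameFlag // iota == 4, value 16
)

// has reports whether any bit of f is set in flags (hypothetical helper).
func has(flags, f AsSymbolFlag) bool {
	return flags&f != 0
}

func main() {
	fmt.Println(AsSymbolNone, AsSymbolMapStringKeysFlag, AsSymbolStructFieldNameFlag)

	both := AsSymbolFlag(AsSymbolMapStringKeysFlag | AsSymbolStructFieldNameFlag)
	fmt.Println(has(both, AsSymbolMapStringKeysFlag), has(both, AsSymbolNone))
}
```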

Variables

var GoRpc goRpc

GoRpc implements Rpc using the communication protocol defined in net/rpc package. Its methods (ServerCodec and ClientCodec) return values that implement RpcCodecBuffered.

var MsgpackSpecRpc msgpackSpecRpc

MsgpackSpecRpc implements Rpc using the communication protocol defined in the msgpack spec at https://github.com/msgpack-rpc/msgpack-rpc/blob/master/spec.md. Its methods (ServerCodec and ClientCodec) return values that implement RpcCodecBuffered.

Functions

This section is empty.

Types

type AsSymbolFlag

type AsSymbolFlag uint8

AsSymbolFlag defines what should be encoded as symbols.

type BasicHandle

type BasicHandle struct {
	EncodeOptions
	DecodeOptions
	// contains filtered or unexported fields
}

WARNING: DO NOT USE DIRECTLY. EXPORTED FOR GODOC BENEFIT. WILL BE REMOVED.

BasicHandle encapsulates the common options and extension functions.

func (*BasicHandle) AddExt

func (o *BasicHandle) AddExt(
	rt reflect.Type,
	tag byte,
	encfn func(reflect.Value) ([]byte, error),
	decfn func(reflect.Value, []byte) error,
) (err error)

AddExt registers an encode and decode function for a reflect.Type. Note that the type must be a named type, and specifically not a pointer or Interface. An error is returned if that is not honored.

To Deregister an ext, call AddExt with 0 tag, nil encfn and nil decfn.

type BincHandle

type BincHandle struct {
	BasicHandle
}

BincHandle is a Handle for the Binc Schema-Free Encoding Format defined at https://github.com/ugorji/binc.

BincHandle currently supports all Binc features with the following EXCEPTIONS:

  • Only integers up to 64 bits of precision are supported. Big integers are unsupported.
  • Only IEEE 754 binary32 and binary64 floats are supported (i.e. Go float32 and float64 types). Extended precision and decimal IEEE 754 floats are unsupported.
  • Only UTF-8 strings are supported. Unicode_Other Binc types (UTF16, UTF32) are currently unsupported.

Note that these EXCEPTIONS are temporary and full support is possible and may happen soon.

func (*BincHandle) AddExt

func (o *BincHandle) AddExt(
	rt reflect.Type,
	tag byte,
	encfn func(reflect.Value) ([]byte, error),
	decfn func(reflect.Value, []byte) error,
) (err error)

AddExt registers an encode and decode function for a reflect.Type. Note that the type must be a named type, and specifically not a pointer or Interface. An error is returned if that is not honored.

To Deregister an ext, call AddExt with 0 tag, nil encfn and nil decfn.

type DecodeOptions

type DecodeOptions struct {
	// An instance of MapType is used during schema-less decoding of a map in the stream.
	// If nil, we use map[interface{}]interface{}
	MapType reflect.Type
	// An instance of SliceType is used during schema-less decoding of an array in the stream.
	// If nil, we use []interface{}
	SliceType reflect.Type
	// ErrorIfNoField controls whether an error is returned when decoding a map
	// from a codec stream into a struct, and no matching struct field is found.
	ErrorIfNoField bool
}

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

A Decoder reads and decodes an object from an input stream in the codec format.

func NewDecoder

func NewDecoder(r io.Reader, h Handle) *Decoder

NewDecoder returns a Decoder for decoding a stream of bytes from an io.Reader.

For efficiency, users are encouraged to pass in a memory-buffered reader (eg bufio.Reader, bytes.Buffer).

func NewDecoderBytes

func NewDecoderBytes(in []byte, h Handle) *Decoder

NewDecoderBytes returns a Decoder which efficiently decodes directly from a byte slice with zero copying.

func (*Decoder) Decode

func (d *Decoder) Decode(v interface{}) (err error)

Decode decodes the stream from reader and stores the result in the value pointed to by v. v cannot be a nil pointer. v can also be a reflect.Value of a pointer.

Note that a pointer to a nil interface is not a nil pointer. If you do not know what type of stream it is, pass in a pointer to a nil interface. We will decode and store a value in that nil interface.

Sample usages:

// Decoding into a non-nil typed value
var f float32
err = codec.NewDecoder(r, handle).Decode(&f)

// Decoding into nil interface
var v interface{}
dec := codec.NewDecoder(r, handle)
err = dec.Decode(&v)
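
The distinction between a nil pointer and a pointer to a nil interface can be checked with the reflect package. The sketch below is independent of this package; isNilPointer is a hypothetical helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNilPointer reports whether v holds a pointer that is itself nil.
func isNilPointer(v interface{}) bool {
	rv := reflect.ValueOf(v)
	return rv.Kind() == reflect.Ptr && rv.IsNil()
}

func main() {
	var v interface{} // a nil interface
	var p *float32    // a nil pointer

	// &v is a valid pointer whose target happens to be a nil interface,
	// so it is an acceptable destination.
	fmt.Println(isNilPointer(&v))

	// p points nowhere, so there is nothing to store a result into.
	fmt.Println(isNilPointer(p))
}
```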

When decoding into a nil interface{}, we will decode into an appropriate value based on the contents of the stream:

  • Numbers are decoded as float64, int64 or uint64.
  • Other values are decoded appropriately depending on the type: bool, string, []byte, time.Time, etc
  • Extensions are decoded as RawExt (if no ext function registered for the tag)

Configurations exist on the Handle to override defaults (e.g. for MapType, SliceType and how to decode raw bytes).

When decoding into a non-nil interface{} value, the mode of decoding is based on the type of the value. When a value is seen:

  • If an extension is registered for it, call that extension function
  • If it implements BinaryUnmarshaler, call its UnmarshalBinary(data []byte) error
  • Else decode it based on its reflect.Kind

There are some special rules when decoding into containers (slice/array/map/struct). Decode will typically use the stream contents to UPDATE the container.

  • A map can be decoded from a stream map, by updating matching keys.
  • A slice can be decoded from a stream array, by updating the first n elements, where n is the length of the stream array.
  • A slice can be decoded from a stream map, by decoding as if it contains a sequence of key-value pairs.
  • A struct can be decoded from a stream map, by updating matching fields.
  • A struct can be decoded from a stream array, by updating fields as they occur in the struct (by index).

When decoding a stream map or array with length of 0 into a nil map or slice, we reset the destination map or slice to a zero-length value.

However, when decoding a stream nil, we reset the destination container to its "zero" value (e.g. nil for slice/map, etc).
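
The update semantics for maps can be modelled in a few lines. In this sketch the stream is represented as an ordinary Go map purely for illustration, decodeMap is a hypothetical function, and adding stream keys that are absent from the destination is an assumption.

```go
package main

import "fmt"

// decodeMap models decoding a stream map (or a stream nil) into dst.
func decodeMap(dst, stream map[string]int) map[string]int {
	if stream == nil {
		return nil // stream nil: reset the destination to its zero value
	}
	if dst == nil {
		// Even a zero-length stream map yields a non-nil, empty map.
		dst = make(map[string]int, len(stream))
	}
	for k, v := range stream {
		dst[k] = v // update keys from the stream, leave the others untouched
	}
	return dst
}

func main() {
	dst := map[string]int{"a": 1, "b": 2}
	fmt.Println(decodeMap(dst, map[string]int{"b": 3}))

	empty := decodeMap(nil, map[string]int{})
	fmt.Println(empty != nil, len(empty))

	fmt.Println(decodeMap(dst, nil) == nil)
}
```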

type EncodeOptions

type EncodeOptions struct {
	// Encode a struct as an array, and not as a map.
	StructToArray bool

	// AsSymbols defines what should be encoded as symbols.
	//
	// Encoding as symbols can reduce the encoded size significantly.
	//
	// However, during encoding, each string to be encoded as a symbol must
	// be checked to see if it has been seen before. Consequently, encoding time
	// will increase if using symbols, because string comparisons have a clear cost.
	//
	// Sample values:
	//   AsSymbolNone
	//   AsSymbolAll
	//   AsSymbolMapStringKeysFlag
	//   AsSymbolMapStringKeysFlag | AsSymbolStructFieldNameFlag
	AsSymbols AsSymbolFlag
}

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

An Encoder writes an object to an output stream in the codec format.

func NewEncoder

func NewEncoder(w io.Writer, h Handle) *Encoder

NewEncoder returns an Encoder for encoding into an io.Writer.

For efficiency, Users are encouraged to pass in a memory buffered writer (eg bufio.Writer, bytes.Buffer).

func NewEncoderBytes

func NewEncoderBytes(out *[]byte, h Handle) *Encoder

NewEncoderBytes returns an encoder for encoding directly and efficiently into a byte slice, without copying through temporary slices.

It will potentially replace the output byte slice pointed to. After encoding, the out parameter contains the encoded contents.
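
The reason for taking a *[]byte rather than a []byte can be shown without the package: growing a slice may allocate a new backing array, and the caller only sees the result if the new slice header is stored back through the pointer. The function below is a hypothetical stand-in for an encoder, not the package's implementation.

```go
package main

import "fmt"

// appendEncoded stands in for an encoder writing into *out.
// append may allocate a new backing array, so the updated slice
// header is written back through the pointer.
func appendEncoded(out *[]byte, encoded ...byte) {
	*out = append(*out, encoded...)
}

func main() {
	var b []byte // nil slice with no capacity
	appendEncoded(&b, 0x01, 0x02, 0x03)
	fmt.Println(len(b), b)
}
```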

func (*Encoder) Encode

func (e *Encoder) Encode(v interface{}) (err error)

Encode writes an object into a stream in the codec format.

Encoding can be configured via the "codec" struct tag for the fields.

The value of the "codec" key in a struct field's tag is the key name, followed by an optional comma and options.

To set an option on all fields (e.g. omitempty on all fields), you can create a field called _struct, and set flags on it.

Struct values "usually" encode as maps. Each exported struct field is encoded unless:

  • the field's codec tag is "-", OR
  • the field is empty and its codec tag specifies the "omitempty" option.

When encoding as a map, the first string in the tag (before the comma) is the map key string to use when encoding.

However, struct values may encode as arrays. This happens when:

  • StructToArray Encode option is set, OR
  • the codec tag on the _struct field sets the "toarray" option

Values with types that implement MapBySlice are encoded as stream maps.

The empty values (for omitempty option) are false, 0, any nil pointer or interface value, and any array, slice, map, or string of length zero.

Anonymous fields are encoded inline if no struct tag is present. Else they are encoded as regular fields.

Examples:

type MyStruct struct {
    _struct bool    `codec:",omitempty"`   //set omitempty for every field
    Field1 string   `codec:"-"`            //skip this field
    Field2 int      `codec:"myName"`       //Use key "myName" in encode stream
    Field3 int32    `codec:",omitempty"`   //use key "Field3". Omit if empty.
    Field4 bool     `codec:"f4,omitempty"` //use key "f4". Omit if empty.
    ...
}

type MyStruct struct {
    _struct bool    `codec:",omitempty,toarray"`   //set omitempty for every field
                                                   //and encode struct as an array
}

The mode of encoding is based on the type of the value. When a value is seen:

  • If an extension is registered for it, call that extension function
  • If it implements BinaryMarshaler, call its MarshalBinary() (data []byte, err error)
  • Else encode it based on its reflect.Kind

Note that struct field names and keys in map[string]XXX will be treated as symbols. Some formats support symbols (e.g. binc) and will properly encode the string only once in the stream, and use a tag to refer to it thereafter.

type Handle

type Handle interface {
	// contains filtered or unexported methods
}

Handle is the interface for a specific encoding format.

Typically, a Handle is pre-configured before first time use, and not modified while in use. Such a pre-configured Handle is safe for concurrent access.

type MapBySlice

type MapBySlice interface {
	MapBySlice()
}

MapBySlice represents a slice which should be encoded as a map in the stream. The slice contains a sequence of key-value pairs.
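
A type opts in by declaring the marker method. The sketch below redeclares the interface locally so that it runs on its own; orderedPairs and pairCount are hypothetical, and rejecting an odd number of elements is an assumption about how such a slice should be validated.

```go
package main

import "fmt"

// MapBySlice mirrors the interface documented above.
type MapBySlice interface {
	MapBySlice()
}

// orderedPairs holds key1, value1, key2, value2, ... in insertion order.
type orderedPairs []interface{}

// MapBySlice marks orderedPairs as a slice to be encoded as a map.
func (orderedPairs) MapBySlice() {}

// pairCount returns the number of key-value pairs in p.
func pairCount(p orderedPairs) (int, error) {
	if len(p)%2 != 0 {
		return 0, fmt.Errorf("odd number of elements: %d", len(p))
	}
	return len(p) / 2, nil
}

func main() {
	p := orderedPairs{"one", 1, "two", 2}

	var v interface{} = p
	_, ok := v.(MapBySlice)

	n, err := pairCount(p)
	fmt.Println(ok, n, err)
}
```

Unlike a Go map, such a slice preserves the order of its keys.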

type MsgpackHandle

type MsgpackHandle struct {
	BasicHandle

	// RawToString controls how raw bytes are decoded into a nil interface{}.
	RawToString bool
	// WriteExt flag supports encoding configured extensions with extension tags.
	// It also controls whether other elements of the new spec are encoded (ie Str8).
	//
	// With WriteExt=false, configured extensions are serialized as raw bytes
	// and Str8 is not encoded.
	//
	// A stream can still be decoded into a typed value, provided an appropriate value
	// is provided, but the type cannot be inferred from the stream. If no appropriate
	// type is provided (e.g. decoding into a nil interface{}), you get back
	// a []byte or string based on the setting of RawToString.
	WriteExt bool
}

MsgpackHandle is a Handle for the Msgpack Schema-Free Encoding Format.

func (*MsgpackHandle) AddExt

func (o *MsgpackHandle) AddExt(
	rt reflect.Type,
	tag byte,
	encfn func(reflect.Value) ([]byte, error),
	decfn func(reflect.Value, []byte) error,
) (err error)

AddExt registers an encode and decode function for a reflect.Type. Note that the type must be a named type, and specifically not a pointer or Interface. An error is returned if that is not honored.

To Deregister an ext, call AddExt with 0 tag, nil encfn and nil decfn.

type MsgpackSpecRpcMultiArgs

type MsgpackSpecRpcMultiArgs []interface{}

MsgpackSpecRpcMultiArgs is a special type which signifies to the MsgpackSpecRpcCodec that the backend RPC service takes multiple arguments, which have been arranged in sequence in the slice.

The Codec then passes it AS-IS to the rpc service (without wrapping it in an array of 1 element).

type RawExt

type RawExt struct {
	Tag  byte
	Data []byte
}

RawExt represents raw unprocessed extension data.

type Rpc

type Rpc interface {
	ServerCodec(conn io.ReadWriteCloser, h Handle) rpc.ServerCodec
	ClientCodec(conn io.ReadWriteCloser, h Handle) rpc.ClientCodec
}

Rpc provides a rpc Server or Client Codec for rpc communication.

type RpcCodecBuffered

type RpcCodecBuffered interface {
	BufferedReader() *bufio.Reader
	BufferedWriter() *bufio.Writer
}

RpcCodecBuffered allows access to the underlying bufio.Reader/Writer used by the rpc connection. It accommodates use-cases where the connection should be used by rpc and non-rpc functions, e.g. streaming a file after sending an rpc response.

type SimpleHandle

type SimpleHandle struct {
	BasicHandle
}

SimpleHandle is a Handle for a very simple encoding format.

simple is a simplistic codec similar to binc, but not as compact.

  • Encoding of a value is always preceded by the descriptor byte (bd)
  • True, false, nil are encoded fully in 1 byte (the descriptor)
  • Integers (intXXX, uintXXX) are encoded in 1, 2, 4 or 8 bytes (plus a descriptor byte). There are positive (uintXXX and intXXX >= 0) and negative (intXXX < 0) integers.
  • Floats are encoded in 4 or 8 bytes (plus a descriptor byte)
  • Length of containers (strings, bytes, array, map, extensions) are encoded in 0, 1, 2, 4 or 8 bytes. Zero-length containers have no length encoded. For others, the number of bytes is given by pow(2, bd%3)
  • maps are encoded as [bd] [length] [[key][value]]...
  • arrays are encoded as [bd] [length] [value]...
  • extensions are encoded as [bd] [length] [tag] [byte]...
  • strings/bytearrays are encoded as [bd] [length] [byte]...

The full spec will be published soon.
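
Until the spec is available, the integer sizing rule can at least be illustrated. The sketch below assumes that the smallest of the four widths is chosen for a given value, which the description above does not state explicitly; it does not reproduce any descriptor byte values, and the function names are hypothetical.

```go
package main

import "fmt"

// uintWidth returns the smallest of 1, 2, 4 or 8 bytes that can hold u.
// Choosing the smallest width is an assumption.
func uintWidth(u uint64) int {
	switch {
	case u <= 0xff:
		return 1
	case u <= 0xffff:
		return 2
	case u <= 0xffffffff:
		return 4
	default:
		return 8
	}
}

// encodedSize adds the single descriptor byte to the payload width.
func encodedSize(u uint64) int {
	return 1 + uintWidth(u)
}

func main() {
	for _, u := range []uint64{7, 300, 70000, 1 << 40} {
		fmt.Println(u, encodedSize(u))
	}
}
```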

func (*SimpleHandle) AddExt

func (o *SimpleHandle) AddExt(
	rt reflect.Type,
	tag byte,
	encfn func(reflect.Value) ([]byte, error),
	decfn func(reflect.Value, []byte) error,
) (err error)

AddExt registers an encode and decode function for a reflect.Type. Note that the type must be a named type, and specifically not a pointer or Interface. An error is returned if that is not honored.

To Deregister an ext, call AddExt with 0 tag, nil encfn and nil decfn.
