protein

package module
v1.0.1
This package is not in the latest version of its module.

Published: Apr 18, 2017 License: Apache-2.0 Imports: 21 Imported by: 0

README

Protein


Protein is an encoding/decoding library for Protobuf that comes with schema-versioning and runtime-decoding capabilities.

It has multiple use-cases, including but not limited to:

  • setting up schema registries
  • decoding Protobuf payloads without the need to know their schema at compile-time (all the while keeping strong-typing around!)
  • identifying & preventing applicative bugs and data corruption issues
  • creating custom-made container formats for on-disk storage
  • ...and more!

Protein is divided into 3 sub-components that all play a critical role in its operation:

  • The protoscan API walks through the symbols of the running executable to find every instantiated protobuf schema, collect them, build the dependency trees that link them together, and finally compute the deterministic versioning hashes that ultimately define them.
  • The protostruct API is capable of creating structure definitions at runtime using the dependency trees of protobuf schemas that were previously computed by the protoscan API.
  • The Transcoder class implements the high-level, user-facing encoding/decoding API that ties all of this together and offers the tools to cover the various use-cases cited above.

A blog post detailing the inner workings of these components is in the works and will be published soon.

Have a look at the Quickstart section to get started.


IMPORTANT NOTE REGARDING VENDORING (I.E. IT WON'T COMPILE)

Protein makes use of Go's linkname feature in order to sniff protobuf schemas directly from the memory of the running process.

The go:linkname directive instructs the compiler to declare a local symbol as an alias for an external one, whether it is public or private. This allows Protein to bind to some of the private methods of the official protobuf package for various reasons (see here and there for more information).

Unfortunately, vendoring modifies symbol names due to the way mangling works; e.g. a symbol called github.com/gogo/protobuf/protoc-gen-gogo/generator.(*Generator).goTag actually becomes github.com/myname/myproject/vendor/github.com/gogo/protobuf/protoc-gen-gogo/generator.(*Generator).goTag once the gogo/protobuf package gets vendored inside the myname/myproject package.
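To make the mangling concrete, here is a tiny, purely illustrative sketch of the renaming rule described above (`vendoredSymbol` is a hypothetical helper, not part of Protein):

```go
package main

import "fmt"

// illustrative sketch of how vendoring mangles symbol names: the vendoring
// package's path plus a "/vendor/" segment get prepended to the symbol
func vendoredSymbol(vendoringPkg, symbol string) string {
	return vendoringPkg + "/vendor/" + symbol
}

func main() {
	sym := "github.com/gogo/protobuf/protoc-gen-gogo/generator.(*Generator).goTag"
	fmt.Println(vendoredSymbol("github.com/myname/myproject", sym))
}
```

Since go:linkname binds to symbols by their exact name, any such rewrite breaks the binding until the directives are regenerated, which is what the commands below automate.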

These modifications to the symbol names result in the dreaded relocation target <symbol-name> not defined error at compile time.

The good news is that Protein provides all that's necessary to fix those errors automatically; just run the following commands:

$ go get -u github.com/znly/linkname-gen
$ go generate ./vendor/github.com/znly/protein/...

And voila, it compiles again!



Usage

Quickstart

This quickstart demonstrates the use of the Protein package in order to:

  • initialize a Transcoder
  • sniff the local protobuf schemas from memory
  • synchronize the local schema-database with a remote datastore (redis in this example)
  • use a Transcoder to encode & decode protobuf payloads using an already known schema
  • use a Transcoder to decode protobuf payloads without any prior knowledge of their schema

The complete code for the following quickstart can be found here.

Prerequisites

First, we need to set up a local redis server that will be used as a schema registry later on:

$ docker run -p 6379:6379 --name schema-reg --rm redis:3.2 redis-server

Then we open up a pool of connections to this server using garyburd/redigo:

p := &redis.Pool{
  Dial: func() (redis.Conn, error) {
    return redis.DialURL("redis://localhost:6379/0")
  },
}
defer p.Close()

Initializing a Transcoder

// initialize a `Transcoder` that is transparently kept in-sync with
// a `redis` datastore.
trc, err := NewTranscoder(
  // this context defines the timeout & deadline policies when pushing
  // schemas to the local `redis`; i.e. it is forwarded to the
  // `TranscoderSetter` function that is passed below
  context.Background(),
  // the schemas found in memory will be versioned using a MD5 hash
  // algorithm, prefixed by the 'PROT-' string
  protoscan.MD5, "PROT-",
  // configure the `Transcoder` to push every protobuf schema it can find
  // in memory into the specified `redis` connection pool
  TranscoderOptSetter(NewTranscoderSetterRedis(p)),
  // configure the `Transcoder` to query the given `redis` connection pool
  // when it cannot find a specific protobuf schema in its local cache
  TranscoderOptGetter(NewTranscoderGetterRedis(p)),
)

Now that the Transcoder has been initialized, the local redis datastore should contain all the protobuf schemas that were sniffed from memory, as defined by their respective versioning hashes:

$ docker run -it --link schema-reg:redis --rm redis:3.2 redis-cli -h redis -p 6379 -c KEYS '*'

  1) "PROT-31c64ad1c6476720f3afee6881e6f257"
  2) "PROT-56b347c6c212d3176392ab9bf5bb21ee"
  3) "PROT-c2dbc910081a372f31594db2dc2adf72"
  4) "PROT-09595a7e58d28b081d967b69cb00e722"
  5) "PROT-05dc5bd440d980600ecc3f1c4a8e315d"
  6) "PROT-8cbb4e79fdeadd5f0ff0971bbf7de31e"
     ... etc ...
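The keys above follow a simple shape: a user-chosen prefix plus a hex digest. A hypothetical helper sketching that shape (Protein's real schemaUIDs hash a schema's descriptor together with its whole dependency tree, not a single blob):

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// schemaUID is an illustrative stand-in: prefix + hex-encoded MD5 digest
func schemaUID(prefix string, descriptor []byte) string {
	return prefix + fmt.Sprintf("%x", md5.Sum(descriptor))
}

func main() {
	// yields a key shaped like the 'PROT-<MD5hex>' entries listed above
	fmt.Println(schemaUID("PROT-", []byte("example-descriptor-bytes")))
}
```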

Encoding stuff

We'll create a simple object for testing encoding & decoding functions, using the TestSchemaXXX protobuf schema defined here:

obj := &test.TestSchemaXXX{
  Ids: map[int32]string{
    42:  "the-answer",
    666: "the-devil",
  },
}

Encoding is nothing spectacular: the Transcoder hides the "hard" work of bundling the versioning metadata within the payload:

// wrap the object and its versioning metadata within a `ProtobufPayload` object,
// then serialize the bundle as a protobuf binary blob
payload, err := trc.Encode(obj)
if err != nil {
  log.Fatal(err)
}
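Under the hood, the bundle is a `ProtobufPayload` carrying the schemaUID alongside the raw bytes. The wrapping idea can be sketched with a JSON stand-in (`payloadEnvelope` and the helpers are hypothetical; the real wrapper is itself protobuf-encoded):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stand-in for Protein's ProtobufPayload wrapper
type payloadEnvelope struct {
	SchemaUID string `json:"schema_uid"` // versioning metadata
	Payload   []byte `json:"payload"`    // the marshaled message itself
}

// bundle the versioning metadata with the serialized message
func encodeEnvelope(schemaUID string, body []byte) ([]byte, error) {
	return json.Marshal(payloadEnvelope{SchemaUID: schemaUID, Payload: body})
}

// unbundle the metadata from the data
func decodeEnvelope(blob []byte) (payloadEnvelope, error) {
	var env payloadEnvelope
	err := json.Unmarshal(blob, &env)
	return env, err
}

func main() {
	blob, _ := encodeEnvelope("PROT-31c64ad1c6476720f3afee6881e6f257", []byte("raw-protobuf-bytes"))
	env, _ := decodeEnvelope(blob)
	fmt.Println(env.SchemaUID, string(env.Payload))
}
```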

Decoding stuff

We'll try to decode the previous payload into the following object:

var myObj test.TestSchemaXXX

Trying to decode the payload using the standard proto.Unmarshal method will fail in quite cryptic ways since vanilla protobuf is unaware of how Protein bundles the versioning metadata within the payload... Don't do this!

_ = proto.Unmarshal(payload, &myObj) // NOPE NOPE NOPE!

Using the Transcoder, on the other hand, properly unbundles the data from the metadata before unmarshalling the payload:

err = trc.DecodeAs(payload, &myObj)
if err != nil {
  log.Fatal(err)
}
fmt.Println(myObj.Ids[42]) // prints the answer!

Decoding stuff dynamically (!)

Runtime-decoding does not require any more effort on the part of the end-user than simple decoding does, although a lot is actually happening behind the scenes:

  1. the versioning metadata is extracted from the payload
  2. the corresponding schema as well as its dependencies are lazily fetched from the redis datastore if they're not already available in the local cache (using the TranscoderGetter that was passed to the constructor)
  3. a structure-type definition is created from these schemas using Go's reflection APIs, with the right protobuf tags & hints for the protobuf deserializer to do its thing
  4. an instance of this structure is created, then the payload is unmarshalled into it

myRuntimeObj, err := trc.Decode(context.Background(), payload)
if err != nil {
  log.Fatal(err)
}
myRuntimeIDs := myRuntimeObj.Elem().FieldByName("IDs")
fmt.Println(myRuntimeIDs.MapIndex(reflect.ValueOf(int32(666)))) // prints the devil!

Error handling

Protein uses the pkg/errors package to handle error propagation throughout the call stack; please take a look at the related documentation for more information on how to properly handle these errors.

Logging

Protein rarely logs, but when it does, it uses the global logger from Uber's Zap package.
You can thus control the behavior of Protein's logger however you like by calling zap.ReplaceGlobals at your convenience.

For more information, see Zap's documentation.

Monitoring

Protein does not offer any kind of monitoring hooks, yet.

Performance

Configuration:

MacBook Pro (Retina, 15-inch, Mid 2015)
Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz

Encoding:

## gogo/protobuf ##

BenchmarkTranscoder_Encode/gogo/protobuf        300000    4287 ns/op
BenchmarkTranscoder_Encode/gogo/protobuf-2     1000000    2195 ns/op
BenchmarkTranscoder_Encode/gogo/protobuf-4     1000000    1268 ns/op
BenchmarkTranscoder_Encode/gogo/protobuf-8     1000000    1258 ns/op
BenchmarkTranscoder_Encode/gogo/protobuf-24    1000000    1536 ns/op

## znly/protein ##

BenchmarkTranscoder_Encode/znly/protein         300000    5556 ns/op
BenchmarkTranscoder_Encode/znly/protein-2       500000    2680 ns/op
BenchmarkTranscoder_Encode/znly/protein-4      1000000    1638 ns/op
BenchmarkTranscoder_Encode/znly/protein-8      1000000    1798 ns/op
BenchmarkTranscoder_Encode/znly/protein-24     1000000    2288 ns/op

Decoding:

## gogo/protobuf ##

BenchmarkTranscoder_DecodeAs/gogo/protobuf      300000    5970 ns/op
BenchmarkTranscoder_DecodeAs/gogo/protobuf-2    500000    3226 ns/op
BenchmarkTranscoder_DecodeAs/gogo/protobuf-4   1000000    2125 ns/op
BenchmarkTranscoder_DecodeAs/gogo/protobuf-8   1000000    2015 ns/op
BenchmarkTranscoder_DecodeAs/gogo/protobuf-24  1000000    2380 ns/op

## znly/protein ##

BenchmarkTranscoder_Decode/znly/protein        200000     6777 ns/op
BenchmarkTranscoder_Decode/znly/protein-2      500000     3986 ns/op
BenchmarkTranscoder_Decode/znly/protein-4      500000     2630 ns/op
BenchmarkTranscoder_Decode/znly/protein-8      500000     2973 ns/op
BenchmarkTranscoder_Decode/znly/protein-24     500000     3037 ns/op

See transcoder_test.go for the actual benchmarking code.

Contributing

Contributions of any kind are welcome; whether it is to fix a bug, clarify some documentation/comments or simply correct English mistakes and typos: do feel free to send us a pull request.

Protein is pretty much frozen in terms of features; if you still find it to be lacking something, please file an issue to discuss it first.
Also, do not hesitate to open an issue if some piece of documentation looks unclear or incomplete to you, or is just missing entirely.

Code contributions must be thoroughly tested and documented.

Running tests

$ docker-compose -f test/docker-compose.yml up
$ ## wait for the datastores to be up & running, then
$ make test

Running benchmarks

$ make bench

Authors

See AUTHORS for the list of contributors.

See also

License

The Apache License version 2.0 (Apache2) - see LICENSE for more details.

Copyright (c) 2017 Zenly hello@zen.ly @zenlyapp

Documentation

Overview

Package protein is an encoding/decoding library for Protobuf that comes with schema-versioning and runtime-decoding capabilities.

It has diverse use-cases, including but not limited to:

  • setting up schema registries
  • decoding Protobuf payloads without the need to know their schema at compile-time
  • identifying & preventing applicative bugs and data corruption issues
  • creating custom-made container formats for on-disk storage
  • ...and more!
Example

This example demonstrates the use of the Protein package in order to: initialize a `Transcoder`, sniff the local protobuf schemas from memory, synchronize the local schema-database with a remote datastore (here `redis`), use a `Transcoder` to encode & decode protobuf payloads using an already known schema, and use a `Transcoder` to decode protobuf payloads without any prior knowledge of their schema.

// A local `redis` server must be up & running for this example to work:
//
//   $ docker run -p 6379:6379 --name my-redis --rm redis:3.2 redis-server

// open up a new `redis` connection pool
p := &redis.Pool{
	Dial: func() (redis.Conn, error) {
		return redis.DialURL("redis://localhost:6379/0")
	},
}
defer p.Close()

/* INITIALIZATION */

// initialize a `Transcoder` that is transparently kept in-sync with
// a `redis` datastore.
trc, err := NewTranscoder(
	// this context defines the timeout & deadline policies when pushing
	// schemas to the local `redis`; i.e. it is forwarded to the
	// `TranscoderSetter` function that is passed below
	context.Background(),
	// the schemas found in memory will be versioned using a MD5 hash
	// algorithm, prefixed by the 'PROT-' string
	protoscan.MD5, "PROT-",
	// configure the `Transcoder` to push every protobuf schema it can find
	// in memory into the specified `redis` connection pool
	TranscoderOptSetter(NewTranscoderSetterRedis(p)),
	// configure the `Transcoder` to query the given `redis` connection pool
	// when it cannot find a specific protobuf schema in its local cache
	TranscoderOptGetter(NewTranscoderGetterRedis(p)),
)

// At this point, the local `redis` datastore should contain all the
// protobuf schemas known to the `Transcoder`, as defined by their respective
// versioning hashes:
//
//   $ docker run -it --link my-redis:redis --rm redis:3.2 redis-cli -h redis -p 6379 -c KEYS '*'
//
//   1) "PROT-31c64ad1c6476720f3afee6881e6f257"
//   2) "PROT-56b347c6c212d3176392ab9bf5bb21ee"
//   3) "PROT-c2dbc910081a372f31594db2dc2adf72"
//   4) "PROT-09595a7e58d28b081d967b69cb00e722"
//   5) "PROT-05dc5bd440d980600ecc3f1c4a8e315d"
//   6) "PROT-8cbb4e79fdeadd5f0ff0971bbf7de31e"
//   ... etc ...

/* ENCODING */

// create a simple object to be serialized
ts, _ := types.TimestampProto(time.Now())
obj := &test.TestSchemaXXX{
	Ids: map[int32]string{
		42:  "the-answer",
		666: "the-devil",
	},
	Ts: *ts,
}

// wrap the object and its versioning metadata within a `ProtobufPayload`
// object, then serialize the bundle as a protobuf binary blob
payload, err := trc.Encode(obj)
if err != nil {
	log.Fatal(err)
}

/* DECODING */

var myObj test.TestSchemaXXX

// /!\ This will fail in cryptic ways since vanilla protobuf is unaware
// of how Protein bundles the versioning metadata within the payload...
// Don't do this!
_ = proto.Unmarshal(payload, &myObj)

// this will properly unbundle the data from the metadata before
// unmarshalling the payload
err = trc.DecodeAs(payload, &myObj)
if err != nil {
	log.Fatal(err)
}
fmt.Println("A:", myObj.Ids[42]) // prints the answer!

/* RUNTIME-DECODING */

// empty the content of the local cache of protobuf schemas in order to
// make sure that the `Transcoder` will have to lazily fetch the schema
// and its dependencies from the `redis` datastore during the decoding
trc.sm = NewSchemaMap()

// the `Transcoder` will do a lot of stuff behind the scenes so it can
// successfully decode the payload:
// 1. the versioning metadata is extracted from the payload
// 2. the corresponding schema as well as its dependencies are lazily
//    fetched from the `redis` datastore (using the `TranscoderGetter` that
//    was passed to the constructor)
// 3. a structure-type definition is created from these schemas using Go's
//    reflection APIs, with the right protobuf tags & hints for the protobuf
//    deserializer to do its thing
// 4. an instance of this structure is created, then the payload is
//    unmarshalled into it
myRuntimeObj, err := trc.Decode(context.Background(), payload)
if err != nil {
	log.Fatal(err)
}
myRuntimeIDs := myRuntimeObj.Elem().FieldByName("IDs")
fmt.Println("B:", myRuntimeIDs.MapIndex(reflect.ValueOf(int32(666)))) // prints the devil!
Output:

A: the-answer
B: the-devil

Constants

This section is empty.

Variables

var (
	// TranscoderOptGetter is used to configure the `TranscoderGetter` used by
	// the `Transcoder`.
	// See `TranscoderGetter` documentation for more information.
	TranscoderOptGetter = func(getter TranscoderGetter) TranscoderOpt {
		return func(trc *Transcoder) { trc.getter = getter }
	}
	// TranscoderOptSetter is used to configure the `TranscoderSetter` used by
	// the `Transcoder`.
	// See `TranscoderSetter` documentation for more information.
	TranscoderOptSetter = func(setter TranscoderSetter) TranscoderOpt {
		return func(trc *Transcoder) { trc.setter = setter }
	}
	// TranscoderOptSerializer is used to configure the `TranscoderSerializer`
	// used by the `Transcoder`.
	// See `TranscoderSerializer` documentation for more information.
	TranscoderOptSerializer = func(serializer TranscoderSerializer) TranscoderOpt {
		return func(trc *Transcoder) { trc.serializer = serializer }
	}
	// TranscoderOptDeserializer is used to configure the `TranscoderDeserializer`
	// used by the `Transcoder`.
	// See `TranscoderDeserializer` documentation for more information.
	TranscoderOptDeserializer = func(deserializer TranscoderDeserializer) TranscoderOpt {
		return func(trc *Transcoder) { trc.deserializer = deserializer }
	}
)

Functions

func CreateStructType

func CreateStructType(schemaUID string, sm *SchemaMap) (reflect.Type, error)

CreateStructType constructs a new structure-type definition at runtime from a `ProtobufSchema` tree.

This newly-created structure embeds all the necessary tags & hints for the protobuf SDK to deserialize payloads into it. I.e., it allows for runtime-decoding of protobuf payloads.

This is a complex and costly operation; it is strongly recommended to cache the result, as the `Transcoder` does.

It requires the new `reflect` APIs provided by Go 1.7+.

Example

This simple example demonstrates how to use the protostruct API in order to create a structure-type at runtime for a given protobuf schema, specified by its fully-qualified name.

// sniff all of local protobuf schemas and store them in a `SchemaMap`
sm, err := ScanSchemas(protoscan.SHA256, "PROT-")
if err != nil {
	zap.L().Fatal(err.Error())
}

// create a structure-type definition for the '.test.TestSchemaXXX'
// protobuf schema
structType, err := CreateStructType(
	sm.GetByFQName(".test.TestSchemaXXX").SchemaUID, sm,
)
if err != nil {
	zap.L().Fatal(err.Error())
}

// pretty-print the resulting structure-type
structType = tagcleaner.Clean(structType) // remove tags to ease reading
b, err := format.Source(                  // gofmt
	[]byte(fmt.Sprintf("type TestSchemaXXX %s", structType)),
)
if err != nil {
	zap.L().Fatal(err.Error())
}
fmt.Println(string(b))
Output:

type TestSchemaXXX struct {
	SchemaUID string
	FQNames   string
	Deps      struct {
		Key   string
		Value string
	}
	IDs string
	TS  struct {
		Seconds int64
		Nanos   int32
	}
	Ots struct {
		TS struct {
			Seconds int64
			Nanos   int32
		}
	}
	Nss struct {
		Key   string
		Value string
	}
}

Types

type ProtobufPayload

type ProtobufPayload struct {
	// SchemaUID is the unique, deterministic & versioned identifier of the
	// `payload`'s schema.
	SchemaUID string `protobuf:"bytes,1,opt,name=schema_uid,json=schemaUid,proto3" json:"schema_uid,omitempty"`
	// Payload is the actual, marshaled protobuf payload.
	Payload []byte `protobuf:"bytes,2,opt,name=payload,proto3" json:"payload,omitempty"`
}

ProtobufPayload is a protobuf payload annotated with the unique versioning identifier of its schema.

This allows a `ProtobufPayload` to be decoded at runtime using Protein's `Transcoder`.

See `ScanSchemas`'s documentation for more information.

func (*ProtobufPayload) Descriptor

func (*ProtobufPayload) Descriptor() ([]byte, []int)

func (*ProtobufPayload) GetPayload

func (m *ProtobufPayload) GetPayload() []byte

func (*ProtobufPayload) GetSchemaUID

func (m *ProtobufPayload) GetSchemaUID() string

func (*ProtobufPayload) GoString

func (this *ProtobufPayload) GoString() string

func (*ProtobufPayload) ProtoMessage

func (*ProtobufPayload) ProtoMessage()

func (*ProtobufPayload) Reset

func (m *ProtobufPayload) Reset()

func (*ProtobufPayload) String

func (m *ProtobufPayload) String() string

type ProtobufSchema

type ProtobufSchema struct {
	// SchemaUID is the unique, deterministic & versioned identifier of this
	// schema.
	SchemaUID string `protobuf:"bytes,1,opt,name=schema_uid,json=schemaUid,proto3" json:"schema_uid,omitempty"`
	// FQName is the fully-qualified name of this schema,
	// e.g. `.google.protobuf.Timestamp`.
	FQName string `protobuf:"bytes,2,opt,name=fq_name,json=fqName,proto3" json:"fq_name,omitempty"`
	// Descriptor is either a Message or an Enum protobuf descriptor.
	//
	// Types that are valid to be assigned to Descr:
	//	*ProtobufSchema_Message
	//	*ProtobufSchema_Enum
	Descr isProtobufSchema_Descr `protobuf_oneof:"descr"`
	// Deps contains every direct and indirect dependencies that this schema
	// relies on.
	//
	// Key: the dependency's `schemaUID`
	// Value: the dependency's fully-qualified name
	Deps map[string]string `` /* 132-byte string literal not displayed */
}

ProtobufSchema is a versioned protobuf Message or Enum descriptor that can be used to decode `ProtobufPayload`s at runtime.

See `ScanSchemas`'s documentation for more information.

func (*ProtobufSchema) Descriptor

func (*ProtobufSchema) Descriptor() ([]byte, []int)

func (*ProtobufSchema) GetDeps

func (m *ProtobufSchema) GetDeps() map[string]string

func (*ProtobufSchema) GetDescr

func (m *ProtobufSchema) GetDescr() isProtobufSchema_Descr

func (*ProtobufSchema) GetEnum

func (*ProtobufSchema) GetFQName

func (m *ProtobufSchema) GetFQName() string

func (*ProtobufSchema) GetMessage

func (*ProtobufSchema) GetSchemaUID

func (m *ProtobufSchema) GetSchemaUID() string

func (*ProtobufSchema) GoString

func (this *ProtobufSchema) GoString() string

func (*ProtobufSchema) ProtoMessage

func (*ProtobufSchema) ProtoMessage()

func (*ProtobufSchema) Reset

func (m *ProtobufSchema) Reset()

func (*ProtobufSchema) String

func (m *ProtobufSchema) String() string

func (*ProtobufSchema) XXX_OneofFuncs

func (*ProtobufSchema) XXX_OneofFuncs() (func(msg proto.Message, b *proto.Buffer) error, func(msg proto.Message, tag, wire int, b *proto.Buffer) (bool, error), func(msg proto.Message) (n int), []interface{})

XXX_OneofFuncs is for the internal use of the proto package.

type ProtobufSchema_Enum

type ProtobufSchema_Enum struct {
	Enum *google_protobuf.EnumDescriptorProto `protobuf:"bytes,31,opt,name=enm,oneof"`
}

func (*ProtobufSchema_Enum) GoString

func (this *ProtobufSchema_Enum) GoString() string

type ProtobufSchema_Message

type ProtobufSchema_Message struct {
	Message *google_protobuf.DescriptorProto `protobuf:"bytes,30,opt,name=msg,oneof"`
}

func (*ProtobufSchema_Message) GoString

func (this *ProtobufSchema_Message) GoString() string

type SchemaMap

type SchemaMap struct {
	// contains filtered or unexported fields
}

SchemaMap is a thread-safe mapping & reverse-mapping of `ProtobufSchema`s.

It atomically maintains two data-structures in parallel: a map of schemaUIDs to `ProtobufSchema`s, and a map of fully-qualified schema names to schemaUIDs.

The `SchemaMap` is the main data-structure behind a `Transcoder`, used to store and retrieve every `ProtobufSchema`s that have been cached locally.
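The map/reverse-map pairing can be sketched as follows (a simplified, hypothetical mock: the real `SchemaMap` stores full `ProtobufSchema` values and exposes `Add`/`GetByUID`/`GetByFQName`):

```go
package main

import (
	"fmt"
	"sync"
)

type schemaMap struct {
	mu       sync.RWMutex
	byUID    map[string]string   // schemaUID -> fully-qualified name
	byFQName map[string][]string // fully-qualified name -> schemaUIDs, in insertion order
}

func newSchemaMap() *schemaMap {
	return &schemaMap{byUID: map[string]string{}, byFQName: map[string][]string{}}
}

// maintain both indexes under a single lock so they never diverge
func (sm *schemaMap) add(uid, fqName string) {
	sm.mu.Lock()
	defer sm.mu.Unlock()
	sm.byUID[uid] = fqName
	sm.byFQName[fqName] = append(sm.byFQName[fqName], uid)
}

// resolve a fully-qualified name to its oldest known schemaUID
func (sm *schemaMap) getByFQName(fqName string) string {
	sm.mu.RLock()
	defer sm.mu.RUnlock()
	if uids := sm.byFQName[fqName]; len(uids) > 0 {
		return uids[0]
	}
	return ""
}

func main() {
	sm := newSchemaMap()
	sm.add("PROT-v1", ".test.TestSchemaXXX")
	sm.add("PROT-v2", ".test.TestSchemaXXX")          // a newer version of the same schema
	fmt.Println(sm.getByFQName(".test.TestSchemaXXX")) // the first inserted version wins
}
```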

func NewSchemaMap

func NewSchemaMap() *SchemaMap

NewSchemaMap returns a new SchemaMap with its maps & locks pre-allocated.

func ScanSchemas

func ScanSchemas(
	hasher protoscan.Hasher, hashPrefix string, failOnDuplicate ...bool,
) (*SchemaMap, error)

ScanSchemas retrieves every protobuf schema instantiated by any of the currently loaded protobuf libraries (e.g. `golang/protobuf`, `gogo/protobuf`...), computes the dependency graphs that link them, builds the `ProtobufSchema` objects then returns a new `SchemaMap` filled with all those schemas.

A `ProtobufSchema` is a data-structure that holds a protobuf descriptor as well as a map of all its dependencies' schemaUIDs. A schemaUID uniquely & deterministically identifies a protobuf schema based on its descriptor and all of its dependencies' descriptors. It essentially is the versioned identifier of a schema. For more information, see `ProtobufSchema`'s documentation as well as the `protoscan` implementation, especially `descriptor_tree.go`.

The specified `hasher` is the actual function used to compute these schemaUIDs, see the `protoscan.Hasher` documentation for more information. The `hashPrefix` string will be prepended to the resulting hash that was computed via the `hasher` function. E.g. by passing `protoscan.MD5` as a `Hasher` and `PROT-` as a `hashPrefix`, the resulting schemaUIDs will be of the form 'PROT-<MD5hex>'.

As a schema and/or its dependencies follow their natural evolution, each and every historic version of them will thus have been stored with their own unique identifiers.

`failOnDuplicate` is an optional parameter that defaults to true; have a look at the `ScanSchemas` implementation to understand what it does and when (if ever) you would need to set it to false instead.

Finally, have a look at the `protoscan` sub-packages as a whole for more information about how all of this machinery works; the code is heavily documented.

Example

This example demonstrates how to use Protein's `ScanSchemas` function in order to sniff all the locally instantiated schemas into a `SchemaMap`, then walk over this map in order to print each schema as well as its dependencies.

// sniff local protobuf schemas into a `SchemaMap` using a MD5 hasher,
// and prefixing each resulting UID with 'PROT-'
sm, err := ScanSchemas(protoscan.MD5, "PROT-")
if err != nil {
	zap.L().Fatal(err.Error())
}

// walk over the map to print the schemas and their respective dependencies
var output string
sm.ForEach(func(ps *ProtobufSchema) error {
	// discard schemas not originating from protein's test package
	if !strings.HasPrefix(ps.GetFQName(), ".test") {
		return nil
	}
	output += fmt.Sprintf("[%s] %s\n", ps.GetSchemaUID(), ps.GetFQName())
	for uid, name := range ps.GetDeps() {
		// discard dependencies not originating from protein's test package
		if strings.HasPrefix(name, ".test") {
			output += fmt.Sprintf(
				"[%s] depends on: [%s] %s\n", ps.GetSchemaUID(), uid, name,
			)
		}
	}
	return nil
})

// sort rows so the output stays deterministic
rows := strings.Split(output, "\n")
sort.Strings(rows)
for i, r := range rows { // prettying
	if strings.Contains(r, "depends on") {
		rows[i] = "\t" + strings.Join(strings.Split(r, " ")[1:], " ")
	}
}
output = strings.Join(rows, "\n")
fmt.Println(output)
Output:

[PROT-048ddab197df688302a76296293ba101] .test.OtherTestSchemaXXX
[PROT-393cb6dc1b4fc350cf10ca99f429301d] .test.TestSchemaXXX.WeatherType
[PROT-4f6928d2737ba44dac0e3df123f80284] .test.TestSchema.DepsEntry
[PROT-528e869395e00dd5525c2c2c69bbd4d0] .test.TestSchema
	depends on: [PROT-4f6928d2737ba44dac0e3df123f80284] .test.TestSchema.DepsEntry
[PROT-6926276ca6306966d1a802c3b8f75298] .test.TestSchemaXXX.IdsEntry
[PROT-c2dbc910081a372f31594db2dc2adf72] .test.TestSchemaXXX
	depends on: [PROT-048ddab197df688302a76296293ba101] .test.OtherTestSchemaXXX
	depends on: [PROT-6926276ca6306966d1a802c3b8f75298] .test.TestSchemaXXX.IdsEntry
	depends on: [PROT-c43da9745d68bd3cb97dc0f4905f3279] .test.TestSchemaXXX.NestedEntry
	depends on: [PROT-f6be24770f6e8d5edc8ef12c94a23010] .test.TestSchemaXXX.DepsEntry
[PROT-c43da9745d68bd3cb97dc0f4905f3279] .test.TestSchemaXXX.NestedEntry
[PROT-f6be24770f6e8d5edc8ef12c94a23010] .test.TestSchemaXXX.DepsEntry
	depends on: [PROT-c43da9745d68bd3cb97dc0f4905f3279] .test.TestSchemaXXX.NestedEntry

func (*SchemaMap) Add

func (sm *SchemaMap) Add(schemas map[string]*ProtobufSchema) *SchemaMap

Add walks over the given map of `schemas` and adds them to the `SchemaMap` while making sure to atomically maintain both the internal map and reverse-map.

func (*SchemaMap) ForEach

func (sm *SchemaMap) ForEach(f func(s *ProtobufSchema) error) error

ForEach applies the specified function `f` to every entry in the `SchemaMap`.

ForEach is guaranteed to see a consistent view of the internal mapping.

func (*SchemaMap) GetByFQName

func (sm *SchemaMap) GetByFQName(fqName string) *ProtobufSchema

GetByFQName returns the first `ProtobufSchema` that matches the specified fully-qualified name (e.g. `.google.protobuf.Timestamp`).

If more than one version of the same schema is stored in the `SchemaMap`, a fully-qualified name will naturally point to several distinct schemaUIDs. When this happens, `GetByFQName` will always return the first one to have been inserted into the map.

This is thread-safe.

func (*SchemaMap) GetByUID

func (sm *SchemaMap) GetByUID(schemaUID string) *ProtobufSchema

GetByUID returns the `ProtobufSchema` associated with the specified `schemaUID`.

This is thread-safe.

func (*SchemaMap) Size

func (sm *SchemaMap) Size() int

Size returns the number of schemas stored in the `SchemaMap`.

type Transcoder

type Transcoder struct {
	// contains filtered or unexported fields
}

A Transcoder is a protobuf encoder/decoder with schema-versioning as well as runtime-decoding capabilities.

func NewTranscoder

func NewTranscoder(ctx context.Context,
	hasher protoscan.Hasher, hashPrefix string, opts ...TranscoderOpt,
) (*Transcoder, error)

NewTranscoder returns a new `Transcoder`.

See `ScanSchemas`'s documentation for more information regarding the use of `hasher` and `hashPrefix`.

See `TranscoderOpt`'s documentation for the list of available options.

The given context is passed to the user-specified `TranscoderSetter`, if any.

func (*Transcoder) Decode

func (t *Transcoder) Decode(
	ctx context.Context, payload []byte,
) (reflect.Value, error)

Decode decodes the given protein-encoded `payload` into a dynamically generated structure-type.

It is used when you need to work with protein-encoded data in a completely agnostic way (e.g. when you merely know the names of the fields you're interested in, as in a generic data-enricher).

When decoding a specific version of a schema for the first-time in the lifetime of a `Transcoder`, a structure-type must be created from the dependency tree of this schema. This is a costly operation that involves a lot of reflection, see `CreateStructType` documentation for more information. Fortunately, the resulting structure-type is cached so that it can be freely re-used by later calls to `Decode`; i.e. you pay the price only once.

Also, when trying to decode a specific schema for the first-time, `Decode` might not have all of the dependencies directly available in its local `SchemaMap`, in which case it will call the user-defined `TranscoderGetter` in the hope that it might return these missing dependencies. This user-defined function may or may not do some kind of I/O; the given context will be passed to it.

Once again, this price is paid only once.

func (*Transcoder) DecodeAs

func (t *Transcoder) DecodeAs(payload []byte, msg proto.Message) error

DecodeAs decodes the given protein-encoded `payload` into the specified protobuf `Message` using the standard protobuf methods, thus bypassing all of the runtime-decoding and schema versioning machinery.

It is very often used when you need to work with protein-encoded data in a non-agnostic way (i.e. when you know beforehand how you want to decode and interpret the data).

`DecodeAs` basically adds zero overhead compared to a straightforward `proto.Unmarshal` call.

`DecodeAs` never does any kind of I/O.

func (*Transcoder) Encode

func (t *Transcoder) Encode(msg proto.Message, fqName ...string) ([]byte, error)

Encode bundles the given protobuf `Message` and its associated versioning metadata together within a `ProtobufPayload`, marshals it all together in a byte-slice then returns the result.

`Encode` needs the message's fully-qualified name in order to reverse-lookup its schemaUID (i.e. its versioning hash).

In order to find this name, it looks in several places until one of them returns a result; if none does, the encoding fails. In search order, those places are:

  1. the `fqName` parameter; if it isn't set, then
  2. the `golang/protobuf` package is queried for the FQN; if it isn't available there, then
  3. the `gogo/protobuf` package is queried too, as a last resort.
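That search order amounts to a first-match fallback chain, which can be sketched with hypothetical helpers (`resolveFQName` and the registry closures are illustrative, not Protein's actual code):

```go
package main

import (
	"errors"
	"fmt"
)

// try the explicit name first, then each registry lookup in order
func resolveFQName(explicit string, lookups ...func() string) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	for _, lookup := range lookups {
		if name := lookup(); name != "" {
			return name, nil
		}
	}
	return "", errors.New("protein: fully-qualified name not found")
}

func main() {
	golangLookup := func() string { return "" }                  // e.g. golang/protobuf registry: no hit
	gogoLookup := func() string { return ".test.TestSchemaXXX" } // e.g. gogo/protobuf registry: hit
	name, err := resolveFQName("", golangLookup, gogoLookup)
	fmt.Println(name, err)
}
```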

Note that a single fully-qualified name might point to multiple schemaUIDs if multiple versions of that schema are currently available in the `SchemaMap`. When this happens, the first schemaUID from the list will be used, which corresponds to the first version of the schema to have ever been added to the local `SchemaMap` (i.e. the oldest one).

type TranscoderDeserializer

type TranscoderDeserializer func(payload []byte, ps *ProtobufSchema) error

A TranscoderDeserializer is used to deserialize the payloads returned by a `TranscoderGetter` into a `ProtobufSchema`. See `TranscoderGetter` documentation for more information.

The default `TranscoderDeserializer` unwraps the schema from its `ProtobufPayload` wrapper; i.e. it uses Protein's decoding to decode the schema.

type TranscoderGetter

type TranscoderGetter func(ctx context.Context, schemaUID string) ([]byte, error)

A TranscoderGetter is called by the `Transcoder` when it cannot find a specific `schemaUID` in its local `SchemaMap`.

The function returns a byte-slice that will be deserialized into a `ProtobufSchema` by a `TranscoderDeserializer` (see above).

A `TranscoderGetter` is typically used to fetch `ProtobufSchema`s from a remote data-store. To that end, several ready-to-use implementations are provided by this package for different protocols: memcached, redis & CQL (i.e. cassandra). See `transcoder_helpers.go` for more information.

The default `TranscoderGetter` always returns a not-found error.

func NewTranscoderGetterCassandra

func NewTranscoderGetterCassandra(s *gocql.Session,
	table, keyCol, dataCol string,
) TranscoderGetter

NewTranscoderGetterCassandra returns a `TranscoderGetter` suitable for querying a binary blob from a cassandra-compatible store.

The <table> column-family is expected to have (at least) the following columns:

TABLE (
  <keyCol> ascii,
  <dataCol> blob,
  PRIMARY KEY (<keyCol>)
)

The given context is forwarded to `gocql`.
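Under that assumption, such a column-family could be created with CQL along these lines (the table and column names are placeholders, not names the package requires):

```sql
CREATE TABLE schemas (
  schema_uid ascii,
  data blob,
  PRIMARY KEY (schema_uid)
);
```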

func NewTranscoderGetterMemcached

func NewTranscoderGetterMemcached(c *memcache.Client) TranscoderGetter

NewTranscoderGetterMemcached returns a `TranscoderGetter` suitable for querying a binary blob from a memcached-compatible store.

The specified context will be ignored.

func NewTranscoderGetterRedis

func NewTranscoderGetterRedis(p *redis.Pool) TranscoderGetter

NewTranscoderGetterRedis returns a `TranscoderGetter` suitable for querying a binary blob from a redis-compatible store.

The specified context will be ignored.

type TranscoderOpt

type TranscoderOpt func(trc *Transcoder)

A TranscoderOpt is passed to the `Transcoder` constructor to configure various options.

type TranscoderSerializer

type TranscoderSerializer func(ps *ProtobufSchema) ([]byte, error)

A TranscoderSerializer is used to serialize `ProtobufSchema`s before passing them to a `TranscoderSetter`. See `TranscoderSetter` documentation for more information.

The default `TranscoderSerializer` wraps the schema within a `ProtobufPayload`; i.e. it uses Protein's encoding to encode the schema.

type TranscoderSetter

type TranscoderSetter func(ctx context.Context, schemaUID string, payload []byte) error

A TranscoderSetter is called by the `Transcoder` for every schema that it can find in memory.

The function receives a byte-slice that corresponds to a `ProtobufSchema` which has been previously serialized by a `TranscoderSerializer` (see above).

A `TranscoderSetter` is typically used to push the local `ProtobufSchema`s sniffed from memory into a remote data-store. To that end, several ready-to-use implementations are provided by this package for different protocols: memcached, redis & CQL (i.e. cassandra). See `transcoder_helpers.go` for more information.

The default `TranscoderSetter` is a no-op.

func NewTranscoderSetterCassandra

func NewTranscoderSetterCassandra(s *gocql.Session,
	table, keyCol, dataCol string,
) TranscoderSetter

NewTranscoderSetterCassandra returns a `TranscoderSetter` suitable for setting a binary blob into a cassandra-compatible store.

The <table> column-family is expected to have (at least) the following columns:

TABLE (
  <keyCol> ascii,
  <dataCol> blob,
  PRIMARY KEY (<keyCol>)
)

The given context is forwarded to `gocql`.

func NewTranscoderSetterMemcached

func NewTranscoderSetterMemcached(c *memcache.Client) TranscoderSetter

NewTranscoderSetterMemcached returns a `TranscoderSetter` suitable for setting a binary blob into a memcached-compatible store.

The specified context will be ignored.

func NewTranscoderSetterRedis

func NewTranscoderSetterRedis(p *redis.Pool) TranscoderSetter

NewTranscoderSetterRedis returns a `TranscoderSetter` suitable for setting a binary blob into a redis-compatible store.

The specified context will be ignored.

Directories

Path Synopsis
Package failure lists all the possible errors that can be returned either by `protein` or any of its sub-packages.
internal
goobj
Package goobj implements reading of Go object files and archives.
obj
objfile
Package objfile implements portable access to OS-specific executable files.
sys
protobuf
test
Package test is a generated protocol buffer package.
Package protoscan provides the necessary tools & APIs to find, extract, version and build the dependency trees of all the protobuf schemas that have been instantiated by one or more protobuf libraries (golang/protobuf, gogo/protobuf...).
