bin

v0.3.0
Published: Jul 28, 2023 License: MIT Imports: 7 Imported by: 1

README

Binmap

I've found that using the stdlib binary interface to read and write data is a little cumbersome and tedious, since any operation can result in an error. While this makes sense given the problem domain, the API leaves something to be desired.

I'd love to have a way to batch operations, so I don't have so much if err != nil. If an error occurs at any point, then I'm able to fail fast and handle one error at the end.

I'd also like to work easily with io.Readers rather than having to read everything into memory first to dissect it piecemeal. While this can be accomplished with binary.Read, I still have the issue of too much error handling code cluttered around the code I want to write.

Goals

  • I'd like to have an easier to use interface for reading/writing binary data.
  • I'd like to declare binary IO operations, execute them, and handle a single error at the end.
  • I'd like to be able to reuse binary IO operations, and even pass them into more complex pipelines.
  • I'd like to be able to declare dynamic behavior, like when the size of the next read is determined by the current field.
  • I'd like to declare a read loop based on a read field value, and pass the loop construct to a larger pipeline.
  • Struct tag field binding would be fantastic, but reflection is... fraught. I'll see how this goes, and I'll probably take some hints from how the stdlib is handling this.
    • There's too much possibility of dynamic or dependent logic with a lot of binary payloads, and the number of edge cases for implementing this is more than I want to deal with.
    • I'm pretty happy with the API for mapping definition so far, and I'd rather simplify that than get into reflection with struct field tags. I feel like it's much more understandable (and thus maintainable) code.

How it works

This package centers around the Mapper interface. A mapper is anything that knows how to read and write binary data, and the Any mapper makes it easy to adapt this to new data types with custom logic.

Any given Mapper is expected to be short-lived, especially if the underlying data representation in Go code is changing often. This mechanism makes heavy use of pointers, and even pointer-pointers in some cases, which means that there's a fair bit of indirection to make this work. There are also a lot of generics in this code to both limit input types to what is readily supported, and to keep duplication to a minimum.

Note that using different mapper procedures between software versions is effectively a breaking change, and should be handled like you would handle database migrations. There are certain patterns that make this easier to work with, explained below.

Directly supported types

There are several primitive types that are directly supported. Keep in mind that type restrictions mostly come from what binary.Read and binary.Write support, and this package also inherits the design constraints of simplicity over speed mentioned in the encoding/binary docs.

  • Integers with Int.
    • Note that int and uint are not supported because these are not necessarily of a known binary size at compile time.
  • Floats with Float.
  • Booleans with Bool.
  • Bytes with Byte, and byte slices with FixedBytes and LenBytes.
  • Complex 64/128 with Complex.
  • Signed and unsigned varints with Varint/Uvarint.
  • General slice mappers are provided with Slice, LenSlice, and DynamicSlice.
  • Size types with Size, which are restricted to any known-size, unsigned integer.
  • Strings, both with FixedString for fixed-width string fields, and null-terminated strings with NullTermString.
  • More interesting types, such as Map for arbitrary maps, and even DataTable for persisting structs-of-arrays.
  • As already mentioned, the Any mapper can be used to add arbitrary mapping logic for any type you'd like to express.
    • An Any mapper just needs a ReadFunc and WriteFunc.
    • This mapper function doesn't require a target because it's intended to be flexible, and the assumption is that a target would be available in a closure context.
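
The restriction on int and uint can be checked directly against encoding/binary. The sketch below uses only the standard library; fixedSize is a hypothetical helper name, and it relies on binary.Size returning -1 for values that have no fixed encoded size.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fixedSize (hypothetical helper) reports whether encoding/binary considers
// v to have a fixed encoded size. binary.Size returns -1 when it does not.
func fixedSize(v any) bool {
	return binary.Size(v) >= 0
}

func main() {
	fmt.Println("uint64:", fixedSize(uint64(0)))
	fmt.Println("int:   ", fixedSize(int(0)))
	fmt.Println("uint:  ", fixedSize(uint(0)))
}
```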

Common patterns

There are few assumptions made about or constraints applied to your data representation, but all persisted data must either be of a fixed size when persisted, or include an unambiguous delimiter (like a null terminator for a string). This means that you are charged with managing things like binary format migrations and validation. Binary serialization can get pretty complicated, depending on the data structures involved. Fortunately, there are some commonly used patterns and library features that help manage this complexity.

Mapper method

A pattern that works well in practice is to give your struct a mapper method that returns a consistent Mapper for its data, and then use that method to expose read and write methods.

import (
	"encoding/binary"
	bin "github.com/saylorsolutions/binmap"
	"io"
)

type User struct {
	username string
}

func (u *User) mapper() bin.Mapper {
	return bin.NullTermString(&u.username)
}

func (u *User) Read(r io.Reader) error {
	return u.mapper().Read(r, binary.BigEndian)
}

func (u *User) Write(w io.Writer) error {
	return u.mapper().Write(w, binary.BigEndian)
}

Mapper Sequence

The previous pattern can be extended to map more fields with MapSequence. This provides a tremendous level of flexibility, since the result of MapSequence is itself a Mapper.

import (
	"encoding/binary"
	bin "github.com/saylorsolutions/binmap"
	"io"
)

type User struct {
	id           uint64
	username     string
	passwordHash []byte
}

func (u *User) mapper() bin.Mapper {
	return bin.MapSequence(
		bin.Int(&u.id),
		bin.NullTermString(&u.username),
		bin.DynamicSlice(&u.passwordHash, bin.Byte),
	)
}

func (u *User) Read(r io.Reader) error {
	return u.mapper().Read(r, binary.BigEndian)
}

func (u *User) Write(w io.Writer) error {
	return u.mapper().Write(w, binary.BigEndian)
}

Mapper of Mappers

Once the previous patterns have been established, extensions may be made for additional types within your data. Types included in your top-level structure can themselves have a mapper method that specifies how they are binary mapped.

Note that the use of LenSlice is an arbitrary choice, and not a requirement of embedding slices of types in other types. It's generally preferred to use DynamicSlice unless you're encoding the length of a slice as a field in your struct, or you always know the length of a slice ahead of time.

package main

import (
	"encoding/binary"
	bin "github.com/saylorsolutions/binmap"
	"io"
)

type Contact struct {
	email          string
	allowMarketing bool
}

func (c *Contact) mapper() bin.Mapper {
	return bin.MapSequence(
		bin.FixedString(&c.email, 128),
		bin.Bool(&c.allowMarketing),
	)
}

type User struct {
	id           uint64
	username     string
	passwordHash []byte
	numContacts  uint16
	contacts     []Contact
}

func (u *User) mapper() bin.Mapper {
	return bin.MapSequence(
		bin.Int(&u.id),
		bin.NullTermString(&u.username),
		bin.DynamicSlice(&u.passwordHash, bin.Byte),
		bin.LenSlice(&u.contacts, &u.numContacts, func(c *Contact) bin.Mapper {
			return c.mapper()
		}),
	)
}

func (u *User) Read(r io.Reader) error {
	return u.mapper().Read(r, binary.BigEndian)
}

func (u *User) Write(w io.Writer) error {
	return u.mapper().Write(w, binary.BigEndian)
}

This makes reading a struct from a binary source trivial, with a single error to handle regardless of the mapping logic expressed.

func ReadUser(r io.Reader) (*User, error) {
	u := new(User)
	if err := u.Read(r); err != nil {
		return nil, err
	}
	return u, nil
}

Validated read

Input validation is important, especially in cases where changes in persisted data could lead to changes to a struct's internal, unexposed state. This can easily be added in the Read and Write methods added above to ensure that business rule constraints are encoded as part of the persistence logic.

var ErrNoContact = errors.New("all users must have a contact")

func (u *User) Read(r io.Reader) error {
	if err := u.mapper().Read(r, binary.BigEndian); err != nil {
		return err
	}
	// Reject payloads that violate the business rule.
	if len(u.contacts) == 0 {
		return ErrNoContact
	}
	return nil
}

func (u *User) Write(w io.Writer) error {
	// Refuse to persist a user that violates the business rule.
	if len(u.contacts) == 0 {
		return ErrNoContact
	}
	return u.mapper().Write(w, binary.BigEndian)
}

Versioned mapping

As mentioned previously, versioned mapping can be very important if the binary representation is expected to change (often or not), since it's effectively a breaking change. This can be handled pretty readily with a little forethought.

import (
	"encoding/binary"
	"errors"
	bin "github.com/saylorsolutions/binmap"
	"io"
)

type version = byte

const (
	v1 version = iota + 1
	v2
)

type User struct {
	username string
}

func (u *User) mapperV1() bin.Mapper {
	return bin.NullTermString(&u.username)
}

func (u *User) mapperV2() bin.Mapper {
	return bin.FixedString(&u.username, 32)
}

func (u *User) mapper() bin.Mapper {
	return bin.Any(
		func(r io.Reader, endian binary.ByteOrder) error {
			var v version
			if err := bin.Byte(&v).Read(r, endian); err != nil {
				return err
			}
			switch v {
			case v1:
				return u.mapperV1().Read(r, endian)
			case v2:
				return u.mapperV2().Read(r, endian)
			default:
				return errors.New("unknown version")
			}
		},
		func(w io.Writer, endian binary.ByteOrder) error {
			var v = v2
			return bin.MapSequence(
				bin.Byte(&v),
				u.mapperV2(),
			).Write(w, endian)
		},
	)
}

func (u *User) Read(r io.Reader) error {
	return u.mapper().Read(r, binary.BigEndian)
}

func (u *User) Write(w io.Writer) error {
	return u.mapper().Write(w, binary.BigEndian)
}

Documentation

Index

Constants

This section is empty.

Variables

var (
	ErrNilReadWrite = errors.New("nil read source or write target")
)
var (
	ErrUnbalancedTable = errors.New("unbalanced data table")
)

Functions

This section is empty.

Types

type AnyComplex added in v0.3.0

type AnyComplex interface {
	complex64 | complex128
}

type AnyFloat

type AnyFloat interface {
	float32 | float64
}

type AnyInt

type AnyInt interface {
	int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
}

type FieldMapper added in v0.3.0

type FieldMapper interface {
	// contains filtered or unexported methods
}

FieldMapper provides the logic necessary to read and write DataTable fields. Created with MapField.

func MapField added in v0.3.0

func MapField[T any](target *[]T, mapFn func(*T) Mapper) FieldMapper

MapField will associate a Mapper to each element in a target slice within a FieldMapper.

type KeyMapper added in v0.3.0

type KeyMapper[K comparable] func(key *K) Mapper

type Mapper

type Mapper interface {
	// Read data from a binary source.
	Read(r io.Reader, endian binary.ByteOrder) error
	// Write data to a binary target.
	Write(w io.Writer, endian binary.ByteOrder) error
}

Mapper is any procedure that knows how to read from and write to binary data, given an endianness policy.

func Any

func Any(read ReadFunc, write WriteFunc) Mapper

Any is provided to make it easy to create a custom Mapper for any given type.

func Bool

func Bool(b *bool) Mapper

Bool will map a single boolean.

func Byte

func Byte(b *byte) Mapper

Byte will map a single byte.

func Complex added in v0.3.0

func Complex[T AnyComplex](target *T) Mapper

Complex will map a complex64/128 number.

func DataTable added in v0.3.0

func DataTable(length *uint32, mappers ...FieldMapper) Mapper

DataTable will construct a Mapper that orchestrates reading and writing a data table. This is very helpful for situations where the caller is using the array of structs to struct of arrays optimization, and wants to persist this table. Each FieldMapper will be used to read a single field element, making up a DataTable row, before returning to the first FieldMapper to start the next row. The length parameter will be set during read, and read during write to ensure that all mapped fields are of the same length.

func DynamicSlice

func DynamicSlice[E any](target *[]E, mapVal func(*E) Mapper) Mapper

DynamicSlice tries to accomplish a happy medium between LenSlice and Slice. A uint32 will be used to store the size of the given slice, but it's not necessary to read this from a field; rather, it will be discovered at write time. This means that the size will be available at read time by first reading the uint32 with LenSlice, without requiring a caller-provided field. In a scenario where a slice in a struct is used, this makes it easier to read and write because the struct doesn't need to store the size in a field.

func FixedBytes

func FixedBytes[S SizeType](buf *[]byte, length S) Mapper

FixedBytes maps a byte slice of a known length.

func FixedString

func FixedString(s *string, length int) Mapper

FixedString will map a string with a max length that is known ahead of time. The target string will not contain any trailing zero bytes if the encoded string is less than the space allowed.

func Float

func Float[T AnyFloat](f *T) Mapper

Float will map any floating point value.

func Int

func Int[T AnyInt](i *T) Mapper

Int will map any integer, excluding int.

func LenBytes

func LenBytes[S SizeType](buf *[]byte, length *S) Mapper

LenBytes is used for situations where an arbitrarily sized byte slice is encoded after its length. This mapper will read the length, and then length number of bytes into a byte slice. The mapper will write the length and bytes in the same order.

func LenSlice

func LenSlice[E any, S SizeType](target *[]E, count *S, mapVal func(*E) Mapper) Mapper

LenSlice is for situations where a slice is encoded with its length prepended. Otherwise, this behaves exactly like Slice.

func Map added in v0.3.0

func Map[K comparable, V any](target *map[K]V, keyMapper KeyMapper[K], valMapper ValMapper[V]) Mapper

func MapSequence

func MapSequence(mappings ...Mapper) Mapper

MapSequence creates a Mapper that uses each given Mapper in order.

func NullTermString

func NullTermString(s *string) Mapper

NullTermString will read and write a null-byte terminated string. The string should not contain a null terminator; one will be added on write.

func OverrideEndian added in v0.3.0

func OverrideEndian(m Mapper, endian binary.ByteOrder) Mapper

OverrideEndian will override the endian settings for a single operation. This is useful for UTF-16 strings which are often read/written little-endian.

func Size

func Size[S SizeType](size *S) Mapper

Size maps any value that can reasonably be used to express a size.

func Slice

func Slice[E any, S SizeType](target *[]E, count S, mapVal func(*E) Mapper) Mapper

Slice will produce a Mapper that reads and writes a slice of values. The slice length must be known ahead of time. The mapVal function will be used to create a Mapper for each element of the slice, and the returned Mapper will orchestrate the slice construction according to the given function.

func Uni16FixedString added in v0.3.0

func Uni16FixedString(s *string, wcharlen int) Mapper

Uni16FixedString is the same as FixedString, except that it works with UTF-16 strings.

func Uni16NullTermString added in v0.3.0

func Uni16NullTermString(s *string) Mapper

Uni16NullTermString is the same as NullTermString, except that it works with UTF-16 strings.
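
For readers unfamiliar with UTF-16 payloads, this standard-library sketch shows how a Go string turns into little-endian UTF-16 bytes and back. encodeUTF16LE and decodeUTF16LE are hypothetical helpers, not functions of this package, and the sketch covers neither padding nor terminators.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// encodeUTF16LE (hypothetical) converts s to UTF-16 code units and lays each
// unit out as two little-endian bytes.
func encodeUTF16LE(s string) []byte {
	units := utf16.Encode([]rune(s))
	out := make([]byte, 2*len(units))
	for i, u := range units {
		binary.LittleEndian.PutUint16(out[2*i:], u)
	}
	return out
}

// decodeUTF16LE (hypothetical) reverses encodeUTF16LE. A trailing odd byte,
// if any, is ignored.
func decodeUTF16LE(b []byte) string {
	units := make([]uint16, len(b)/2)
	for i := range units {
		units[i] = binary.LittleEndian.Uint16(b[2*i:])
	}
	return string(utf16.Decode(units))
}

func main() {
	raw := encodeUTF16LE("hi")
	fmt.Printf("% x\n", raw)
	fmt.Println(decodeUTF16LE(raw))
}
```

Characters outside the Basic Multilingual Plane take two code units, so a width counted in 16-bit characters is not the same as a count of visible characters.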

func Uvarint added in v0.3.0

func Uvarint(target *uint64) Mapper

Uvarint encodes 16, 32, or 64-bit unsigned integers as a variable length integer. This is generally more efficient than reading/writing the full byte length.

func Varint added in v0.3.0

func Varint(target *int64) Mapper

Varint encodes 16, 32, or 64-bit signed integers as a variable length integer. This is generally more efficient than reading/writing the full byte length.
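
To see where the savings come from, the following sketch measures varint sizes using the standard library's encoding/binary functions. uvarintLen and varintLen are hypothetical helper names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uvarintLen (hypothetical) returns how many bytes the unsigned varint
// encoding of v occupies.
func uvarintLen(v uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, v)
}

// varintLen (hypothetical) does the same for the signed, zig-zag encoding.
func varintLen(v int64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutVarint(buf, v)
}

func main() {
	for _, v := range []uint64{1, 300, 1 << 32, ^uint64(0)} {
		fmt.Printf("uvarint %d -> %d bytes\n", v, uvarintLen(v))
	}
	for _, v := range []int64{-1, 63, 64} {
		fmt.Printf("varint %d -> %d bytes\n", v, varintLen(v))
	}
}
```

Small magnitudes take one or two bytes instead of eight, while the very largest 64-bit values take up to ten, so the encoding pays off when most values are small.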

type ReadFunc

type ReadFunc func(r io.Reader, endian binary.ByteOrder) error

ReadFunc is a function that reads data from a binary source.

type SizeType

type SizeType interface {
	uint8 | uint16 | uint32 | uint64
}

type ValMapper added in v0.3.0

type ValMapper[V any] func(val *V) Mapper

type WriteFunc

type WriteFunc func(w io.Writer, endian binary.ByteOrder) error

WriteFunc is a function that writes data to a binary target.

Directories

Path Synopsis
example
