kodr
Random Linear Network Coding
Motivation
For some time now I've been exploring Random Linear Network Coding, & while looking for implementations of RLNC-based schemes I didn't find a stable & maintained library in Golang. That made me take up this venture of writing kodr, which I plan to maintain & keep updated as I learn about new RLNC schemes & possible application domains of RLNC.
For each RLNC-based scheme I implement in kodr i.e. {full RLNC, on-the-fly RLNC, sparse RLNC, generational RLNC, Caterpillar RLNC ...}, I write a respective example showing how the exposed API can be used for encoding, recoding & decoding binary data.
Background
For learning about RLNC you may want to go through my post. In kodr, I perform all finite field operations on GF(2**8), which is seemingly a good fit because I consider each byte to be a symbol of RLNC, i.e. one of the 256 finite field elements. Speaking of performance & memory consumption, GF(2**8) strikes a good balance between the two. Working on a higher field indeed decreases the chance of ( randomly ) generating linearly dependent pieces, but requires more costly computation, & if finite field operations are implemented in terms of addition/ multiplication tables then memory consumption increases to a great extent. On the other hand, working on GF(2) increases the chance of generating linearly dependent pieces, though with sophisticated designs like Fulcrum codes it can prove beneficial. Another point: the larger the finite field, the higher the cost of storing random coding vectors, because each element of a coding vector ( read coding coefficient ) is a finite field element.
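To make the field arithmetic concrete, here's a minimal, self-contained sketch of GF(2**8) operations: addition is plain XOR & multiplication is a carry-less multiply reduced modulo an irreducible polynomial. The polynomial 0x11d used below is only an illustrative choice; kodr's internal implementation may use a different polynomial and/ or lookup tables.

package main

import "fmt"

// gfAdd adds two GF(2**8) elements; in characteristic 2 this is XOR
func gfAdd(a, b byte) byte { return a ^ b }

// gfMul multiplies two GF(2**8) elements via Russian-peasant style
// carry-less multiplication, reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11d)
func gfMul(a, b byte) byte {
    var p byte
    for i := 0; i < 8; i++ {
        if b&1 == 1 {
            p ^= a
        }
        carry := a & 0x80
        a <<= 1
        if carry != 0 {
            a ^= 0x1d
        }
        b >>= 1
    }
    return p
}

func main() {
    // each byte of a piece is one symbol; coding mixes pieces by
    // multiplying them with random coefficients & XOR-ing the products
    fmt.Printf("0x53 + 0xca = 0x%02x\n", gfAdd(0x53, 0xca))
    fmt.Printf("0x53 * 0xca = 0x%02x\n", gfMul(0x53, 0xca))
}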
Installation
Assuming you have Golang (>=1.16) installed, add kodr as a dependency to your project ( which uses Go modules for dependency management ) by executing
go get -u github.com/itzmeanjan/kodr/...
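After that, kodr packages can be imported & used directly. Here's a minimal sketch, assuming the import paths mirror the example directories used later in this README ( full, systematic ):

package main

import (
    "log"

    "github.com/itzmeanjan/kodr/full"
)

func main() {
    data := []byte("hello kodr")
    // split `data` into 2 equal-sized pieces & prepare a full RLNC encoder
    enc, err := full.NewFullRLNCEncoderWithPieceCount(data, 2)
    if err != nil {
        log.Fatal(err)
    }
    _ = enc // ready to produce coded pieces
}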
Testing
After you clone this repo, you may want to run the test cases
go test -v -cover ./...
Benchmarking
I write the required test cases for benchmarking the performance of the {en, re, de}coders of the implemented RLNC schemes, & present results from running them on consumer-grade machines with the following configurations
Intel(R) Core(TM) i3-5005U CPU @ 2.00GHz
Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
Full RLNC
For benchmarking encoder of full RLNC, execute
go test -run=xxx -bench=Encoder ./bench/full/
Encoding speed is ~290MB/s
Looking at full RLNC recoder performance
go test -run=xxx -bench=Recoder ./bench/full/
Recoding speed is ~290MB/s
The decoder benchmark measures each round of full data reconstruction from N-many coded pieces, i.e. how many seconds one full reconstruction takes, on average.
Note: It can be clearly seen that decoding complexity grows very fast when using full RLNC with large data chunks split into many pieces. Decoding a 2MB chunk split into 256 equal-sized byte slices takes ~6s.
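Decoding is essentially Gaussian elimination over the received coding vectors, so its cost grows roughly with the cube of the piece count. The back-of-the-envelope sketch below is my own illustration of that growth model, not a kodr benchmark.

package main

import "fmt"

func main() {
    const base = 64.0 // reference piece count
    for _, n := range []float64{64, 128, 256, 512} {
        r := n / base
        // O(n^3) model: doubling the piece count costs ~8x more decode time
        fmt.Printf("pieces = %3.0f -> relative decode cost ~ %4.0fx\n", n, r*r*r)
    }
}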
Systematic RLNC
Running the benchmark tests on better hardware shows a considerable encoder performance improvement
Average encoding speed ~660MB/s
go test -run=xxx -bench=Encoder ./bench/systematic
The systematic RLNC decoder has an advantage over the full RLNC decoder: it may receive some pieces which are actually uncoded, only augmented to look coded, so it doesn't need to process those pieces; rather it uses the uncoded pieces to decode the other coded pieces faster.
go test -run=xxx -bench=Decoder ./bench/systematic
For decoding a 1MB whole chunk, split into 512 pieces & coded, it takes quite a long time; the growth rate of decoding time is pretty high as piece count keeps increasing. It's better not to increase the piece count too much; instead the piece size can increase, so that we pay a relatively lower decoding cost.
Notice how the whole chunk size increases to 2MB, but with a small piece count the decoding time stays affordable.
Usage
Examples demonstrating how to use the API exposed by kodr for the ( currently ) supported RLNC schemes.
In each walk-through, code snippets are prepended with line numbers, denoting the actual line numbers in the respective file.
Full RLNC
Example: example/full/main.go
Let's start by seeding the random number generator with the current Unix timestamp at nanosecond precision.
22| rand.Seed(time.Now().UnixNano())
I read the kodr logo, which is a PNG file, into an in-memory byte slice of length 3965 & compute its SHA512 hash: 0xee9ec63a713ab67d82e0316d24ea646f7c5fb745ede9c462580eca5f
24| img := path.Join("..", "..", "img", "logo.png")
...
37| log.Printf("SHA512: 0x%s\n\n", hex.EncodeToString(sum))
I decide to split it into 64 pieces ( each of equal length ) & perform full RLNC, resulting in 128 coded pieces.
45| log.Printf("Coding %d pieces together\n", pieceCount)
46| enc, err := full.NewFullRLNCEncoderWithPieceCount(data, pieceCount)
...
57| log.Printf("Coded into %d pieces\n", codedPieceCount)
Then I randomly drop 32 coded pieces, simulating that they are lost in transit. I've 96 remaining pieces, which I recode into 192 coded pieces. I randomly shuffle those 192 coded pieces to simulate that their reception order can vary arbitrarily ( a sketch of that shuffle step follows the excerpt below ). Then I randomly drop 96 pieces, leaving me with the other 96 pieces.
59| for i := 0; i < int(droppedPieceCount); i++ {
...
98| log.Printf("Dropped %d pieces, remaining %d pieces\n\n", recodedPieceCount/2, len(recodedPieces))
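The shuffle step itself isn't visible in the excerpt above; a minimal, hypothetical sketch of it, assuming the recoded pieces live in the recodedPieces slice, could use the standard library's rand.Shuffle:

// hypothetical sketch of the shuffle step; the example file may differ
rand.Shuffle(len(recodedPieces), func(i, j int) {
    recodedPieces[i], recodedPieces[j] = recodedPieces[j], recodedPieces[i]
})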
Now I create a decoder which expects to receive 64 linearly independent pieces so that it can fully reconstruct the kodr logo. I've 96 pieces, with no idea whether they're linearly independent or not, still I start decoding.
101| dec := full.NewFullRLNCDecoder(pieceCount)
Courageously I just add 64 coded pieces into the decoder & hope all of them are linearly independent, which turns out to be the case.
This is the power of RLNC, where random coding coefficients do the same job as other specially crafted codes.
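Working over GF(2**8) is what makes this gamble safe: the probability that 64 uniformly random coded pieces are all linearly independent is roughly 99.6%. The short sketch below computes that probability from the standard product formula; it's an illustration of the math, not part of kodr.

package main

import (
    "fmt"
    "math"
)

func main() {
    // P(64 random vectors over GF(256) are linearly independent)
    //   = prod_{i=1..64} (1 - 256^-i)
    p := 1.0
    for i := 1; i <= 64; i++ {
        p *= 1 - math.Pow(256, -float64(i))
    }
    fmt.Printf("probability of linear independence ~ %.4f\n", p)
}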
There's just one catch: the decoded data's length is more than 3965 bytes.
124| log.Printf("Decoded into %d bytes\n", len(decoded_data)) // 3968 bytes
This is due to the fact that I asked kodr to split the original 3965 bytes into 64 pieces & code them together, but 3965 is not evenly divisible by 64, which is why kodr appends 3 extra bytes at the end, making it 3968 bytes. This way kodr splits the whole image into 64 equal-sized pieces, where each piece is 62 bytes.
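The padding arithmetic can be reproduced in a few lines; this is only an illustration of the calculation, not kodr's internal code.

package main

import "fmt"

func main() {
    dataLen, pieceCount := 3965, 64
    pieceSize := (dataLen + pieceCount - 1) / pieceCount // ceil(3965 / 64) = 62
    paddedLen := pieceSize * pieceCount                  // 3968
    fmt.Printf("piece size = %d bytes, padded length = %d bytes, padding = %d bytes\n",
        pieceSize, paddedLen, paddedLen-dataLen)
}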
So, SHA512-ing the first 3965 bytes of the decoded data slice must produce 0xee9ec63a713ab67d82e0316d24ea646f7c5fb745ede9c462580eca5f, and indeed it does.
131| log.Printf("First %d bytes of decoded data matches original %d bytes\n", len(data), len(data))
...
137| log.Printf("SHA512: 0x%s\n", hex.EncodeToString(sum))
Finally I write the reconstructed image back into a PNG file.
139| if err := os.WriteFile("recovered.png", decoded_data[:len(data)], 0o644); err != nil {
...
To run this example
# assuming you're in root directory of `kodr`
cd example/full
go run main.go
This should generate example/full/recovered.png, which is exactly the same as img/logo.png.
Systematic RLNC
Example: example/systematic/main.go
I start by seeding the random number generator with the device's nanosecond-precision time
46| rand.Seed(time.Now().UnixNano())
I define a structure for storing randomly generated values, which I serialise to JSON.
17| type Data struct {
      FieldA uint    `json:"fieldA"`
      FieldB float64 `json:"fieldB"`
      FieldC bool    `json:"fieldC"`
      FieldD []byte  `json:"fieldD"`
22| }
To fill up this structure, I invoke a random data generator
48| data := randData()
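The serialisation step itself isn't excerpted here; presumably it's a plain json.Marshal call along these lines ( a sketch, not the exact example code ), producing the m_data byte slice that gets handed to the encoder below.

// hypothetical sketch of the JSON serialisation step
m_data, err := json.Marshal(&data)
if err != nil { /* exit */ }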
I calculate the SHA512 hash of the JSON serialised data, which turns out to be 0x25c37651f7a567963a884ef04d7dc6df0901ab58ca28aa3eaf31097e5d9155d4 in this run.
56| hasher := sha512.New512_256()
...
59| log.Printf("SHA512(original): 0x%s\n", hex.EncodeToString(sum))
I decide to split the serialised data into N-many pieces, each of length 8 bytes.
61| var (
62| pieceSize uint = 1 << 3 // in bytes
63| )
65| enc, err := systematic.NewSystematicRLNCEncoderWithPieceSize(m_data, pieceSize)
66| if err != nil { /* exit */ }
So I've generated 2474 bytes of JSON serialised data, which, after splitting into equal-sized byte slices ( read original pieces ), gives me 310 pieces, i.e. the pieces which are to be coded together. It requires me to append 6 empty bytes, since 8 x 310 - 6 = 2480 - 6 = 2474 bytes
The systematic encoder also informs me that I need to consume 98580 bytes of coded data to reconstruct the original pieces i.e. the original JSON serialised data.
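That 98580-byte figure follows from every coded piece carrying a full-length coding vector along with its payload: with 310 pieces of 8 bytes each, one coded piece is 310 + 8 = 318 bytes, & 310 such pieces are needed, i.e. 310 x 318 = 98580 bytes. A small sketch of that arithmetic ( my own illustration, not kodr's API ):

package main

import "fmt"

func main() {
    pieceCount, pieceSize := 310, 8
    codedPieceSize := pieceCount + pieceSize // coding vector ( one byte per coefficient ) + payload
    total := pieceCount * codedPieceSize
    fmt.Printf("coded piece = %d bytes, %d of them needed = %d bytes\n",
        codedPieceSize, pieceCount, total)
}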
I simulate some pieces being collected, while some are dropped
75| dec := systematic.NewSystematicRLNCDecoder(enc.PieceCount())
76| for {
      c_piece := enc.CodedPiece()
      // simulating piece drop/ loss
      if rand.Intn(2) == 0 {
          continue
      }
      err := dec.AddPiece(c_piece)
      ...
88| }
Note: As these pieces are coded using the systematic encoder, the first N-many pieces ( here N = 310 ) are kept uncoded, though they're augmented to look coded by attaching a coding vector which has only one non-zero element. The coded pieces after that, i.e. from the 311th onward, carry randomly generated coding vectors as usual.
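One way to picture this: a systematic piece's coding vector is just a unit vector, so its payload can be accepted as an original piece without any elimination work. A standalone illustration of that idea ( not kodr's internal representation ):

package main

import "fmt"

// isSystematic reports whether a coding vector selects exactly one
// original piece, i.e. has a single non-zero coefficient equal to 1
func isSystematic(codingVector []byte) bool {
    nonZero := 0
    for _, c := range codingVector {
        if c != 0 {
            if c != 1 {
                return false
            }
            nonZero++
        }
    }
    return nonZero == 1
}

func main() {
    uncoded := []byte{0, 0, 1, 0}  // carries the 3rd original piece as-is
    coded := []byte{17, 0, 203, 5} // random mix of original pieces
    fmt.Println(isSystematic(uncoded), isSystematic(coded)) // true false
}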
I'm able to recover 2480 bytes of serialised data, but notice that the padding is counted. So I strip out the last 6 padding bytes, which results in 2474 bytes of serialised data. Computing SHA512 on the recovered data must produce the same hash as found with the original data.
And it's indeed the same hash 0x25c37651f7a567963a884ef04d7dc6df0901ab58ca28aa3eaf31097e5d9155d4, asserting that the reconstructed data is the same as the original data once the padding bytes are stripped out.
To run this example
pushd example/systematic
go run main.go
popd
Note: Your console log is probably going to be different from mine.
More schemes coming soon!