collatejson

package

Version: v0.0.0-...-d8c7374 | Published: Aug 24, 2017 | License: Apache-2.0 | Imports: 10 | Imported by: 0

Repository

github.com/abhi-bit/indexing

README ¶

The collatejson library, written in Go, provides encoding and decoding functions
to transform JSON text into a binary representation without losing information.
That is,

* binary representation should preserve the sort order, such that sorting
  binary encoded JSON documents must match sorting by functions that parse
  and compare JSON documents.
* it must be possible to get back the original document, in semantically
  correct form, from its binary representation.

Notes:

* items in a property object are sorted by property name before the object
  is compared with another property object.

For API documentation and benchmarking, try:

.. code-block:: bash

    godoc github.com/couchbaselabs/go-collatejson | less
    cd go-collatejson
    go test -test.bench=.

To measure the relative difference in sorting 100K elements using the
encoding/json library and this library, try:

.. code-block:: bash

    go test -test.bench=Sort

examples/* contains reference sort ordering for different json elements.

For known issues refer to `TODO.rst`

Documentation ¶

Overview ¶

Package collatejson supplies encoding and decoding functions to transform JSON text into a binary representation without losing information. That is,

* binary representation should preserve the sort order, such that sorting binary encoded JSON documents must match sorting by functions that parse and compare JSON documents.
* it must be possible to get back the original document, in semantically correct form, from its binary representation.

Notes:

* items in a property object are sorted by property name before they are compared by property value.
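The note on property ordering can be illustrated with a minimal sketch using only the standard library. The helper `sortedKeys` is hypothetical and is not part of this package; it only shows the idea of putting property names into a canonical order before any comparison takes place.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper (not part of collatejson): it returns
// the property names of a decoded JSON object in sorted order, so that two
// objects can be walked property by property in the same sequence.
func sortedKeys(obj map[string]interface{}) []string {
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	obj := map[string]interface{}{"name": "x", "age": 3, "city": "y"}
	fmt.Println(sortedKeys(obj))
}
```

Because Go map iteration order is not defined, sorting the names first is what makes the comparison deterministic.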

Index ¶

Constants
Variables
func DecodeFloat(code, text []byte) []byte
func DecodeInt(code, text []byte) (int, []byte)
func DecodeLD(code, text []byte) []byte
func DecodeSD(code, text []byte) []byte
func EncodeFloat(text, code []byte) []byte
func EncodeInt(text, code []byte) []byte
func EncodeLD(text, code []byte) []byte
func EncodeSD(text, code []byte) []byte
type Codec
- func NewCodec(propSize int) *Codec
type Integer
- func (i *Integer) ConvertToScientificNotation(val int64) (string, error)
- func (i *Integer) TryConvertFromScientificNotation(val []byte) (ret []byte)
type Length
type Missing
- func (m Missing) Equal(n string) bool

Constants ¶

const (
	PLUS  = 43
	MINUS = 45
	LT    = 60
	GT    = 62
	DOT   = 46
	ZERO  = 48
)

Constants used in text representation of basic data types.

const (
	Terminator byte = iota
	TypeMissing
	TypeNull
	TypeFalse
	TypeTrue
	TypeNumber
	TypeString
	TypeLength
	TypeArray
	TypeObj
)

While encoding a JSON data element, whether basic or composite, the encoded string is prefixed with a type byte. `Terminator` terminates the encoded datum.
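Since the type bytes are declared with iota, their numeric order also fixes the relative order of values of different types under a plain byte comparison. The sketch below copies the constant block and uses a hypothetical `frame` helper with a made-up payload; the real payload encoding is produced by the package and is not reproduced here.

```go
package main

import (
	"bytes"
	"fmt"
)

// Type prefixes, copied from the package's constant block.
const (
	Terminator byte = iota
	TypeMissing
	TypeNull
	TypeFalse
	TypeTrue
	TypeNumber
	TypeString
	TypeLength
	TypeArray
	TypeObj
)

// frame is a hypothetical helper: type byte, then payload, then Terminator.
// The payload bytes used here are placeholders, not real collatejson output.
func frame(typ byte, payload string) []byte {
	out := []byte{typ}
	out = append(out, payload...)
	return append(out, Terminator)
}

// less reports whether a sorts before b under byte-wise comparison.
func less(a, b []byte) bool {
	return bytes.Compare(a, b) < 0
}

func main() {
	null := frame(TypeNull, "")
	num := frame(TypeNumber, "123")
	str := frame(TypeString, "abc")
	fmt.Println(less(null, num), less(num, str))
}
```

The first byte alone decides any comparison between values of different types, so null sorts before every number and every number before every string.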

const MinBufferSize = 16

MinBufferSize is the minimum size of the target buffer used to encode or decode.

const MissingLiteral = Missing("~[]{}falsenilNA~")

MissingLiteral is a special string that denotes a missing item. IMPORTANT: we are assuming that MissingLiteral will not occur in the keyspace.

Variables ¶

var ErrLenPrefixUnsupported = errors.New("arrayLenPrefix is unsupported")

var ErrNotAnArray = errors.New("not an array")

var ErrorNumberType = errors.New("collatejson.numberType")

ErrorNumberType means configured number type is not supported by codec.

var ErrorOutputLen = errors.New("collatejson.outputLen")

ErrorOutputLen means output buffer has insufficient length.

var ErrorSuffixDecoding = errors.New("collatejson.suffixDecoding")

error codes

Functions ¶

func DecodeFloat ¶

func DecodeFloat(code, text []byte) []byte

DecodeFloat complements EncodeFloat; it returns `exponent` and `mantissa` in text format.

func DecodeInt ¶

func DecodeInt(code, text []byte) (int, []byte)

DecodeInt complements EncodeInt; it returns the integer as text, which can be converted to an integer value using strconv.Atoi(return_value).
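The conversion step mentioned above can be sketched as follows. `textToInt` is a hypothetical wrapper, and the input literal stands in for the text slice a decoder would return; no collatejson function is called here.

```go
package main

import (
	"fmt"
	"strconv"
)

// textToInt is a hypothetical wrapper: it converts integer text held in a
// byte slice to an int. strconv.Atoi takes a string, so the slice is
// converted first.
func textToInt(text []byte) (int, error) {
	return strconv.Atoi(string(text))
}

func main() {
	// Stand-in for decoded text; not produced by the package in this sketch.
	n, err := textToInt([]byte("-42"))
	fmt.Println(n, err)
}
```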

func DecodeLD ¶

func DecodeLD(code, text []byte) []byte

DecodeLD complements EncodeLD; it returns the decimal as text, which can be converted to a float64 using strconv.ParseFloat(return_value, 64).

func DecodeSD ¶

func DecodeSD(code, text []byte) []byte

DecodeSD complements EncodeSD; it returns the decimal as text, which can be converted to a float64 using strconv.ParseFloat(return_value, 64).

func EncodeFloat ¶

func EncodeFloat(text, code []byte) []byte

EncodeFloat encodes floating point numbers such that their natural order is preserved as the lexicographic order of their representation. Additionally, it must be possible to get back the natural representation from the lexical representation.

A floating point number f takes a mantissa m ∈ [1/10 , 1) and an integer exponent e such that f = (10^e) * ±m.

encoding −0.1 × 10^11    - --7888+
encoding −0.1 × 10^10    - --7898+
encoding -1.4            - -885+
encoding -1.3            - -886+
encoding -1              - -88+
encoding -0.123          - 0876+
encoding -0.0123         - +1876+
encoding -0.001233       - +28766+
encoding -0.00123        - +2876+
encoding 0               0
encoding +0.00123        + -7123-
encoding +0.001233       + -71233-
encoding +0.0123         + -8123-
encoding +0.123          + 0123-
encoding +1              + +11-
encoding +1.3            + +113-
encoding +1.4            + +114-
encoding +0.1 × 10^10    + ++2101-
encoding +0.1 × 10^11    + ++2111-

func EncodeInt ¶

func EncodeInt(text, code []byte) []byte

EncodeInt encodes integers such that their natural order is preserved as the lexicographic order of their representation. Additionally, it must be possible to get back the natural representation from the lexical representation.

Input `text` is also in textual representation, that is, strconv.Atoi(text) is the actual integer that is encoded.

Zero is encoded as '0'

func EncodeLD ¶

func EncodeLD(text, code []byte) []byte

EncodeLD encodes large decimals, that is, values that are greater than or equal to +1.0 or less than or equal to -1.0, such that their natural order is preserved as the lexicographic order of their representation. Additionally, it must be possible to get back the natural representation from the lexical representation.

Input `text` is also in textual representation, that is, strconv.ParseFloat(text, 64) is the actual value that is encoded.

encoding -100.5         --68994>
encoding -10.5          --7>
encoding -3.145         -3854>
encoding -3.14          -385>
encoding -1.01          -198>
encoding -1             -1>
encoding -0.0001233     -09998766>
encoding -0.000123      -0999876>
encoding +0.000123      >0000123-
encoding +0.0001233     >00001233-
encoding +1             >1-
encoding +1.01          >101-
encoding +3.14          >314-
encoding +3.145         >3145-
encoding +10.5          >>2105-
encoding +100.5         >>31005-

func EncodeSD ¶

func EncodeSD(text, code []byte) []byte

EncodeSD encodes small decimals, that is, values that are greater than -1.0 and less than +1.0, such that their natural order is preserved as the lexicographic order of their representation. Additionally, it must be possible to get back the natural representation from the lexical representation.

Input `text` is also in textual representation, that is, strconv.ParseFloat(text, 64) is the actual value that is encoded.

encoding -0.9995    -0004>
encoding -0.999     -000>
encoding -0.0123    -9876>
encoding -0.00123   -99876>
encoding -0.0001233 -9998766>
encoding -0.000123  -999876>
encoding +0.000123  >000123-
encoding +0.0001233 >0001233-
encoding +0.00123   >00123-
encoding +0.0123    >0123-
encoding +0.999     >999-
encoding +0.9995    >9995-

Caveats:

-0.0, 0.0 and +0.0 must be filtered out as integer ZERO `0`.
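The rows of the EncodeSD table are consistent with a simple rule: a positive value becomes `>` plus its fraction digits plus `-`, and a negative value becomes `-` plus the nines' complement of its fraction digits plus `>`. The sketch below is a re-derivation from the table only. `encodeSmallDecimal` is a hypothetical function, not the package's implementation, and it assumes input with an explicit sign and a leading `0.`.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeSmallDecimal is a hypothetical re-derivation of the table rows.
// text must look like "+0.123" or "-0.123".
func encodeSmallDecimal(text string) string {
	digits := strings.TrimPrefix(text[1:], "0.")
	if text[0] == '+' {
		return ">" + digits + "-"
	}
	// Negative values: the nines' complement makes larger magnitudes
	// sort earlier under byte-wise comparison.
	comp := []byte(digits)
	for i, d := range comp {
		comp[i] = '9' - (d - '0')
	}
	return "-" + string(comp) + ">"
}

func main() {
	// Inputs listed in ascending numeric order, as in the table.
	inputs := []string{"-0.9995", "-0.999", "-0.0123", "+0.0123", "+0.999", "+0.9995"}
	prev := ""
	for _, in := range inputs {
		code := encodeSmallDecimal(in)
		// Go compares strings byte-wise, which is the lexicographic order.
		fmt.Println(in, code, prev < code)
		prev = code
	}
}
```

With MINUS = 45, ZERO = 48 and GT = 62, a code starting with `-` sorts before the integer zero `0`, which in turn sorts before a code starting with `>`.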

Types ¶

type Codec ¶

type Codec struct {
	// contains filtered or unexported fields
}

Codec structure

func NewCodec ¶

func NewCodec(propSize int) *Codec

NewCodec creates a new codec object and returns a reference to it.

func (*Codec) Decode ¶

func (codec *Codec) Decode(code, text []byte) ([]byte, error)

Decode decodes a slice of bytes into a JSON string and returns it as a slice of bytes. `text` is the output buffer for decoding and is expected to have enough capacity, at least 3x the size of input `code` and > MinBufferSize.

func (*Codec) Encode ¶

func (codec *Codec) Encode(text, code []byte) ([]byte, error)

Encode encodes JSON documents into an order-preserving binary representation. `code` is the output buffer for encoding and is expected to have enough capacity, at least 3x the size of input `text` and > MinBufferSize.
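The sizing rule for the output buffer can be sketched as a small helper. `newOutputBuffer` is hypothetical, and allocating the slice with zero length and the computed capacity is an assumption about how the buffer is meant to be passed; only the constant value 16 comes from the package documentation.

```go
package main

import "fmt"

// MinBufferSize mirrors the value documented by the package.
const MinBufferSize = 16

// newOutputBuffer is a hypothetical helper: it allocates a buffer whose
// capacity is at least 3x the input length and strictly greater than
// MinBufferSize. Zero length with the computed capacity is an assumption.
func newOutputBuffer(inputLen int) []byte {
	size := 3 * inputLen
	if size <= MinBufferSize {
		size = MinBufferSize + 1
	}
	return make([]byte, 0, size)
}

func main() {
	doc := []byte(`{"a": [1, 2, 3]}`)
	buf := newOutputBuffer(len(doc))
	fmt.Println(len(doc), cap(buf))
}
```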

func (*Codec) EncodeN1QLValue ¶

func (codec *Codec) EncodeN1QLValue(val n1ql.Value, buf []byte) (bs []byte, err error)

The caller is responsible for providing a sufficiently sized buffer; otherwise it may panic.

func (*Codec) ExplodeArray ¶

func (codec *Codec) ExplodeArray(code []byte, tmp []byte) ([][]byte, error)

func (*Codec) JoinArray ¶

func (codec *Codec) JoinArray(vals [][]byte, code []byte) ([]byte, error)

func (*Codec) NumberType ¶

func (codec *Codec) NumberType(what string)

NumberType chooses type of encoding / decoding for JSON numbers. Can be "float64", "int64", "decimal". Default is "float64"

func (*Codec) ReverseCollate ¶

func (codec *Codec) ReverseCollate(code []byte, desc []bool) []byte

ReverseCollate reverses the bits in an encoded byte stream based on the fields specified as desc. Calling reverse on an already reversed stream gives back the original stream.
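One way to read "reverses the bits" is a bitwise complement of each byte; that reading is an assumption, and the per-field `desc` selection is not modelled here. Under that assumption, the sketch below shows the two properties the description relies on: applying the operation twice restores the input, and byte-wise order is inverted for codes that differ before either one ends. `flip` is a hypothetical function.

```go
package main

import (
	"bytes"
	"fmt"
)

// flip is a hypothetical stand-in for reversing a whole encoded field:
// it complements every bit of every byte.
func flip(code []byte) []byte {
	out := make([]byte, len(code))
	for i, b := range code {
		out[i] = ^b
	}
	return out
}

// less reports whether a sorts before b under byte-wise comparison.
func less(a, b []byte) bool {
	return bytes.Compare(a, b) < 0
}

// same reports whether a and b hold identical bytes.
func same(a, b []byte) bool {
	return bytes.Equal(a, b)
}

func main() {
	a, b := []byte("apple"), []byte("berry")
	fmt.Println(less(a, b))             // ascending order before flipping
	fmt.Println(less(flip(b), flip(a))) // order is inverted after flipping
	fmt.Println(same(flip(flip(a)), a)) // flipping twice restores the input
}
```

When one code is a strict prefix of the other, complementing alone does not invert the order, so this sketch only covers codes that differ at some byte position.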

func (*Codec) SortbyArrayLen ¶

func (codec *Codec) SortbyArrayLen(what bool)

SortbyArrayLen sorts array by length before sorting by array elements. Use `false` to sort only by array elements. Default is `true`.
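The difference between the two settings can be seen with plain integer slices. Both comparison functions below are hypothetical illustrations of the two orderings and do not operate on encoded data.

```go
package main

import "fmt"

// compareElems orders arrays element by element; on a common prefix the
// shorter array sorts first.
func compareElems(a, b []int) int {
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] < b[i] {
			return -1
		}
		if a[i] > b[i] {
			return 1
		}
	}
	return len(a) - len(b)
}

// compareLenFirst orders arrays by length, then element by element.
func compareLenFirst(a, b []int) int {
	if len(a) != len(b) {
		return len(a) - len(b)
	}
	return compareElems(a, b)
}

func main() {
	x, y := []int{3}, []int{1, 2}
	fmt.Println(compareLenFirst(x, y) < 0) // shorter array first
	fmt.Println(compareElems(x, y) < 0)    // 3 > 1, so x sorts after y
}
```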

func (*Codec) SortbyPropertyLen ¶

func (codec *Codec) SortbyPropertyLen(what bool)

SortbyPropertyLen sorts property objects by length before sorting by property items. Use `false` to sort only by property items. Default is `true`.

func (*Codec) UseMissing ¶

func (codec *Codec) UseMissing(what bool)

UseMissing makes the codec interpret the special string MissingLiteral and encode it as TypeMissing. Default is `true`.

type Integer ¶

type Integer struct{}

func (*Integer) ConvertToScientificNotation ¶

func (i *Integer) ConvertToScientificNotation(val int64) (string, error)

ConvertToScientificNotation formats an int64 in scientific notation. Examples:

75284 converts to 7.5284e+04
1200000 converts to 1.200000e+06
-612988654 converts to -6.12988654e+08

This is used in the encode path.
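The three documented conversions can be reproduced with a short standard-library sketch. `toScientific` is a hypothetical function written to match those examples; it is not the package's code, and its output for single-digit values is an assumption since the documentation gives no such example.

```go
package main

import (
	"fmt"
	"strconv"
)

// toScientific is a hypothetical re-creation of the documented examples:
// all digits are kept, a point follows the first digit, and the exponent
// is the number of remaining digits, printed with at least two places.
func toScientific(val int64) string {
	digits := strconv.FormatInt(val, 10)
	sign := ""
	if digits[0] == '-' {
		sign, digits = "-", digits[1:]
	}
	exp := len(digits) - 1
	mantissa := digits[:1]
	if exp > 0 {
		mantissa += "." + digits[1:]
	}
	// Single-digit input yields e.g. "7e+00"; that form is an assumption.
	return fmt.Sprintf("%s%se+%02d", sign, mantissa, exp)
}

func main() {
	for _, v := range []int64{75284, 1200000, -612988654} {
		fmt.Println(v, "converts to", toScientific(v))
	}
}
```

Working on the formatted digits avoids negating the value, so the smallest int64 does not overflow.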

func (*Integer) TryConvertFromScientificNotation ¶

func (i *Integer) TryConvertFromScientificNotation(val []byte) (ret []byte)

If the value is a float, TryConvertFromScientificNotation returns it in e notation. If it is an integer, it converts it from e notation to standard notation. This is used in the decode path.

type Length ¶

type Length int64

Length is an internal type used for prefixing length of arrays and properties.

type Missing ¶

type Missing string

Missing denotes a special type for an item that evaluates to _nothing_.

func (Missing) Equal ¶

func (m Missing) Equal(n string) bool

Equal checks whether n is MissingLiteral.
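A minimal sketch of the Missing type and its check follows. The type and constant are copied from the declarations on this page; the body of `Equal` is an assumption based on the one-line description and may differ from the package's code.

```go
package main

import "fmt"

// Missing and MissingLiteral are copied from the package declarations.
type Missing string

const MissingLiteral = Missing("~[]{}falsenilNA~")

// Equal is an assumed implementation of the documented behaviour:
// it reports whether n is the MissingLiteral string.
func (m Missing) Equal(n string) bool {
	return n == string(MissingLiteral)
}

func main() {
	fmt.Println(MissingLiteral.Equal("~[]{}falsenilNA~"))
	fmt.Println(MissingLiteral.Equal("null"))
}
```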

Directories ¶

Path	Synopsis
tools
checkfiles
validate
