lexnum

package module
v0.0.0-...-460eeb1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 16, 2014 License: MIT Imports: 2 Imported by: 4

README

lexnum

lexnum provides a lexicographic encoding of numbers. It is based off of the ideas presented in Peter Seymour's Efficient Lexicographic Encoding of Numbers. The paper's original source can be found at http://www.zanopha.com/docs/elen.pdf, but it is re-hosted here for posterity.

Numbers are ordered. That's part of their whole thing, what with them being numbers and all. 1 comes before 2, and 2 comes before 3, and so on, and so forth. If you sort a list of integers like [1, 9, 4, 10, 13, 31] you'll get them back in their normal ordering: [1, 4, 9, 10, 13, 31]. If each of those were to be strings, such that the input is ['1', '9', '4', '10', '13', '31'], sorting them lexically would not produce a proper numerical sorting, you would wind up with ['1', '10', '13', '31', '4', '9']. That is quite annoying when doing things like appending numbers to identical file names or basically anything where you wish to put a number in the middle of what is otherwise a string.

You could always just pad your numbers with zeroes, so the prior example would have an input more like ['01', '09', '04', '10', '13', '31']. This works fine, but it requires that you know something about your input in advance. Namely, it requires that you know the range of the input, which may not always be the case. More subtly, it also requires that you know that you're only dealing with positive integers; even with zero padding, negative numbers would lexically sort backwards (e.g., '+1' lexically precedes '+2', which makes sense because 1 is less than 2, but it's also the case that '-1' precedes '-2', which makes no sense, since -2 is less than -1). Doubly subtle is the fact that the only reason a negative number string precedes a positive number string is that the negative character precedes the zero character in the ascii table. Triply subtle is that if you just always put a sign in front of your numbers, the positive numbers will precede the negative numbers because the + character precedes the - character in ascii. The end result is that a list like ['+1', '+4', '-9', '+10', '-13', '+31'] would lexically sort to ['+1', '+10', '+31', '+4', '-13', '-9'].

For a full description of how the problem is solved, read the white paper.

example

The following program would count from -20 to 20 and print their lexnum strings:

package main

import (
    "fmt"
    "github.com/jordanorelli/lexnum"
)

func main() {
    e := lexnum.NewEncoder('=', '-')
    for i := -20; i <= 20; i++ {
        fmt.Printf("%-12s%d\n", e.EncodeInt(i), i)
    }
}

Running it would produce the following output:

--779       -20
--780       -19
--781       -18
--782       -17
--783       -16
--784       -15
--785       -14
--786       -13
--787       -12
--788       -11
--789       -10
-0          -9
-1          -8
-2          -7
-3          -6
-4          -5
-5          -4
-6          -3
-7          -2
-8          -1
0           0
=1          1
=2          2
=3          3
=4          4
=5          5
=6          6
=7          7
=8          8
=9          9
==210       10
==211       11
==212       12
==213       13
==214       14
==215       15
==216       16
==217       17
==218       18
==219       19
==220       20

Documentation

Overview

package lexnum provides an efficient lexicographic encoding of numbers as described here: http://www.zanopha.com/docs/elen.pdf

lexnum allows a developer to encode integers as strings, such that the strings may be lexically sorted, preserving their numerical orderings.

e.g., if one were to sort the numbers 0, 1, 2, 9, 10 lexically, we would wind up with 0, 1, 10, 2, 9. If we knew the maximum value, we could prescribe some zero-padding. Failing that, we would need a lexicographic encoding of the numbers. Lexnum attempts to provide this alternative.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

an Encoder may be used to encode or decode an integer as a string. The produced strings will have the property that any set of numbers will have the same lexical sorting and numeric sorting.

func NewEncoder

func NewEncoder(pos rune, neg rune) *Encoder

NewEncoder creates a new lexnum Encoder. We achieve

func (Encoder) DecodeInt

func (l Encoder) DecodeInt(s string) (int, error)

Decodes a lexnum string, returning its original integer representation.

func (Encoder) EncodeInt

func (l Encoder) EncodeInt(i int) string

Encodes an integer as a string.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL