Documentation
¶
Overview ¶
Package suffixarray implements substring search in logarithmic time using an in-memory suffix array.
Example use:
// create index for some data index := suffixarray.New(data) // lookup byte slice s offsets1 := index.Lookup(s, -1) // the list of all indices where s occurs in data offsets2 := index.Lookup(s, 3) // the list of at most 3 indices where s occurs in data
Index ¶
- Constants
- Variables
- func BWT(input []byte) ([]byte, int)
- func BWT_0alloc(input []byte, sa []int32, bwt []byte) int
- func BigEndian_Uint64(b []byte) uint64
- func FromSuffixArray(s []byte, sa []int, es byte) ([]byte, error)
- func InverseTransform(t []byte, es byte) []byte
- func POW(inputdata []byte) (outputhash [32]byte)
- func POW_0alloc(inputdata []byte) (outputhash [32]byte)
- func POW_optimized_v1(inputdata []byte, max_limit int) (outputhash [32]byte, success bool)
- func POW_optimized_v2(inputdata []byte, max_limit int, data *Data) (outputhash [32]byte, success bool)
- func SuffixArray(s []byte) []int
- func Transform(s []byte, es byte) ([]byte, error)
- type Data
- type Index
Constants ¶
const ALLOCATION_SIZE = MAX_LENGTH
const COUNTING_SORT_BITS uint64 = 10
const COUNTING_SORT_SIZE uint64 = 1 << COUNTING_SORT_BITS
const MAX_LENGTH int = 1024*1024 + stage1_length + 1024
Variables ¶
var ErrInvalidSuffixArray = errors.New("bwt: invalid suffix array")
ErrInvalidSuffixArray means length of sa is not equal to 1+len(s)
Functions ¶
func BWT_0alloc ¶
input byte sa should be len(input) + 1 result len len(input) + 1
func BigEndian_Uint64 ¶
func FromSuffixArray ¶
FromSuffixArray compute BWT from sa
func InverseTransform ¶
InverseTransform reverses the bwt to original byte slice. Not optimized yet.
func POW_0alloc ¶
func POW_optimized_v1 ¶
func POW_optimized_v2 ¶
Types ¶
type Index ¶
type Index struct {
// contains filtered or unexported fields
}
Index implements a suffix array for fast substring search.
func (*Index) Bytes ¶
Bytes returns the data over which the index was created. It must not be modified.
func (*Index) FindAllIndex ¶
FindAllIndex returns a sorted list of non-overlapping matches of the regular expression r, where a match is a pair of indices specifying the matched slice of x.Bytes(). If n < 0, all matches are returned in successive order. Otherwise, at most n matches are returned and they may not be successive. The result is nil if there are no matches, or if n == 0.
func (*Index) Lookup ¶
Lookup returns an unsorted list of at most n indices where the byte string s occurs in the indexed data. If n < 0, all occurrences are returned. The result is nil if s is empty, s is not found, or n == 0. Lookup time is O(log(N)*len(s) + len(result)) where N is the size of the indexed data.