xxhash3

package
v0.0.0-...-d3f65ed Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 4, 2024 License: Apache-2.0 Imports: 5 Imported by: 0

README

XXH3 hash algorithm

A Go implementation of the 64/128 bit xxh3 algorithm, added the SIMD vector instruction set: AVX2 and SSE2 support to accelerate the hash processing.
The original repository can be found here: https://github.com/Cyan4973/xxHash.

Overview

For the input length larger than 240, the 64-bit version of xxh3 algorithm goes along with following steps to get the hash result.

step1. Initialize 8 accumulators used to store the middle result of each iterator.
xacc[0] = prime32_3
xacc[1] = prime64_1
xacc[2] = prime64_2
xacc[3] = prime64_3
xacc[4] = prime64_4
xacc[5] = prime32_2
xacc[6] = prime64_5
xacc[7] = prime32_1
step2. Process 1024 bytes of input data as one block each time
while remaining_length > 1024{
    for i:=0, j:=0; i < 1024; i += 64, j+=8 {
        for n:=0; n<8; n++{
            inputN := input[i+8*n:i+8*n+8]
            secretN := inputN ^ secret[j+8*n:j+8*n+8]
            
            xacc[n^1] += inputN
            xacc[n]   +=  (secretN & 0xFFFFFFFF) * (secretN >> 32)
        }
    }
    
    xacc[n]   ^= xacc[n] >> 47
    xacc[n]   ^= secret[128+8*n:128+8*n:+8]
    xacc[n]   *= prime32_1
    
    remaining_length -= 1024
}
step3. Process remaining stripes (totally 1024 bytes at most)

for i:=0, j:=0; i < remaining_length; i += 64, j+=8 {
    for n:=0; n<8; n++{
        inputN := input[i+8*n:i+8*n+8]
        secretN := inputN ^ secret[j+8*n:j+8*n+8]
    
        xacc[n^1] += inputN
        xacc[n]   += (secretN & 0xFFFFFFFF) * (secretN >> 32)
    }

    remaining_length -= 64
}
step4. Process last stripe (align to last 64 bytes)
for n:=0; n<8; n++{
    inputN := input[(length-64): (length-64)+8]
    secretN := inputN ^ secret[121+8*n, 121+8*n+8]

    xacc[n^1] += inputN
    xacc[n]   += (secretN & 0xFFFFFFFF) * (secretN >> 32)
}
step5. Merge & Avalanche accumulators
acc = length * prime64_1
acc += mix(xacc[0]^secret11, xacc[1]^secret19) + mix(xacc[2]^secret27, xacc[3]^secret35) +
    mix(xacc[4]^secret43, xacc[5]^secret51) + mix(xacc[6]^secret59, xacc[7]^secret67)

acc ^= acc >> 37
acc *= 0x165667919e3779f9
acc ^= acc >> 32

If the input data size is not larger than 240 bytes, the calculating steps are similar to the above description. The major difference lies in the data alignment. In the case of smaller input, the alignment size is 16 bytes.

Quickstart

The SIMD assembly file can be generated by the following command:

cd internal/avo && ./build.sh

Use Hash functions in your code:

package main

import "gitee.com/menciis/gkit/sys/xxhash3"

func main() {
	println(xxhash3.HashString("hello world!"))
	println(xxhash3.Hash128String("hello world!"))
}

Benchmark

go version: go1.15.10 linux/amd64
CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
OS: Linux bluehorse 5.8.0-48-generic #54~20.04.1-Ubuntu SMP
MEMORY: 32G

go test -run=None -bench=. -benchtime=1000x -count=10 > 1000_10.txt && benchstat 1000_10.txt
name                               time/op
Default/Len0_16/Target64-4         88.6ns ± 0%
Default/Len0_16/Target128-4        176ns ± 0%
Default/Len17_128/Target64-4       1.07µs ± 2%
Default/Len17_128/Target128-4      1.76µs ± 1%
Default/Len129_240/Target64-4      1.89µs ± 2%
Default/Len129_240/Target128-4     2.82µs ± 3%
Default/Len241_1024/Target64-4     47.9µs ± 0%
Default/Len241_1024/Target128-4    52.8µs ± 1%
Default/Scalar/Target64-4          3.52ms ± 2%
Default/Scalar/Target128-4         3.52ms ± 1%
Default/AVX2/Target64-4            1.93ms ± 2%
Default/AVX2/Target128-4           1.91ms ± 1%
Default/SSE2/Target64-4            2.61ms ± 2%
Default/SSE2/Target128-4           2.63ms ± 4%

Documentation

Overview

Package xxhash3 implements https://github.com/Cyan4973/xxHash/blob/dev/xxhash.h

Package xxhash3 implements https://github.com/Cyan4973/xxHash/blob/dev/xxhash.h

Package xxhash3 implements https://github.com/Cyan4973/xxHash/blob/dev/xxhash.h

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Hash

func Hash(data []byte) uint64

Hash returns the hash value of the byte slice in 64bits.

func Hash128

func Hash128(data []byte) [2]uint64

Hash128 returns the hash value of the byte slice in 128bits.

func Hash128String

func Hash128String(s string) [2]uint64

Hash128String returns the hash value of the string in 128bits.

func HashString

func HashString(s string) uint64

HashString returns the hash value of the string in 64bits.

Types

This section is empty.

Directories

Path Synopsis
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL