README ¶
Hasher
Hasher is a tool to automate the creation of methods and tables for a string → uint32 mapper. It uses the fact that all keys are known apriori, allowing it to generate a very efficient hashtable. It has been built to work with go generate
. New keys can be added by appending more constants to the list and rerunning hasher
. The new keys will be assigned new ID's. Running hasher
changes all ID's, so do not store them in a file or database!
It is really a mix of https://github.com/golang/tools/blob/master/cmd/stringer and https://github.com/golang/net/tree/master/html/atom with some customization.
For example, given this snippet,
package painkiller
type Pill uint32
const (
Placebo Pill = iota
Aspirin
Ibuprofen
Paracetamol
Acid // lysergic-acid-diethlamide
)
running this command
hasher -type=Pill -file=pill.go
in the same directory will OVERWRITE the same file with the constant list itself (except with different values) and add tables, hashes and conversion functions between string
and uint32
. The output file is thus also the input file, and the tool can run over it indefinitely! This outputs
package painkiller
// Pill defines perfect hashes for a predefined list of strings
type Pill uint32
// Unique hash definitions to be used instead of strings
const (
Aspirin Pill = 0x7 // aspirin
Ibuprofen Pill = 0x709 // ibuprofen
Acid Pill = 0x1a19 // lysergic-acid-diethlamide
Paracetamol Pill = 0x100b // paracetamol
Placebo Pill = 0x3307 // placebo
)
func (i Pill) String() string { /* ... */ }
func ToPill(s []byte) Pill { /* ... */ }
/* ... */
Amending
To add more keys to the table, just add more constants to the file:
// ...
Aspirin Pill = 0x7
Ibuprofen Pill = 0x709
Paracetamol Pill = 0x100b
Placebo Pill = 0x1b07
NewPillA
NewPillB
// ...
or update the constant name or the comment value and rerun hasher
!
Installation
Run the following command
go get github.com/tdewolff/hasher
Usage
Typically this process would be run using go generate, like this:
//go:generate hasher -type=Pill -file=pill.go
but hasher
adds that row itself too, so you can use the command-line just as well hasher -type=Pill -file=pill.go
. Just make sure the file consists solely of the new type and the constants of that type, because it will be overwritten!
It must have the -type and -file flag, that accepts a single type and filename respectively.
Lower- and uppercase and dashes
By default, the first uppercase of the constants is lowered and any uppercase after an underscore. Any underscore is replace by a dash. So that:
fmt.Print(painkiller.Aspirin) // aspirin
fmt.Print(painkiller.StrongMorphine) // strongMorphine
fmt.Print(painkiller.Amitriptyline_Gabapentin) // amitriptyline-gabapentin
But you can specify the name explicitly in the comment after the constant. See the initial example.
Hash to string
Translate the value of a Pill constant to the string representation of the respective constant name, so that the call fmt.Print(painkiller.Aspirin)
will print the string "aspirin"
.
func (Pill) String() string
String to hash
Translate a string to a value of a Pill constant with the same name. So that ToPill("aspirin") == painkiller.Aspirin
. This function is called "To" + typeName
so it changes name when the type name changes.
func ToPill(s []byte) Pill
Performance
The tool might be slow for large batches of constants, because it can have difficulty finding a hash that minimizes the table. For example, the parse.Html hash.go takes about 10 seconds.
The performance of lookups depends on the use case. Because a call to ToHash
always needs to check the result, it is slower than a direct string comparison and is O(n)
. However, comparison of a hash (int) is very fast and O(1)
.
License
Released under the BSD license.