fastdiv

package module
v0.0.0-...-41d5178 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 27, 2019 License: MIT Imports: 1 Imported by: 5

README

fastdiv

GoDoc

Fast division, modulus and divisibility checks for divisors known only at runtime via the method of:

"Faster Remainder by Direct Computation: Applications to Compilers and Software Libraries" Daniel Lemire, Owen Kaser, Nathan Kurz arXiv:1902.01961

Usage:

var divisor uint32 = 3

// intialize a divisor at runtime
d := fastdiv.NewUint32(divisor)

// use it repeatedly
var total uint32
for i := uint32(1); i < 10; i++ {
	total += d.Div(i)
	if d.Divisible(i) {
		fmt.Printf("%d is divisible by %d\n", i, divisor)
	}
}
fmt.Printf("Sum of quotients = %d", total)

// Output:
// 3 is divisible by 3
// 6 is divisible by 3
// 9 is divisible by 3
// Sum of quotients = 12

The method works by pre-computing an approximate inverse of the divisor such that the quotient is given by the high part of the multiplication and the remainder can be calculated by multiplying the fraction contained in the low part by the original divisor. In general, the required accuracy for the approximate inverse is twice the width of the original divisor. For divisors that are half the width of a register or less, this means that the quotient can be calculated with one high-multiplication (top word of a full-width multiplication), the remainder can be calculated with one low-multiplication followed by a high-multiplication and both can be calculated with one full-width multiplication and one high-multiplication.

On amd64 architecture for divisors that are 32-bits or less, this method can be faster than the traditional Granlund-Montgomery-Warren approach used to optimize constant divisions in the compiler. The requirement that the approximate inverse be twice the divisor width means that extended arithmetic is required for 64-bit divisors. The extended arithmetic makes this method is somewhat slower than the Granlund-Montgomery-Warren approach for these larger divisors, but still faster than 64-bit division instructions.

The per operation speed up over a division instruction is ~2-3x and the overhead of pre-computing the inverse can be amortized after 1-6 repeated divisions with the same divisor.

op size var const fastdiv var / fastdiv # to breakeven
div uint16 15.5 ns 5.46 ns 5.32 ns 2.9x 1
mod uint16 16.2 ns 7.68 ns 7.68 ns 2.1x 1
div int16 16.7 ns 5.91 ns 6.55 ns 2.5x 1
mod int16 17.5 ns 8.86 ns 8.86 ns 2.0x 1
div uint32 15.5 ns 7.28 ns 5.53 ns 2.8x 2
mod uint32 15.7 ns 9.63 ns 7.09 ns 2.2x 2
div int32 16.0 ns 5.91 ns 5.91 ns 2.7x 2
mod int32 16.1 ns 8.86 ns 8.27 ns 1.9x 2
div uint64 21.4 ns 5.91 ns 6.89 ns 3.1x 5
mod uint64 20.3 ns 8.30 ns 8.87 ns 2.3x 6
div int64 26.2 ns 7.26 ns 8.51 ns 3.0x 5
mod int64 25.8 ns 9.57 ns 16.8 ns 1.5x 6

Documentation

Overview

Package fastdiv implements fast division, modulus and divisibility checks for divisors known only at runtime via the method of:

"Faster Remainder by Direct Computation: Applications to Compilers and Software Libraries" Daniel Lemire, Owen Kaser, Nathan Kurz https://arxiv.org/abs/1902.01961

The method works by pre-computing an approximate inverse of the divisor such that the quotient is given by the high part of the multiplication and the remainder can be calculated by multiplying the fraction contained in the low part by the original divisor. In general, the required accuracy for the approximate inverse is twice the width of the original divisor. For divisors that are half the width of a register or less, this means that the quotient can be calculated with one high-multiplication (top word of a full-width multiplication), the remainder can be calculated with one low-multiplication followed by a high-multiplication and both can be calculated with one full-width multiplication and one high-multiplication.

On amd64 architecture for divisors that are 32-bits or less, this method can be faster than the traditional Granlund-Montgomery-Warren approach used to optimize constant divisions in the compiler. The requirement that the approximate inverse be twice the divisor width means that extended arithmetic is required for 64-bit divisors. The extended arithmetic makes this method is somewhat slower than the Granlund-Montgomery-Warren approach for these larger divisors, but still faster than 64-bit division instructions.

The per operation speed up over a division instruction is ~2-3x and the overhead of pre-computing the inverse can be amortized after 1-6 repeated divisions with the same divisor.

Example
package main

import (
	"fmt"

	"github.com/bmkessler/fastdiv"
)

func main() {
	var divisor uint32 = 3

	// intialize a divisor at runtime
	d := fastdiv.NewUint32(divisor)

	// use it repeatedly
	var total uint32
	for i := uint32(1); i < 10; i++ {
		total += d.Div(i)
		if d.Divisible(i) {
			fmt.Printf("%d is divisible by %d\n", i, divisor)
		}
	}
	fmt.Printf("Sum of quotients = %d", total)

}
Output:

3 is divisible by 3
6 is divisible by 3
9 is divisible by 3
Sum of quotients = 12

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Int16

type Int16 struct {
	// contains filtered or unexported fields
}

Int16 calculates division by using a pre-computed inverse.

func NewInt16

func NewInt16(d int16) Int16

NewInt16 initializes a new pre-computed inverse for d != 0. If d == 0, a runtime divide-by-zero panic is raised.

func (Int16) Div

func (d Int16) Div(n int16) int16

Div calculates n / d using the pre-computed inverse. Note, must have d != 1, -1, 0, or math.MinInt16

func (Int16) DivMod

func (d Int16) DivMod(n int16) (q, r int16)

DivMod calculates n / d and n % d using the pre-computed inverse. Note, must have d != 1, -1, 0, or math.MinInt16

func (Int16) Mod

func (d Int16) Mod(n int16) int16

Mod calculates n % d using the pre-computed inverse.

type Int32

type Int32 struct {
	// contains filtered or unexported fields
}

Int32 calculates division by using a pre-computed inverse.

func NewInt32

func NewInt32(d int32) Int32

NewInt32 initializes a new pre-computed inverse for d != 0. If d == 0, a runtime divide-by-zero panic is raised.

func (Int32) Div

func (d Int32) Div(n int32) int32

Div calculates n / d using the pre-computed inverse. Note, must have d != 1, -1, 0, or math.MinInt32

func (Int32) DivMod

func (d Int32) DivMod(n int32) (q, r int32)

DivMod calculates n / d and n % d using the pre-computed inverse. Note, must have d != 1, -1, 0, or math.MinInt32

func (Int32) Mod

func (d Int32) Mod(n int32) int32

Mod calculates n % d using the pre-computed inverse.

type Int64

type Int64 struct {
	// contains filtered or unexported fields
}

Int64 calculates division by using a pre-computed inverse.

func NewInt64

func NewInt64(d int64) Int64

NewInt64 initializes a new pre-computed inverse.

func (Int64) Div

func (d Int64) Div(n int64) int64

Div calculates n / d using the pre-computed inverse. Note, must have d != 1, -1, 0, or math.MinInt32

func (Int64) DivMod

func (d Int64) DivMod(n int64) (q, r int64)

DivMod calculates n / d and n % d using the pre-computed inverse. Note, must have d != 1, -1, 0, or math.MinInt32

func (Int64) Mod

func (d Int64) Mod(n int64) int64

Mod calculates n % d using the pre-computed inverse.

type Uint16

type Uint16 struct {
	// contains filtered or unexported fields
}

Uint16 calculates division by using a pre-computed inverse.

func NewUint16

func NewUint16(d uint16) Uint16

NewUint16 initializes a new pre-computed inverse for d != 0. If d == 0, a runtime divide-by-zero panic is raised.

func (Uint16) Div

func (d Uint16) Div(n uint16) uint16

Div calculates n / d using the pre-computed inverse. Note must have d > 1.

func (Uint16) DivMod

func (d Uint16) DivMod(n uint16) (uint16, uint16)

DivMod calculates n / d and n % d using the pre-computed inverse. Note must have d > 1.

func (Uint16) Divisible

func (d Uint16) Divisible(n uint16) bool

Divisible determines whether n is exactly divisible by d using the pre-computed inverse.

func (Uint16) Mod

func (d Uint16) Mod(n uint16) uint16

Mod calculates n % d using the pre-computed inverse.

type Uint32

type Uint32 struct {
	// contains filtered or unexported fields
}

Uint32 calculates division by using a pre-computed inverse.

func NewUint32

func NewUint32(d uint32) Uint32

NewUint32 initializes a new pre-computed inverse for d != 0. If d == 0, a runtime divide-by-zero panic is raised.

func (Uint32) Div

func (d Uint32) Div(n uint32) uint32

Div calculates n / d using the pre-computed inverse. Note must have d > 1.

func (Uint32) DivMod

func (d Uint32) DivMod(n uint32) (uint32, uint32)

DivMod calculates n / d and n % d using the pre-computed inverse. Note must have d > 1.

func (Uint32) Divisible

func (d Uint32) Divisible(n uint32) bool

Divisible determines whether n is exactly divisible by d using the pre-computed inverse.

func (Uint32) Mod

func (d Uint32) Mod(n uint32) uint32

Mod calculates n % d using the pre-computed inverse.

type Uint64

type Uint64 struct {
	// contains filtered or unexported fields
}

Uint64 calculates division by using a pre-computed inverse.

func NewUint64

func NewUint64(d uint64) Uint64

NewUint64 initializes a new pre-computed inverse.

func (Uint64) Div

func (d Uint64) Div(n uint64) uint64

Div calculates n / d using the pre-computed inverse.

func (Uint64) DivMod

func (d Uint64) DivMod(n uint64) (q, r uint64)

DivMod calculates n / d and n % d using the pre-computed inverse. Note must have d > 1.

func (Uint64) Divisible

func (d Uint64) Divisible(n uint64) bool

Divisible determines whether n is exactly divisible by d using the pre-computed inverse.

func (Uint64) Mod

func (d Uint64) Mod(n uint64) uint64

Mod calculates n % d using the pre-computed inverse.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL