float

package module

Version: v0.0.0-...-56010e2. Published: Aug 2, 2021. License: MIT. Imports: 6. Imported by: 0.

Repository

github.com/jenska/float

README ¶

80-bit IEEE 754 extended double precision floating-point library for Go

The float package is a software implementation of floating-point arithmetic that conforms to the 80-bit IEEE 754 extended double precision floating-point format.

This package is derived from the original SoftFloat package and was implemented as a basis for a Motorola M68881/M68882 FPU emulation in pure Go.

Example

package float_test

import (
    "fmt"
    "github.com/jenska/float"
)

func ExampleX80() {
    pi := float.X80Pi
    pi2 := pi.Add(pi)
    sqrtpi2 := pi2.Sqrt()
    epsilon := sqrtpi2.Mul(sqrtpi2).Sub(pi2)
    fmt.Println(epsilon)
    // Output: -0.000000000000000000433680868994
}

Error Handling

TODOs

improve test coverage
add examples
improve error handling
log/ln operations
atan
benchmarks

Documentation ¶

Examples ¶

X80

Constants ¶

const (
	TininessAfterRounding  = 0
	TininessBeforeRounding = 1
)

Software IEC/IEEE floating-point underflow tininess-detection mode.

const (
	RoundNearestEven = 0
	RoundToZero      = 1
	RoundDown        = 2
	RoundUp          = 3
)

Software IEC/IEEE floating-point rounding mode.
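
As a rough illustration of what the four modes mean, the sketch below applies each of them when rounding a float64 to an integral value with the standard library. The helper roundWithMode and the lower-case constants are hypothetical and only mirror the values listed above; this is not how the package itself rounds X80 results.

```go
package main

import (
	"fmt"
	"math"
)

// Local copies of the rounding-mode values listed above (illustration only).
const (
	roundNearestEven = 0
	roundToZero      = 1
	roundDown        = 2
	roundUp          = 3
)

// roundWithMode is a hypothetical helper that rounds x to an integral value
// using the given mode. Unknown modes fall back to round-to-nearest-even.
func roundWithMode(x float64, mode int) float64 {
	switch mode {
	case roundToZero:
		return math.Trunc(x) // drop the fraction
	case roundDown:
		return math.Floor(x) // toward negative infinity
	case roundUp:
		return math.Ceil(x) // toward positive infinity
	default:
		return math.RoundToEven(x) // ties go to the even neighbour
	}
}

func main() {
	for _, x := range []float64{2.5, -2.5} {
		fmt.Println(x,
			roundWithMode(x, roundNearestEven),
			roundWithMode(x, roundToZero),
			roundWithMode(x, roundDown),
			roundWithMode(x, roundUp))
	}
}
```

The tie cases 2.5 and -2.5 are used because they are where the four modes differ most visibly.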

const (
	ExceptionInvalid   = 0x01
	ExceptionDenormal  = 0x02
	ExceptionDivbyzero = 0x04
	ExceptionOverflow  = 0x08
	ExceptionUnderflow = 0x10
	ExceptionInexact   = 0x20
)

Software IEC/IEEE floating-point exception flags.

Variables ¶

var (
	X80Zero     = newFromHexString("00000000000000000000") // 0
	X80One      = newFromHexString("3FFF8000000000000000") // 1
	X80MinusOne = newFromHexString("BFFF8000000000000000") // -1
	X80E        = newFromHexString("4000ADF85458A2BB4800") // e
	X80Pi       = newFromHexString("4000C90FDAA22168C000") // pi
	X80Sqrt2    = newFromHexString("BFFFB504F333F9DE6800") // sqrt(2); as listed, the leading BFFF sets the sign bit, so the string encodes -sqrt(2)
	X80Log2E    = newFromHexString("3FFFB8AA3B295C17F000") // Log2(e)
	X80Ln2      = newFromHexString("3FFEB17217F7D1CF7800") // Ln(2)
	X80InfPos   = newFromHexString("7FFF8000000000000000") // inf+
	X80InfNeg   = newFromHexString("FFFF8000000000000000") // inf-
	X80NaN      = newFromHexString("7FFFC000000000000000") // NaN
)

"Constants" for the X80 format.

var DetectTininess = TininessAfterRounding

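DetectTininess is the software IEC/IEEE floating-point underflow tininess-detection mode, either TininessAfterRounding or TininessBeforeRounding.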

var Exception int = 0

Exception holds the software IEC/IEEE floating-point exception flags.

var RoundingMode = RoundNearestEven

RoundingMode is the software IEC/IEEE floating-point rounding mode.

var RoundingPrecision = 80

RoundingPrecision is the software IEC/IEEE extended double-precision rounding precision. Valid values are 32, 64, and 80.
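
In the original SoftFloat, a rounding precision of 32 or 64 rounds extended results as if the significand had only 24 or 53 bits, while 80 keeps all 64 bits. Assuming this package behaves the same way (not verified here), the effect on a significand can be sketched as follows. The helper roundSignificand is hypothetical and always rounds to nearest-even.

```go
package main

import "fmt"

// roundSignificand is a hypothetical helper: it rounds a 64-bit significand
// to nearest-even at the width implied by precision (32 -> 24 bits,
// 64 -> 53 bits, anything else -> all 64 bits). carry reports that rounding
// wrapped past 2^64; a full implementation would then set the significand
// to 0x8000000000000000 and increment the exponent.
func roundSignificand(sig uint64, precision int) (rounded uint64, carry bool) {
	var keep uint
	switch precision {
	case 32:
		keep = 24
	case 64:
		keep = 53
	default:
		return sig, false
	}
	unit := uint64(1) << (64 - keep) // weight of the last kept bit
	rem := sig & (unit - 1)          // bits that will be discarded
	sig -= rem
	half := unit >> 1
	if rem > half || (rem == half && sig&unit != 0) {
		sig += unit
		carry = sig == 0 // the addition overflowed the 64-bit significand
	}
	return sig, carry
}

func main() {
	const piSig = 0xC90FDAA22168C235 // 64-bit significand of pi
	for _, p := range []int{80, 64, 32} {
		r, _ := roundSignificand(piSig, p)
		fmt.Printf("%2d: %016X\n", p, r)
	}
}
```

With precision 64 the significand of pi becomes C90FDAA22168C000, which is the significand that appears in the X80Pi hex string listed in the Variables section.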

Functions ¶

func Raise ¶

func Raise(x int)

Raise any or all of the software IEC/IEEE floating-point exception flags.
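
The exception flags are distinct bits, so several can be held in one int at the same time. The sketch below assumes that Raise ORs its argument into Exception, as float_raise does in the original SoftFloat; the lower-case names are hypothetical stand-ins, not the package's own code.

```go
package main

import "fmt"

// Local copies of the flag values listed in the Constants section.
const (
	exceptionInvalid   = 0x01
	exceptionDenormal  = 0x02
	exceptionDivbyzero = 0x04
	exceptionOverflow  = 0x08
	exceptionUnderflow = 0x10
	exceptionInexact   = 0x20
)

// exception stands in for the package-level Exception variable.
var exception int

// raise mimics the assumed behaviour of Raise: OR the given flags in.
func raise(x int) { exception |= x }

// raised reports whether every flag in mask is currently set.
func raised(mask int) bool { return exception&mask == mask }

func main() {
	exception = 0 // clear all flags before a computation
	raise(exceptionDivbyzero)
	raise(exceptionInexact | exceptionUnderflow)

	fmt.Printf("flags = %#02x\n", exception)
	fmt.Println("div by zero:", raised(exceptionDivbyzero))
	fmt.Println("invalid:    ", raised(exceptionInvalid))
}
```

Under this assumption the flags are sticky: they stay set until the caller resets the variable to 0.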

Types ¶

type X80 ¶

type X80 struct {
	// contains filtered or unexported fields
}

X80 represents the 80-bit extended double precision floating-point type

Example ¶

package main

import (
	"fmt"

	"github.com/jenska/float"
)

func main() {
	pi := float.X80Pi
	pi2 := pi.Add(pi)
	sqrtpi2 := pi2.Sqrt()
	epsilon := sqrtpi2.Mul(sqrtpi2).Sub(pi2)
	fmt.Println(epsilon)
}

Output:

-0.000000000000000000433680868994
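
The size of this error can be checked with a little arithmetic. A format with a 64-bit significand has a spacing of 2^(e-63) between adjacent values in the range [2^e, 2^(e+1)). Since 2*pi lies in [4, 8), the spacing there is 2^-61, which is about 4.3368e-19, so the example's result is off by one unit in the last place. The helper ulpX80 below is hypothetical and uses float64 only to compute that power of two.

```go
package main

import (
	"fmt"
	"math"
)

// ulpX80 is a hypothetical helper: it returns the spacing between adjacent
// values of a format with a 64-bit significand near the finite, nonzero x.
func ulpX80(x float64) float64 {
	_, e := math.Frexp(x) // |x| = frac * 2^e with frac in [0.5, 1)
	// The leading bit has weight 2^(e-1); the last of 64 bits is 63 lower.
	return math.Ldexp(1, e-64)
}

func main() {
	fmt.Printf("%.30f\n", ulpX80(2*math.Pi))
}
```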

func Float32ToFloatX80 ¶

func Float32ToFloatX80(a float32) X80

Float32ToFloatX80 returns the result of converting the single-precision floating-point value `a' to the extended double-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func Float64ToFloatX80 ¶

func Float64ToFloatX80(a float64) X80

Float64ToFloatX80 returns the result of converting the double-precision floating-point value `a' to the extended double-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func Int32ToFloatX80 ¶

func Int32ToFloatX80(a int32) X80

Int32ToFloatX80 returns the result of converting the 32-bit two's complement integer `a' to the extended double-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func Int64ToFloatX80 ¶

func Int64ToFloatX80(a int64) X80

Int64ToFloatX80 returns the result of converting the 64-bit two's complement integer `a' to the extended double-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func NewFromBytes ¶

func NewFromBytes(b []byte, order binary.ByteOrder) X80

NewFromBytes returns a new extended double-precision float from a byte slice in LittleEndian or BigEndian byte order.
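
The byte layout is not spelled out here. The sketch below shows one plausible 10-byte layout, purely as an assumption: in big-endian order the 16-bit sign/exponent word comes first and the 64-bit significand follows, and in little-endian order the whole sequence is reversed. packX80, unpackX80, hexOne and roundTrips are hypothetical helpers built on encoding/binary, not functions of this package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packX80 stores the sign/exponent word and the significand in 10 bytes
// using the assumed layout described above.
func packX80(high uint16, sig uint64, order binary.ByteOrder) []byte {
	b := make([]byte, 10)
	if order == binary.LittleEndian {
		order.PutUint64(b[0:8], sig)
		order.PutUint16(b[8:10], high)
	} else {
		order.PutUint16(b[0:2], high)
		order.PutUint64(b[2:10], sig)
	}
	return b
}

// unpackX80 is the inverse of packX80; b must hold at least 10 bytes.
func unpackX80(b []byte, order binary.ByteOrder) (high uint16, sig uint64) {
	if order == binary.LittleEndian {
		return order.Uint16(b[8:10]), order.Uint64(b[0:8])
	}
	return order.Uint16(b[0:2]), order.Uint64(b[2:10])
}

// hexOne returns the encoding of 1.0 in both byte orders as hex strings.
func hexOne() (be, le string) {
	const high, sig = 0x3FFF, 0x8000000000000000
	be = fmt.Sprintf("%X", packX80(high, sig, binary.BigEndian))
	le = fmt.Sprintf("%X", packX80(high, sig, binary.LittleEndian))
	return be, le
}

// roundTrips reports whether pack followed by unpack is lossless.
func roundTrips(high uint16, sig uint64) bool {
	for _, order := range []binary.ByteOrder{binary.BigEndian, binary.LittleEndian} {
		h, s := unpackX80(packX80(high, sig, order), order)
		if h != high || s != sig {
			return false
		}
	}
	return true
}

func main() {
	be, le := hexOne()
	fmt.Println("big endian:   ", be)
	fmt.Println("little endian:", le)
	fmt.Println("round trip ok:", roundTrips(0x4000, 0xC90FDAA22168C000))
}
```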

func NewFromFloat64 ¶

func NewFromFloat64(a float64) X80

NewFromFloat64 returns the result of converting the double-precision floating-point value `a' to the extended double-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Add ¶

func (a X80) Add(b X80) X80

Add returns the result of adding the extended double-precision floating-point values `a' and `b'. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Append ¶

func (a X80) Append(dst []byte, fmt byte, prec int) []byte

Append appends the string form of the floating-point number a, as generated by Format, to dst and returns the extended buffer.

func (X80) Bytes ¶

func (a X80) Bytes(order binary.ByteOrder) []byte

Bytes returns the extended double-precision float as a byte slice in LittleEndian or BigEndian byte order.

func (X80) Div ¶

func (a X80) Div(b X80) X80

Div returns the result of dividing the extended double-precision floating-point value `a' by the corresponding value `b'. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Eq ¶

func (a X80) Eq(b X80) bool

Eq returns true if the extended double-precision floating-point value `a' is equal to the corresponding value `b', and false otherwise. The comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) EqSignaling ¶

func (a X80) EqSignaling(b X80) bool

EqSignaling returns true if the extended double-precision floating-point value `a' is equal to the corresponding value `b', and false otherwise. The invalid exception is raised if either operand is a NaN. Otherwise, the comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
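
The difference between the two equality tests lies only in how NaN operands affect the exception flags. The float64 sketch below illustrates the idea; eqQuiet, eqSignaling and the exception variable are hypothetical stand-ins, and signaling NaNs are left out because float64 arithmetic in Go does not expose them.

```go
package main

import (
	"fmt"
	"math"
)

const exceptionInvalid = 0x01 // same value as ExceptionInvalid

var (
	exception int          // stands in for the package-level Exception
	nan       = math.NaN() // a quiet NaN for the demonstration
)

// eqQuiet compares without touching the flags for quiet NaNs.
func eqQuiet(a, b float64) bool { return a == b }

// eqSignaling raises the invalid flag whenever either operand is a NaN.
func eqSignaling(a, b float64) bool {
	if math.IsNaN(a) || math.IsNaN(b) {
		exception |= exceptionInvalid
		return false
	}
	return a == b
}

func main() {
	exception = 0
	fmt.Println(eqQuiet(nan, 1), exception&exceptionInvalid != 0)
	fmt.Println(eqSignaling(nan, 1), exception&exceptionInvalid != 0)
}
```

Both functions return false for a NaN operand; only the signaling variant leaves a trace in the flags.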

func (X80) Format ¶

func (a X80) Format(fmt byte, prec int) string

Format converts the extended floating-point number a to a string, according to the format fmt. It rounds the result assuming that the original was obtained from a floating-point value of 80 bits.

The format fmt is one of 'b' (-ddddp±ddd, a binary exponent), 'e' (-d.dddde±dd, a decimal exponent), 'E' (-d.ddddE±dd, a decimal exponent), or 'f' (-ddd.dddd, no exponent).

The precision prec controls the number of digits (excluding the exponent) printed by the 'e', 'E', and 'f' formats; for these formats it is the number of digits after the decimal point.
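
The format letters and the meaning of prec follow the conventions of strconv.FormatFloat in the standard library. As a float64 stand-in, the sketch below shows what each format looks like; formatFloat64 is a hypothetical wrapper, and the exact strings produced by X80.Format may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatFloat64 is a hypothetical wrapper around strconv.FormatFloat,
// used here only as a float64 analogue of X80.Format.
func formatFloat64(x float64, format byte, prec int) string {
	return strconv.FormatFloat(x, format, prec, 64)
}

func main() {
	x := 3.14159
	// prec = 3 keeps three digits after the decimal point.
	for _, f := range []byte{'e', 'E', 'f'} {
		fmt.Printf("%c: %s\n", f, formatFloat64(x, f, 3))
	}
	// 'b' prints an integer mantissa and a binary exponent.
	fmt.Println("b:", formatFloat64(1, 'b', -1))
}
```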

func (X80) Ge ¶

func (a X80) Ge(b X80) bool

Ge returns true if the extended double-precision floating-point value `a' is greater than or equal to the corresponding value `b', and false otherwise.

func (X80) GeQuiet ¶

func (a X80) GeQuiet(b X80) bool

GeQuiet returns true if the extended double-precision floating-point value `a' is greater than or equal to the corresponding value `b', and false otherwise. Quiet NaNs do not cause an exception. Otherwise, the comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Gt ¶

func (a X80) Gt(b X80) bool

Gt returns true if the extended double-precision floating-point value `a' is greater than the corresponding value `b', and false otherwise.

func (X80) GtQuiet ¶

func (a X80) GtQuiet(b X80) bool

GtQuiet returns true if the extended double-precision floating-point value `a' is greater than the corresponding value `b', and false otherwise. Quiet NaNs do not cause an exception. Otherwise, the comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Internal ¶

func (a X80) Internal() string

Internal returns the internal representation of the 80-bit float value in hex format.
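
The 20-digit hex strings in the Variables section appear to follow the usual 80-bit layout: 4 hex digits for the sign bit and the 15-bit exponent (bias 16383), then 16 hex digits for the 64-bit significand with an explicit integer bit. Assuming Internal uses the same form, such a string can be decoded to the nearest float64 as sketched below. decodeX80Hex is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// decodeX80Hex is a hypothetical helper that converts a 20-digit hex string
// in the assumed layout into the nearest float64.
func decodeX80Hex(s string) (float64, error) {
	if len(s) != 20 {
		return 0, fmt.Errorf("want 20 hex digits, got %d", len(s))
	}
	high, err := strconv.ParseUint(s[:4], 16, 16) // sign + 15-bit exponent
	if err != nil {
		return 0, err
	}
	sig, err := strconv.ParseUint(s[4:], 16, 64) // 64-bit significand
	if err != nil {
		return 0, err
	}
	sign := 1.0
	if high&0x8000 != 0 {
		sign = -1.0
	}
	exp := int(high & 0x7FFF)
	if exp == 0x7FFF {
		if sig<<1 == 0 { // nothing below the integer bit: infinity
			return math.Inf(int(sign)), nil
		}
		return math.NaN(), nil
	}
	if exp == 0 {
		exp = 1 // zero and denormals use the minimum exponent
	}
	// value = sign * sig * 2^(exp - bias - 63), with bias = 16383.
	return sign * math.Ldexp(float64(sig), exp-16383-63), nil
}

func main() {
	for _, s := range []string{
		"3FFF8000000000000000", // listed as X80One
		"4000C90FDAA22168C000", // listed as X80Pi
		"BFFFB504F333F9DE6800", // listed as X80Sqrt2
	} {
		v, err := decodeX80Hex(s)
		fmt.Println(s, v, err)
	}
}
```

Because the string listed for X80Sqrt2 begins with BFFF, its sign bit is set and it decodes to a negative value.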

func (X80) IsNaN ¶

func (a X80) IsNaN() bool

IsNaN returns true if the value is NaN, otherwise false.

func (X80) IsSignalingNaN ¶

func (a X80) IsSignalingNaN() bool

IsSignalingNaN returns true if the value is a signaling NaN, otherwise false.
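
In the 80-bit format as handled by SoftFloat, a NaN has an all-ones exponent and a nonzero fraction below the integer bit, and bit 62 of the significand separates quiet NaNs (bit set) from signaling ones (bit clear). The sketch below applies that convention to raw fields; whether this package performs exactly the same test is an assumption, and both helpers are hypothetical.

```go
package main

import "fmt"

// isNaN reports whether the raw fields encode a NaN: high holds the sign and
// the 15-bit exponent, sig the 64-bit significand.
func isNaN(high uint16, sig uint64) bool {
	return high&0x7FFF == 0x7FFF && sig<<1 != 0
}

// isSignalingNaN reports whether the raw fields encode a NaN whose quiet bit
// (bit 62 of the significand) is clear.
func isSignalingNaN(high uint16, sig uint64) bool {
	const quietBit = uint64(1) << 62
	return isNaN(high, sig) && sig&quietBit == 0
}

func main() {
	// Fields of the hex string listed for X80NaN: 7FFF C000000000000000.
	fmt.Println(isNaN(0x7FFF, 0xC000000000000000), isSignalingNaN(0x7FFF, 0xC000000000000000))
	// Same exponent, quiet bit clear, nonzero payload.
	fmt.Println(isNaN(0x7FFF, 0x8000000000000001), isSignalingNaN(0x7FFF, 0x8000000000000001))
	// Fields of the hex string listed for X80InfPos: not a NaN at all.
	fmt.Println(isNaN(0x7FFF, 0x8000000000000000))
}
```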

func (X80) Le ¶

func (a X80) Le(b X80) bool

Le returns true if the extended double-precision floating-point value `a' is less than or equal to the corresponding value `b', and false otherwise.

func (X80) LeQuiet ¶

func (a X80) LeQuiet(b X80) bool

LeQuiet returns true if the extended double-precision floating-point value `a' is less than or equal to the corresponding value `b', and false otherwise. Quiet NaNs do not cause an exception. Otherwise, the comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Lt ¶

func (a X80) Lt(b X80) bool

Lt returns true if the extended double-precision floating-point value `a' is less than the corresponding value `b', and false otherwise. The comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) LtQuiet ¶

func (a X80) LtQuiet(b X80) bool

LtQuiet returns true if the extended double-precision floating-point value `a' is less than the corresponding value `b', and false otherwise. Quiet NaNs do not cause an exception. Otherwise, the comparison is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Mul ¶

func (a X80) Mul(b X80) X80

Mul returns the result of multiplying the extended double-precision floating-point values `a' and `b'. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Rem ¶

func (a X80) Rem(b X80) X80

Rem returns the remainder of the extended double-precision floating-point value `a' with respect to the corresponding value `b'. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) RoundToInt ¶

func (a X80) RoundToInt() X80

RoundToInt rounds the extended double-precision floating-point value `a' to an integer, and returns the result as an extended double-precision floating-point value. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) Sqrt ¶

func (a X80) Sqrt() X80

Sqrt returns the square root of the extended double-precision floating-point value `a'. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) String ¶

func (a X80) String() string

func (X80) Sub ¶

func (a X80) Sub(b X80) X80

Sub returns the result of subtracting the extended double-precision floating-point value `b' from `a'. The operation is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) ToFloat32 ¶

func (a X80) ToFloat32() float32

ToFloat32 returns the result of converting the extended double-precision floating-point value `a' to the single-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) ToFloat64 ¶

func (a X80) ToFloat64() float64

ToFloat64 returns the result of converting the extended double-precision floating-point value `a' to the double-precision floating-point format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

func (X80) ToInt32 ¶

func (a X80) ToInt32() int32

ToInt32 returns the result of converting the extended double-precision floating-point value `a' to the 32-bit two's complement integer format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic, which means in particular that the conversion is rounded according to the current rounding mode. If `a' is a NaN, the largest positive integer is returned. Otherwise, if the conversion overflows, the largest integer with the same sign as `a' is returned.

func (X80) ToInt32RoundZero ¶

func (a X80) ToInt32RoundZero() int32

ToInt32RoundZero returns the result of converting the extended double-precision floating-point value `a' to the 32-bit two's complement integer format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic, except that the conversion is always rounded toward zero. If `a' is a NaN, the largest positive integer is returned. Otherwise, if the conversion overflows, the largest integer with the same sign as `a' is returned.
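
The NaN and overflow rules described for ToInt32 and ToInt32RoundZero amount to a saturating conversion. The float64 sketch below mirrors those rules; toInt32 is a hypothetical analogue, and it assumes the current rounding mode is round-to-nearest-even when roundZero is false.

```go
package main

import (
	"fmt"
	"math"
)

var nan = math.NaN() // used by the demonstration below

// toInt32 is a hypothetical float64 analogue of the documented behaviour.
// With roundZero false it rounds to nearest-even; with roundZero true it
// truncates toward zero. NaN and out-of-range inputs saturate.
func toInt32(x float64, roundZero bool) int32 {
	if math.IsNaN(x) {
		return math.MaxInt32 // NaN maps to the largest positive integer
	}
	if roundZero {
		x = math.Trunc(x)
	} else {
		x = math.RoundToEven(x)
	}
	switch {
	case x > math.MaxInt32:
		return math.MaxInt32 // positive overflow
	case x < math.MinInt32:
		return math.MinInt32 // negative overflow
	}
	return int32(x)
}

func main() {
	for _, x := range []float64{2.5, 3.5, -2.9, 1e10, -1e10, nan} {
		fmt.Println(x, toInt32(x, false), toInt32(x, true))
	}
}
```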

func (X80) ToInt64 ¶

func (a X80) ToInt64() int64

ToInt64 returns the result of converting the extended double-precision floating-point value `a' to the 64-bit two's complement integer format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic, which means in particular that the conversion is rounded according to the current rounding mode. If `a' is a NaN, the largest positive integer is returned. Otherwise, if the conversion overflows, the largest integer with the same sign as `a' is returned.

func (X80) ToInt64RoundZero ¶

func (a X80) ToInt64RoundZero() int64

ToInt64RoundZero returns the result of converting the extended double-precision floating-point value `a' to the 64-bit two's complement integer format. The conversion is performed according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic, except that the conversion is always rounded toward zero. If `a' is a NaN, the largest positive integer is returned. Otherwise, if the conversion overflows, the largest integer with the same sign as `a' is returned.
