element

package
v0.0.0-...-4a97a7c Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 3, 2024 License: Apache-2.0 Imports: 0 Imported by: 0

Documentation

Index

Constants

View Source
const Arith = `` /* 1147-byte string literal not displayed */
View Source
const Asm = `
import 	"golang.org/x/sys/cpu"

var (
	supportAdx = cpu.X86.HasADX && cpu.X86.HasBMI2
	_ = supportAdx
)
`

Asm ...

View Source
const AsmNoAdx = `` /* 230-byte string literal not displayed */

AsmNoAdx ...

View Source
const Avx = `` /* 132-byte string literal not displayed */
View Source
const Base = `` /* 17157-byte string literal not displayed */
View Source
const BigNum = `` /* 1349-byte string literal not displayed */
View Source
const Conv = `` /* 10051-byte string literal not displayed */
View Source
const Doc = `` /* 1140-byte string literal not displayed */
View Source
const Exp = `` /* 566-byte string literal not displayed */
View Source
const FixedExp = `` /* 1756-byte string literal not displayed */
View Source
const Inverse = `` /* 11275-byte string literal not displayed */
View Source
const InverseTests = `` /* 11516-byte string literal not displayed */
View Source
const MulCIOS = `` /* 2803-byte string literal not displayed */

MulCIOS text book CIOS works for all modulus.

Algorithm 2 of "Faster Montgomery Multiplication and Multi-Scalar-Multiplication for SNARKS" by Y. El Housni and G. Botrel https://doi.org/10.46586/tches.v2023.i3.504-521

There are couple of variations to the multiplication (and squaring) algorithms.

All versions are derived from the Montgomery CIOS algorithm: see section 2.3.2 of Tolga Acar's thesis https://www.microsoft.com/en-us/research/wp-content/uploads/1998/06/97Acar.pdf

For 1-word modulus, the generator will call mul_cios_one_limb (standard REDC)

For 13-word+ modulus, the generator will output a unoptimized textbook CIOS code, in plain Go.

For all other moduli, we look at the available bits in the last limb. If they are none (like secp256k1) we generate a unoptimized textbook CIOS code, in plain Go, for all architectures. If there is at least one we can omit a carry propagation in the CIOS algorithm. If there is at least two we can use the same technique for the CIOS Squaring. See appendix in https://eprint.iacr.org/2022/1400.pdf for the exact condition.

In practice, we have 3 different targets in mind: x86(amd64), arm64 and wasm.

For amd64, we can leverage (when available) the BMI2 and ADX instructions to have 2-carry-chains in parallel. This make the use of assembly worth it as it results in a significant perf improvement; most CPUs since 2016 support these instructions, and we assume it to be the "default path"; in case the CPU has no support, we fall back to a slow, unoptimized version.

On amd64, the Squaring algorithm always call the Multiplication (assembly) implementation.

For arm64, we unroll the loops in the CIOS (+nocarry optimization) algorithm, such that the instructions generated by the Go compiler closely match what we would hand-write. Hence, there is no assembly needed for arm64 target.

Additionally, if 2-bits+ are available on the last limb, we have a template to generate a dedicated Squaring algorithm This is not activated by default, to minimize the codebase size. On M1, AWS Graviton3 it results in a 5-10% speedup. On some mobile devices, speed up observed was more important (~20%).

The same (arm64) unrolled Go code produce satisfying performance for WASM (compiled using TinyGo).

View Source
const MulDoc = `` /* 218-byte string literal not displayed */
View Source
const MulNoCarry = `` /* 6220-byte string literal not displayed */

MulNoCarry Algorithm 2 of "Faster Montgomery Multiplication and Multi-Scalar-Multiplication for SNARKS" by Y. El Housni and G. Botrel https://doi.org/10.46586/tches.v2023.i3.504-521 Note that these templates are optimized for arm64 target, since x86 benefits from assembly impl.

View Source
const NoAvx = `
const supportAvx512 = false
`
View Source
const OpsAMD64 = `` /* 4944-byte string literal not displayed */

OpsAMD64 is included with AMD64 builds (regardless of architecture or if F.ASM is set)

View Source
const OpsNoAsm = `` /* 3194-byte string literal not displayed */
View Source
const Reduce = `` /* 422-byte string literal not displayed */
View Source
const Sqrt = `` /* 3421-byte string literal not displayed */
View Source
const Test = `` /* 38041-byte string literal not displayed */
View Source
const TestVector = `` /* 7353-byte string literal not displayed */
View Source
const Vector = `` /* 8457-byte string literal not displayed */

Variables

This section is empty.

Functions

This section is empty.

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL