cuda

package

Version: v3.0.12+incompatible | Published: Sep 20, 2013 | License: GPL-3.0 | Imports: 13 | Imported by: 0

Documentation ¶

Overview ¶

Low-level CUDA functionality. Inside this package, all data is stored in ZYX coordinates. That is, our X axis (component 0) is perpendicular to the thin film and usually has a small number of cells (or even just one). Z is usually the longest axis. This annoying convention is imposed by the CUDA FFT data layout.
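
To make the coordinate convention concrete, the sketch below maps a cell position to a flat array offset. It assumes a row-major layout in which the last index (the long axis) varies fastest; the package text does not spell out the exact layout, and `linearIndex` and `size` are hypothetical names that are not part of this package.

```go
package main

import "fmt"

// linearIndex returns the flat offset of cell (i, j, k) in a block of
// size[0] x size[1] x size[2] cells.
// Assumption: row-major storage, so the last index is contiguous in memory.
func linearIndex(size [3]int, i, j, k int) int {
	return (i*size[1]+j)*size[2] + k
}

func main() {
	// A thin film: a single cell along axis 0, more cells along axis 2.
	size := [3]int{1, 2, 4}
	fmt.Println(linearIndex(size, 0, 1, 2)) // prints 6
}
```

With a layout like this, neighbours along the long axis sit next to each other in memory.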

Index ¶

Constants
Variables
func AddCubicAnisotropy(Beff, m *data.Slice, k1_red LUTPtr, c1, c2 LUTPtrs, regions *Bytes)
func AddDMI(Beff *data.Slice, m *data.Slice, D_red, A_red float32)
func AddExchange(B, m *data.Slice, Aex_red SymmLUT, regions *Bytes)
func AddUniaxialAnisotropy(Beff, m *data.Slice, k1_red LUTPtr, u LUTPtrs, regions *Bytes)
func AddZhangLiTorque(torque, m, jpol *data.Slice, bsat, alpha, xi LUTPtr, regions *Bytes)
func Buffer(nComp int, m *data.Mesh) *data.Slice
func Dot(a, b *data.Slice) float32
func GPUCopy(in *data.Slice) *data.Slice
func GetCell(s *data.Slice, comp, i, j, k int) float32
func GetElem(s *data.Slice, comp int, index int) float32
func Init(gpu int, sched string)
func LLTorque(torque, m, B *data.Slice, alpha LUTPtr, regions *Bytes)
func LockThread()
func Madd2(dst, src1, src2 *data.Slice, factor1, factor2 float32)
func Madd3(dst, src1, src2, src3 *data.Slice, factor1, factor2, factor3 float32)
func MaxAbs(in *data.Slice) float32
func MaxVecDiff(x, y *data.Slice) float64
func MaxVecNorm(v *data.Slice) float64
func MemAlloc(bytes int64) unsafe.Pointer
func Memset(s *data.Slice, val ...float32)
func Mul(dst, a, b *data.Slice)
func NewSlice(nComp int, m *data.Mesh) *data.Slice
func NewUnifiedSlice(nComp int, m *data.Mesh) *data.Slice
func Normalize(vec, vol *data.Slice)
func Recycle(s *data.Slice)
func RegionAddV(dst *data.Slice, lut LUTPtrs, regions *Bytes)
func RegionDecode(dst *data.Slice, lut LUTPtr, regions *Bytes)
func RegionSelect(dst, src *data.Slice, regions *Bytes, region byte)
func SetCell(s *data.Slice, comp int, i, j, k int, value float32)
func SetElem(s *data.Slice, comp int, index int, value float32)
func Shift(dst, src *data.Slice, shift [3]int)
func Sum(in *data.Slice) float32
func Zero(s *data.Slice)
type Bytes
- func NewBytes(m *data.Mesh) *Bytes
- func (b *Bytes) Free()
- func (dst *Bytes) Upload(src []byte)
type DemagConvolution
- func NewDemag(mesh *data.Mesh) *DemagConvolution
- func (c *DemagConvolution) Exec(B, m, vol *data.Slice, Bsat LUTPtr, regions *Bytes)
- func (c *DemagConvolution) FFT(m, vol *data.Slice, comp int, Bsat LUTPtr, regions *Bytes) *data.Slice
type Heun
- func NewHeun(y *data.Slice, torqueFn, postStep func(*data.Slice), dt, multiplier float64, ...) *Heun
- func (e *Heun) Step()
type LUTPtr
type LUTPtrs
type SymmLUT

Constants ¶

const CONV_TOLERANCE = 1e-6

Maximum tolerable error on demag convolution self-test.

const DEFAULT_KERNEL_ACC = 6

Default accuracy setting for demag kernel.

const FFT_IMAG_TOLERANCE = 1e-5

Maximum tolerable ratio of imaginary to real part for the demag kernel in Fourier space. Ensures the kernel has the correct symmetry.

const REDUCE_BLOCKSIZE = C.REDUCE_BLOCKSIZE

Block size for reduce kernels.

Variables ¶

var (
	Version  float32
	DevName  string
	TotalMem int64
)

var (
	BlockSize    = 512
	TileX, TileY = 32, 32
	MaxGridSize  = 65535
)

CUDA Launch parameters. TODO: optimize?

Functions ¶

func AddCubicAnisotropy ¶

func AddCubicAnisotropy(Beff, m *data.Slice, k1_red LUTPtr, c1, c2 LUTPtrs, regions *Bytes)

func AddDMI ¶

func AddDMI(Beff *data.Slice, m *data.Slice, D_red, A_red float32)

Add effective field of Dzyaloshinskii-Moriya interaction to Beff (Tesla). According to Bogdanov and Rößler, PRL 87, 3, 2001, eq. 8 (out-of-plane symmetry breaking). See dmi.cu.

func AddExchange ¶

func AddExchange(B, m *data.Slice, Aex_red SymmLUT, regions *Bytes)

Add exchange field to B.

m: normalized magnetization
B: effective field in Tesla
Aex_red: 2*Aex / (Msat * 1e18 m2)

func AddUniaxialAnisotropy ¶

func AddUniaxialAnisotropy(Beff, m *data.Slice, k1_red LUTPtr, u LUTPtrs, regions *Bytes)

Add uniaxial magnetocrystalline anisotropy field to Beff.

func AddZhangLiTorque ¶

func AddZhangLiTorque(torque, m, jpol *data.Slice, bsat, alpha, xi LUTPtr, regions *Bytes)

func Buffer ¶

func Buffer(nComp int, m *data.Mesh) *data.Slice

Returns a GPU slice for temporary use. Return it to the pool with Recycle.

func Dot ¶

func Dot(a, b *data.Slice) float32

Dot product.

func GPUCopy ¶

func GPUCopy(in *data.Slice) *data.Slice

Returns a copy of in, allocated on GPU.

func GetCell ¶

func GetCell(s *data.Slice, comp, i, j, k int) float32

func GetElem ¶

func GetElem(s *data.Slice, comp int, index int) float32

func Init ¶

func Init(gpu int, sched string)

func LLTorque ¶

func LLTorque(torque, m, B *data.Slice, alpha LUTPtr, regions *Bytes)

Landau-Lifshitz torque divided by gamma0:

1/(1+α²) [ m x B + α m x (m x B) ]
torque in Tesla
m normalized
B in Tesla
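
As a CPU reference for the formula above, the following sketch evaluates the torque for a single cell. It is not the GPU kernel: `vec`, `cross` and `llTorque` are hypothetical helpers, and a scalar `alpha` stands in for the per-region lookup table.

```go
package main

import "fmt"

// vec is a 3-component vector (hypothetical helper type).
type vec [3]float64

// cross returns the vector product a x b.
func cross(a, b vec) vec {
	return vec{
		a[1]*b[2] - a[2]*b[1],
		a[2]*b[0] - a[0]*b[2],
		a[0]*b[1] - a[1]*b[0],
	}
}

// llTorque evaluates 1/(1+α²) [ m x B + α m x (m x B) ] for one cell.
func llTorque(m, B vec, alpha float64) vec {
	mxB := cross(m, B)
	mxmxB := cross(m, mxB)
	pre := 1 / (1 + alpha*alpha)
	var t vec
	for c := range t {
		t[c] = pre * (mxB[c] + alpha*mxmxB[c])
	}
	return t
}

func main() {
	m := vec{0, 0, 1} // unit magnetization
	B := vec{1, 0, 0} // field in Tesla
	fmt.Println(llTorque(m, B, 1)) // prints [-0.5 0.5 0]
}
```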

func LockThread ¶

func LockThread()

LockThread locks the current goroutine to an OS thread and sets the CUDA context for that thread. To be called by every fresh goroutine that will use CUDA.
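
The thread-pinning half of this can be sketched with the standard library alone. The example below uses `runtime.LockOSThread`, which is a real standard-library call; `runLocked` is a hypothetical helper, and the CUDA context handling is left out because it depends on this package's internals.

```go
package main

import (
	"fmt"
	"runtime"
)

// runLocked runs f on a fresh goroutine that stays pinned to one OS thread
// while f executes, and waits for f to finish.
func runLocked(f func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		// A goroutine that uses CUDA would set its context here (omitted).
		f()
	}()
	<-done
}

func main() {
	result := 0
	runLocked(func() { result = 42 })
	fmt.Println(result) // prints 42
}
```

Pinning matters because the Go scheduler may otherwise move a goroutine between OS threads, while the text above describes the CUDA context as set per thread.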

func Madd2 ¶

func Madd2(dst, src1, src2 *data.Slice, factor1, factor2 float32)

multiply-add: dst[i] = src1[i] * factor1 + src2[i] * factor2
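
A CPU reference for the same element-wise operation is sketched below, using plain `[]float32` slices in place of `*data.Slice`. `madd2` is a hypothetical name, and the sketch assumes all three slices have equal length.

```go
package main

import "fmt"

// madd2 computes dst[i] = src1[i]*factor1 + src2[i]*factor2 on the CPU.
// Assumption: dst, src1 and src2 all have the same length.
func madd2(dst, src1, src2 []float32, factor1, factor2 float32) {
	for i := range dst {
		dst[i] = src1[i]*factor1 + src2[i]*factor2
	}
}

func main() {
	dst := make([]float32, 2)
	madd2(dst, []float32{1, 2}, []float32{3, 4}, 2, 0.5)
	fmt.Println(dst) // prints [3.5 6]
}
```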

func Madd3 ¶

func Madd3(dst, src1, src2, src3 *data.Slice, factor1, factor2, factor3 float32)

multiply-add: dst[i] = src1[i] * factor1 + src2[i] * factor2 + src3[i] * factor3

func MaxAbs ¶

func MaxAbs(in *data.Slice) float32

Maximum of absolute values of all elements.

func MaxVecDiff ¶

func MaxVecDiff(x, y *data.Slice) float64

Maximum of the norms of the difference between all vectors (x1,y1,z1) and (x2,y2,z2)

(dx, dy, dz) = (x1, y1, z1) - (x2, y2, z2)
max_i sqrt( dx[i]*dx[i] + dy[i]*dy[i] + dz[i]*dz[i] )

func MaxVecNorm ¶

func MaxVecNorm(v *data.Slice) float64

Maximum of the norms of all vectors (x[i], y[i], z[i]).

max_i sqrt( x[i]*x[i] + y[i]*y[i] + z[i]*z[i] )
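
The reduction above can be written as a plain CPU loop. The sketch takes the three components as separate `[]float32` slices (an assumption made for illustration; the package function takes a `*data.Slice`), and `maxVecNorm` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// maxVecNorm returns max_i sqrt(x[i]² + y[i]² + z[i]²).
// The square root is taken once, after the maximum squared norm is found.
func maxVecNorm(x, y, z []float32) float64 {
	maxSq := 0.0
	for i := range x {
		xi, yi, zi := float64(x[i]), float64(y[i]), float64(z[i])
		if n2 := xi*xi + yi*yi + zi*zi; n2 > maxSq {
			maxSq = n2
		}
	}
	return math.Sqrt(maxSq)
}

func main() {
	x := []float32{3, 0}
	y := []float32{4, 0}
	z := []float32{0, 1}
	fmt.Println(maxVecNorm(x, y, z)) // prints 5
}
```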

func MemAlloc ¶

func MemAlloc(bytes int64) unsafe.Pointer

Wrapper for cu.MemAlloc, fatal exit on out of memory.

func Memset ¶

func Memset(s *data.Slice, val ...float32)

Memset sets the Slice's components to the specified values.
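
The variadic signature suggests one value per component. A CPU sketch under that assumption follows; `memset` here is a hypothetical stand-in operating on `[][]float32`, not the package function.

```go
package main

import "fmt"

// memset fills component c of comps with val[c].
// Assumption: exactly one value is passed per component.
func memset(comps [][]float32, val ...float32) {
	if len(val) != len(comps) {
		panic("memset: need one value per component")
	}
	for c := range comps {
		for i := range comps[c] {
			comps[c][i] = val[c]
		}
	}
}

func main() {
	comps := [][]float32{make([]float32, 2), make([]float32, 2)}
	memset(comps, 1, -1)
	fmt.Println(comps) // prints [[1 1] [-1 -1]]
}
```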

func Mul ¶

func Mul(dst, a, b *data.Slice)

multiply: dst[i] = a[i] * b[i]

func NewSlice ¶

func NewSlice(nComp int, m *data.Mesh) *data.Slice

Make a GPU Slice with nComp components, each sized for mesh m.

func NewUnifiedSlice ¶

func NewUnifiedSlice(nComp int, m *data.Mesh) *data.Slice

Make a GPU Slice with nComp components, each sized for mesh m.

func Normalize ¶

func Normalize(vec, vol *data.Slice)

Normalize vec to unit length, unless length or vol are zero.

func Recycle ¶

func Recycle(s *data.Slice)

Returns a buffer obtained from Buffer to the pool.
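
The get-and-return pattern behind a buffer pool can be illustrated on the CPU. The sketch below is a generic pool of `[]float32` buffers keyed by length; it is not this package's implementation, and `pool`, `newPool`, `get` and `recycle` are hypothetical names.

```go
package main

import "fmt"

// pool keeps released buffers, grouped by length, for later reuse.
type pool struct {
	free map[int][][]float32
}

func newPool() *pool {
	return &pool{free: make(map[int][][]float32)}
}

// get returns a recycled buffer of length n if one is available and
// allocates a new one otherwise. Reused buffers are not cleared.
func (p *pool) get(n int) []float32 {
	if stack := p.free[n]; len(stack) > 0 {
		buf := stack[len(stack)-1]
		p.free[n] = stack[:len(stack)-1]
		return buf
	}
	return make([]float32, n)
}

// recycle hands buf back to the pool; the caller must not use it afterwards.
func (p *pool) recycle(buf []float32) {
	p.free[len(buf)] = append(p.free[len(buf)], buf)
}

func main() {
	p := newPool()
	a := p.get(4)
	p.recycle(a)
	b := p.get(4)
	fmt.Println(&a[0] == &b[0]) // prints true: the same memory was reused
}
```

Reusing buffers this way avoids repeated allocation, which is the usual motivation for pooling temporary storage.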

func RegionAddV ¶

func RegionAddV(dst *data.Slice, lut LUTPtrs, regions *Bytes)

dst += LUT[region], for vectors. Used for complex excitation.

func RegionDecode ¶

func RegionDecode(dst *data.Slice, lut LUTPtr, regions *Bytes)

decode the regions+LUT pair into an uncompressed array
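
In CPU terms, decoding amounts to one table lookup per cell. The sketch below assumes a 256-entry table and one region byte per cell, as the LUTPtr and Bytes types suggest; `regionDecode` is a hypothetical stand-in on plain Go slices.

```go
package main

import "fmt"

// regionDecode expands a per-region lookup table into a per-cell array:
// dst[i] = lut[regions[i]].
func regionDecode(dst []float32, lut *[256]float32, regions []byte) {
	for i, r := range regions {
		dst[i] = lut[r]
	}
}

func main() {
	var lut [256]float32
	lut[1] = 2.5 // value assigned to region 1; all other regions stay 0
	regions := []byte{0, 1, 1}
	dst := make([]float32, len(regions))
	regionDecode(dst, &lut, regions)
	fmt.Println(dst) // prints [0 2.5 2.5]
}
```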

func RegionSelect ¶

func RegionSelect(dst, src *data.Slice, regions *Bytes, region byte)

select the part of src within the specified region, set 0's everywhere else.
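
A CPU sketch of the same masking step, on plain slices, could look like this. `regionSelect` is a hypothetical stand-in, and all three slices are assumed to have one entry per cell.

```go
package main

import "fmt"

// regionSelect copies src[i] to dst[i] where regions[i] == region
// and writes 0 everywhere else.
func regionSelect(dst, src []float32, regions []byte, region byte) {
	for i := range dst {
		if regions[i] == region {
			dst[i] = src[i]
		} else {
			dst[i] = 0
		}
	}
}

func main() {
	src := []float32{1, 2, 3}
	regions := []byte{0, 1, 0}
	dst := make([]float32, len(src))
	regionSelect(dst, src, regions, 0)
	fmt.Println(dst) // prints [1 0 3]
}
```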

func SetCell ¶

func SetCell(s *data.Slice, comp int, i, j, k int, value float32)

func SetElem ¶

func SetElem(s *data.Slice, comp int, index int, value float32)

func Shift ¶

func Shift(dst, src *data.Slice, shift [3]int)

Copy src to dst, shifting data by given number of cells. Off-boundary values are clamped. Used, e.g., to make the simulation window follow interesting features.
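
Clamping can be shown in one dimension. The sketch below assumes that a positive shift moves data toward higher indices and that reads falling outside the source use the nearest edge value; the sign convention is an assumption, and `shift1D` is a hypothetical name.

```go
package main

import "fmt"

// shift1D writes src, displaced by shift cells, into dst.
// Assumptions: positive shift moves data toward higher indices, src is
// non-empty, and out-of-range reads are clamped to the nearest edge cell.
func shift1D(dst, src []float32, shift int) {
	n := len(src)
	for i := range dst {
		j := i - shift
		if j < 0 {
			j = 0
		}
		if j > n-1 {
			j = n - 1
		}
		dst[i] = src[j]
	}
}

func main() {
	src := []float32{1, 2, 3, 4}
	dst := make([]float32, len(src))
	shift1D(dst, src, 1)
	fmt.Println(dst) // prints [1 1 2 3]
}
```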

func Sum ¶

func Sum(in *data.Slice) float32

Sum of all elements.

func Zero ¶

func Zero(s *data.Slice)

Set all elements of all components to zero.

Types ¶

type Bytes ¶

type Bytes struct {
	Ptr unsafe.Pointer
	Len int
}

3D byte slice, used for region lookup.

func NewBytes ¶

func NewBytes(m *data.Mesh) *Bytes

Construct new 3D byte slice for given mesh.

func (*Bytes) Free ¶

func (b *Bytes) Free()

Frees the GPU memory and disables the slice.

func (*Bytes) Upload ¶

func (dst *Bytes) Upload(src []byte)

Upload src (host) to dst (GPU).

type DemagConvolution ¶

type DemagConvolution struct {
	FFTMesh data.Mesh // mesh of FFT m
	// contains filtered or unexported fields
}

Stores the necessary state to perform FFT-accelerated convolution with magnetostatic kernel (or other kernel of same symmetry).

func NewDemag ¶

func NewDemag(mesh *data.Mesh) *DemagConvolution

Initializes a convolution to evaluate the demag field for the given mesh geometry.

func (*DemagConvolution) Exec ¶

func (c *DemagConvolution) Exec(B, m, vol *data.Slice, Bsat LUTPtr, regions *Bytes)

Calculate the demag field of m * vol * Bsat, store result in B.

m:    magnetization normalized to unit length
vol:  unitless mask used to scale m's length, may be nil
Bsat: saturation magnetization in Tesla
B:    resulting demag field, in Tesla

func (*DemagConvolution) FFT ¶

func (c *DemagConvolution) FFT(m, vol *data.Slice, comp int, Bsat LUTPtr, regions *Bytes) *data.Slice

Forward FFT of one magnetization component. The returned slice is valid until the next FFT or convolution.

type Heun ¶

type Heun struct {
	// contains filtered or unexported fields
}

Adaptive Heun solver for vectors.

func NewHeun ¶

func NewHeun(y *data.Slice, torqueFn, postStep func(*data.Slice), dt, multiplier float64, time *float64) *Heun

func (*Heun) Step ¶

func (e *Heun) Step()

Take one time step.
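
For orientation, one step of Heun's method for a scalar equation y' = f(y) is sketched below, together with a simple error estimate of the kind an adaptive solver can use to adjust dt. This is the textbook scheme, not this package's step-size rule, and `heunStep` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// heunStep advances y' = f(y) by one step of size dt with Heun's method.
// It returns the new value and the difference between the Euler and Heun
// results, which serves as an error estimate.
func heunStep(f func(float64) float64, y, dt float64) (yNew, errEst float64) {
	k1 := f(y)         // slope at the start of the step
	k2 := f(y + dt*k1) // slope at the Euler-predicted end point
	yNew = y + 0.5*dt*(k1+k2)
	errEst = 0.5 * dt * math.Abs(k2-k1)
	return yNew, errEst
}

func main() {
	decay := func(y float64) float64 { return -y }
	y, e := heunStep(decay, 1, 0.5)
	fmt.Println(y, e) // prints 0.625 0.125
}
```

An adaptive scheme would typically compare such an estimate against a tolerance and shrink or grow dt for the next step.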

type LUTPtr ¶

type LUTPtr unsafe.Pointer // points to 256 float32's

type LUTPtrs ¶

type LUTPtrs []unsafe.Pointer // elements point to 256 float32's

type SymmLUT ¶

type SymmLUT unsafe.Pointer // points to 256x256 symmetric matrix, only lower half stored
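
Storing only the lower half of a symmetric matrix needs an index mapping. The sketch below assumes rows are packed one after another, with row i holding columns 0 through i; the actual packing order is not stated here, and `symmIndex` and `nRegion` are hypothetical names.

```go
package main

import "fmt"

const nRegion = 256 // number of regions, matching the 256x256 matrix

// symmIndex maps (i, j) to the offset of that entry in a packed lower
// triangle. Assumption: row i stores columns j = 0..i contiguously.
func symmIndex(i, j int) int {
	if j > i {
		i, j = j, i // symmetry: (i, j) and (j, i) share one entry
	}
	return i*(i+1)/2 + j
}

func main() {
	fmt.Println(symmIndex(2, 1), symmIndex(1, 2)) // prints 4 4
	fmt.Println(nRegion * (nRegion + 1) / 2)      // prints 32896 stored entries
}
```

Under this packing the table holds 32896 values instead of 65536.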
