cuda

package
v3.3.1+incompatible
Published: Jan 13, 2014 License: GPL-3.0 Imports: 13 Imported by: 112

Documentation

Overview

Low-level CUDA functionality. Inside this package, all data is stored in ZYX coordinates, i.e., our X axis (component 0) is perpendicular to the thin film and usually has a small number of cells (or even just one). Z is usually the longest axis. This annoying convention is imposed by the CUDA FFT data layout.
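The coordinate convention above can be pictured with a flat-index helper. This is a minimal sketch, assuming a contiguous array in which the last index varies fastest; the name flatIndex and the exact layout are illustrative assumptions, not taken from the package source.

```go
package main

import "fmt"

// flatIndex maps cell (i, j, k) to an offset in a flat array of
// n[0]*n[1]*n[2] elements. The last index varies fastest (assumed layout).
func flatIndex(n [3]int, i, j, k int) int {
	return (i*n[1]+j)*n[2] + k
}

func main() {
	// Thin film: a single cell along component 0, long last axis.
	n := [3]int{1, 64, 256}
	fmt.Println(flatIndex(n, 0, 2, 5))
}
```

With such a layout, neighbouring cells along the last (longest) axis sit next to each other in memory.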


Constants

const CONV_TOLERANCE = 1e-6

Maximum tolerable error on demag convolution self-test.

const DEFAULT_KERNEL_ACC = 6

Default accuracy setting for demag kernel.

const FFT_IMAG_TOLERANCE = 1e-5

Maximum tolerable ratio of imaginary to real part for the demag kernel in Fourier space. Ensures the kernel has the correct symmetry.

const REDUCE_BLOCKSIZE = C.REDUCE_BLOCKSIZE

Block size for reduce kernels.

Variables

var (
	Version  float32
	DevName  string
	TotalMem int64
)
var (
	BlockSize    = 512
	TileX, TileY = 32, 32
	MaxGridSize  = 65535
)

CUDA Launch parameters. TODO: optimize?
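To illustrate how launch parameters like these are typically used, here is a minimal sketch that turns an element count into a grid size. The helper names divUp and grid1D are hypothetical and the package may compute its launch configuration differently; the constants simply mirror BlockSize and MaxGridSize.

```go
package main

import "fmt"

const (
	blockSize   = 512   // mirrors BlockSize
	maxGridSize = 65535 // mirrors MaxGridSize
)

// divUp returns ceil(x/y) for positive integers.
func divUp(x, y int) int { return (x + y - 1) / y }

// grid1D returns grid dimensions (gx, gy) such that gx*gy*blockSize >= n
// while keeping both dimensions within maxGridSize.
func grid1D(n int) (gx, gy int) {
	if n <= 0 {
		return 0, 0
	}
	blocks := divUp(n, blockSize)
	gy = divUp(blocks, maxGridSize) // spill into a second grid dimension
	gx = divUp(blocks, gy)
	return gx, gy
}

func main() {
	gx, gy := grid1D(1 << 20)
	fmt.Println(gx, gy)
}
```

A kernel launched this way has to bounds-check its global index, since the grid may cover slightly more than n elements.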

Functions

func AddCubicAnisotropy

func AddCubicAnisotropy(Beff, m *data.Slice, k1_red LUTPtr, c1, c2 LUTPtrs, regions *Bytes, str int)

func AddDMI

func AddDMI(Beff *data.Slice, m *data.Slice, D_red0, D_red1, D_red2, A_red float32, str int)

Add the effective field of the Dzyaloshinskii-Moriya interaction to Beff (Tesla), following Bogdanov and Rößler, PRL 87, 3, 2001, eq. 8 (out-of-plane symmetry breaking). See dmi.cu.

func AddExchange

func AddExchange(B, m *data.Slice, Aex_red SymmLUT, regions *Bytes, str int)

Add exchange field to Beff.

m: normalized magnetization
B: effective field in Tesla
Aex_red: 2*Aex / (Msat * 1e18 m2)

func AddSlonczewskiTorque

func AddSlonczewskiTorque(torque, m, J *data.Slice, fixedP LUTPtrs, Msat, alpha, pol, λ, ε_prime LUTPtr, regions *Bytes)

func AddTemperature

func AddTemperature(Beff, noise *data.Slice, temp_red LUTPtr, kmu0_VgammaDt float64, regions *Bytes, str int)

func AddUniaxialAnisotropy

func AddUniaxialAnisotropy(Beff, m *data.Slice, k1_red LUTPtr, u LUTPtrs, regions *Bytes, str int)

Add uniaxial magnetocrystalline anisotropy field to Beff.

func AddZhangLiTorque

func AddZhangLiTorque(torque, m, jpol *data.Slice, bsat, alpha, xi, pol LUTPtr, regions *Bytes)

func Buffer

func Buffer(nComp int, m *data.Mesh) *data.Slice

Returns a GPU slice for temporary use. It should be returned to the pool with Recycle.

func Dot

func Dot(a, b *data.Slice) float32

Dot product.

func GPUCopy

func GPUCopy(in *data.Slice) *data.Slice

Returns a copy of in, allocated on GPU.

func GetCell

func GetCell(s *data.Slice, comp, i, j, k int) float32

func GetElem

func GetElem(s *data.Slice, comp int, index int) float32

func Init

func Init(gpu int, sched string)

func LLTorque

func LLTorque(torque, m, B *data.Slice, alpha LUTPtr, regions *Bytes)

Landau-Lifshitz torque divided by gamma0:

  -1/(1+α²) [ m x B + α m x (m x B) ]

torque: in Tesla
m: normalized
B: in Tesla

func LockThread

func LockThread()

LockThread locks the current goroutine to an OS thread and sets the CUDA context for that thread. To be called by every fresh goroutine that will use CUDA.
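The reason for pinning is that a CUDA context is bound to an OS thread, while the Go scheduler is otherwise free to move goroutines between threads. The pattern can be sketched with the standard runtime.LockOSThread call; setContext and runPinned are hypothetical stand-ins, not functions of this package.

```go
package main

import (
	"fmt"
	"runtime"
)

// setContext is a hypothetical placeholder for binding a CUDA context
// to the calling OS thread.
func setContext() {}

// runPinned runs work on a fresh goroutine that stays on one OS thread.
func runPinned(work func() int) int {
	done := make(chan int)
	go func() {
		runtime.LockOSThread() // pin this goroutine to its current thread
		setContext()           // thread-local state now stays valid for work
		done <- work()
	}()
	return <-done
}

func main() {
	fmt.Println(runPinned(func() int { return 42 }))
}
```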

func Madd2

func Madd2(dst, src1, src2 *data.Slice, factor1, factor2 float32)

multiply-add: dst[i] = src1[i] * factor1 + src2[i] * factor2

func Madd3

func Madd3(dst, src1, src2, src3 *data.Slice, factor1, factor2, factor3 float32)

multiply-add: dst[i] = src1[i] * factor1 + src2[i] * factor2 + src3[i] * factor3
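As a CPU reference for the element-wise semantics of Madd2 and Madd3, the following sketch operates on plain Go slices instead of GPU slices. The lowercase function names are illustrative and are not the package's implementation.

```go
package main

import "fmt"

// madd2 computes dst[i] = src1[i]*f1 + src2[i]*f2.
func madd2(dst, src1, src2 []float32, f1, f2 float32) {
	for i := range dst {
		dst[i] = src1[i]*f1 + src2[i]*f2
	}
}

// madd3 computes dst[i] = src1[i]*f1 + src2[i]*f2 + src3[i]*f3.
func madd3(dst, src1, src2, src3 []float32, f1, f2, f3 float32) {
	for i := range dst {
		dst[i] = src1[i]*f1 + src2[i]*f2 + src3[i]*f3
	}
}

func main() {
	a := []float32{1, 2}
	b := []float32{3, 4}
	c := []float32{5, 6}
	dst := make([]float32, 2)
	madd2(dst, a, b, 2, 0.5)
	fmt.Println(dst)
	madd3(dst, a, b, c, 1, 1, 1)
	fmt.Println(dst)
}
```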

func MaxAbs

func MaxAbs(in *data.Slice) float32

Maximum of absolute values of all elements.

func MaxVecDiff

func MaxVecDiff(x, y *data.Slice) float64

Maximum of the norms of the difference between all vectors (x1,y1,z1) and (x2,y2,z2)

(dx, dy, dz) = (x1, y1, z1) - (x2, y2, z2)
max_i sqrt( dx[i]*dx[i] + dy[i]*dy[i] + dz[i]*dz[i] )
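A CPU reference for this reduction might look as follows. It is a sketch on plain slices, assuming each vector field is stored as three component arrays of equal length; maxVecDiff is an illustrative name.

```go
package main

import (
	"fmt"
	"math"
)

// maxVecDiff returns the largest Euclidean norm of a[i] - b[i], where each
// field is stored as three component arrays.
func maxVecDiff(a, b [3][]float32) float64 {
	best := 0.0
	for i := range a[0] {
		var sum float64
		for c := 0; c < 3; c++ {
			d := float64(a[c][i]) - float64(b[c][i])
			sum += d * d
		}
		best = math.Max(best, math.Sqrt(sum))
	}
	return best
}

func main() {
	a := [3][]float32{{3, 0}, {4, 0}, {0, 0}}
	b := [3][]float32{{0, 0}, {0, 0}, {0, 0}}
	fmt.Println(maxVecDiff(a, b))
}
```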

func MaxVecNorm

func MaxVecNorm(v *data.Slice) float64

Maximum of the norms of all vectors (x[i], y[i], z[i]).

max_i sqrt( x[i]*x[i] + y[i]*y[i] + z[i]*z[i] )
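Since the square root is monotonic, one way to evaluate this is to reduce over squared norms and take a single square root at the end. The sketch below does that on plain slices; it illustrates the formula only and makes no claim about how the GPU kernel is organised.

```go
package main

import (
	"fmt"
	"math"
)

// maxVecNorm returns max_i sqrt(x[i]² + y[i]² + z[i]²), reducing over
// squared norms and taking one square root at the end.
func maxVecNorm(v [3][]float32) float64 {
	maxSq := 0.0
	for i := range v[0] {
		x, y, z := float64(v[0][i]), float64(v[1][i]), float64(v[2][i])
		if sq := x*x + y*y + z*z; sq > maxSq {
			maxSq = sq
		}
	}
	return math.Sqrt(maxSq)
}

func main() {
	v := [3][]float32{{1, 3}, {2, 4}, {2, 0}}
	fmt.Println(maxVecNorm(v))
}
```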

func MemAlloc

func MemAlloc(bytes int64) unsafe.Pointer

Wrapper for cu.MemAlloc, fatal exit on out of memory.

func Memset

func Memset(s *data.Slice, val ...float32)

Memset sets the Slice's components to the specified values.
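The variadic signature suggests one value per component. A CPU sketch of that reading, with the assumption that the number of values must equal the number of components, could be:

```go
package main

import "fmt"

// memset fills component c of s with val[c].
// Assumption: exactly one value per component is required.
func memset(s [][]float32, val ...float32) {
	if len(val) != len(s) {
		panic("memset: need exactly one value per component")
	}
	for c, comp := range s {
		for i := range comp {
			comp[i] = val[c]
		}
	}
}

func main() {
	s := [][]float32{make([]float32, 4), make([]float32, 4), make([]float32, 4)}
	memset(s, 1, 0, 0) // uniform vector field pointing along component 0
	fmt.Println(s[0], s[1], s[2])
}
```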

func Mul

func Mul(dst, a, b *data.Slice)

multiply: dst[i] = a[i] * b[i]

func NewSlice

func NewSlice(nComp int, m *data.Mesh) *data.Slice

Make a GPU Slice with nComp components, each sized for mesh m.

func NewUnifiedSlice

func NewUnifiedSlice(nComp int, m *data.Mesh) *data.Slice

Make a Slice in unified memory with nComp components, each sized for mesh m.

func Normalize

func Normalize(vec, vol *data.Slice)

Normalize vec to unit length, unless length or vol are zero.
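A CPU sketch of this behaviour is given below. Two points are assumptions rather than documented facts: a nil vol is treated as "no mask", and cells with zero length or zero vol are left unchanged.

```go
package main

import (
	"fmt"
	"math"
)

// normalize scales each vector of vec to unit length.
// Assumptions: vol may be nil, and cells with zero length or zero vol
// are skipped and left unchanged.
func normalize(vec [3][]float32, vol []float32) {
	for i := range vec[0] {
		if vol != nil && vol[i] == 0 {
			continue
		}
		x, y, z := float64(vec[0][i]), float64(vec[1][i]), float64(vec[2][i])
		norm := math.Sqrt(x*x + y*y + z*z)
		if norm == 0 {
			continue
		}
		vec[0][i] = float32(x / norm)
		vec[1][i] = float32(y / norm)
		vec[2][i] = float32(z / norm)
	}
}

func main() {
	v := [3][]float32{{3, 0}, {4, 0}, {0, 0}}
	normalize(v, nil)
	fmt.Println(v)
}
```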

func Recycle

func Recycle(s *data.Slice)

Returns a buffer obtained from Buffer to the pool.
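The Buffer/Recycle pair follows the usual buffer-pool pattern: temporary storage is reused instead of being allocated and freed on every call. The sketch below shows the idea on host memory; the pool type and its methods are hypothetical and do not reflect the package's internal bookkeeping.

```go
package main

import "fmt"

// pool keeps recycled buffers, keyed by length, for later reuse.
type pool struct {
	free map[int][][]float32
}

func newPool() *pool { return &pool{free: make(map[int][][]float32)} }

// get returns a recycled buffer of length n if one is available,
// otherwise it allocates a new one.
func (p *pool) get(n int) []float32 {
	if stack := p.free[n]; len(stack) > 0 {
		buf := stack[len(stack)-1]
		p.free[n] = stack[:len(stack)-1]
		return buf
	}
	return make([]float32, n)
}

// recycle hands buf back to the pool. Its contents are not cleared.
func (p *pool) recycle(buf []float32) {
	p.free[len(buf)] = append(p.free[len(buf)], buf)
}

func main() {
	p := newPool()
	a := p.get(8)
	p.recycle(a)
	b := p.get(8) // reuses the storage of a
	fmt.Println(&a[0] == &b[0])
}
```

In this sketch a recycled buffer keeps its old contents, so a caller should not assume it is zeroed.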

func RegionAddV

func RegionAddV(dst *data.Slice, lut LUTPtrs, regions *Bytes, str int)

dst += LUT[region], for vectors. Used for complex excitation.

func RegionDecode

func RegionDecode(dst *data.Slice, lut LUTPtr, regions *Bytes)

decode the regions+LUT pair into an uncompressed array
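The regions+LUT encoding stores one byte per cell plus a 256-entry table per parameter. Decoding is then a table lookup per cell, as in this CPU sketch (regionDecode is an illustrative name operating on host slices):

```go
package main

import "fmt"

// regionDecode expands a (regions, lut) pair into one value per cell:
// dst[i] = lut[regions[i]].
func regionDecode(dst []float32, lut *[256]float32, regions []byte) {
	for i, r := range regions {
		dst[i] = lut[r]
	}
}

func main() {
	var lut [256]float32
	lut[0], lut[1] = 0.5, 2.0 // parameter value for regions 0 and 1
	regions := []byte{0, 1, 1, 0}
	dst := make([]float32, len(regions))
	regionDecode(dst, &lut, regions)
	fmt.Println(dst)
}
```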

func RegionSelect

func RegionSelect(dst, src *data.Slice, regions *Bytes, region byte)

Select the part of src within the specified region; set zeros everywhere else.
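On the CPU the same masking can be written as a single loop. This sketch handles one component on host slices; regionSelect is an illustrative name.

```go
package main

import "fmt"

// regionSelect copies src[i] to dst[i] where regions[i] == region and
// writes 0 everywhere else.
func regionSelect(dst, src []float32, regions []byte, region byte) {
	for i := range dst {
		if regions[i] == region {
			dst[i] = src[i]
		} else {
			dst[i] = 0
		}
	}
}

func main() {
	src := []float32{1, 2, 3, 4}
	regions := []byte{0, 1, 1, 2}
	dst := make([]float32, len(src))
	regionSelect(dst, src, regions, 1)
	fmt.Println(dst)
}
```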

func Resize

func Resize(dst, src *data.Slice, layer int)

Select and resize one layer for interactive output

func SetCell

func SetCell(s *data.Slice, comp int, i, j, k int, value float32)

func SetElem

func SetElem(s *data.Slice, comp int, index int, value float32)

func Shift

func Shift(dst, src *data.Slice, shift [3]int)

Copy src to dst, shifting data by the given number of cells. Off-boundary values are clamped. Used, e.g., to make the simulation window follow interesting features.
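A one-dimensional CPU sketch of a clamped shift is shown below, writing into dst from src as the parameter order suggests. Two details are assumptions: a positive shift moves data toward higher indices, and "clamped" means that cells shifted in from outside repeat the nearest edge value.

```go
package main

import "fmt"

// shift1D writes dst[i] = src[clamp(i-shift)]. Both slices must have the
// same non-zero length. Out-of-range reads repeat the edge value (assumed).
func shift1D(dst, src []float32, shift int) {
	n := len(src)
	for i := range dst {
		j := i - shift
		if j < 0 {
			j = 0
		}
		if j >= n {
			j = n - 1
		}
		dst[i] = src[j]
	}
}

func main() {
	src := []float32{1, 2, 3, 4}
	dst := make([]float32, len(src))
	shift1D(dst, src, 1)
	fmt.Println(dst)
}
```

The three-dimensional case applies the same clamping independently along each axis of the shift vector.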

func ShiftBytes

func ShiftBytes(dst, src *Bytes, m *data.Mesh, shift [3]int)

Like Shift, but for bytes

func Sum

func Sum(in *data.Slice) float32

Sum of all elements.

func Sync

func Sync(str int)

func SyncAll

func SyncAll()

func Zero

func Zero(s *data.Slice)

Set all elements of all components to zero.

Types

type Bytes

type Bytes struct {
	Ptr unsafe.Pointer
	Len int
}

3D byte slice, used for region lookup.

func NewBytes

func NewBytes(m *data.Mesh) *Bytes

Construct new 3D byte slice for given mesh.

func (*Bytes) Copy

func (dst *Bytes) Copy(src *Bytes)

func (*Bytes) Free

func (b *Bytes) Free()

Frees the GPU memory and disables the slice.

func (*Bytes) Upload

func (dst *Bytes) Upload(src []byte)

Upload src (host) to dst (gpu)

type DemagConvolution

type DemagConvolution struct {
	FFTMesh data.Mesh // mesh of FFT m
	// contains filtered or unexported fields
}

Stores the necessary state to perform FFT-accelerated convolution with magnetostatic kernel (or other kernel of same symmetry).

func NewDemag

func NewDemag(mesh *data.Mesh) *DemagConvolution

Initializes a convolution to evaluate the demag field for the given mesh geometry.

func (*DemagConvolution) Exec

func (c *DemagConvolution) Exec(B, m, vol *data.Slice, Bsat LUTPtr, regions *Bytes)

Calculate the demag field of m * vol * Bsat, store result in B.

m:    magnetization normalized to unit length
vol:  unitless mask used to scale m's length, may be nil
Bsat: saturation magnetization in Tesla
B:    resulting demag field, in Tesla

func (*DemagConvolution) FFT

func (c *DemagConvolution) FFT(m, vol *data.Slice, comp int, Bsat LUTPtr, regions *Bytes) *data.Slice

Forward FFT of one magnetization component. The returned slice is valid until the next FFT or convolution.

type LUTPtr

type LUTPtr unsafe.Pointer // points to 256 float32's

type LUTPtrs

type LUTPtrs []unsafe.Pointer // elements point to 256 float32's

type SymmLUT

type SymmLUT unsafe.Pointer // points to 256x256 symmetric matrix, only lower half stored
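Storing only the lower half of a symmetric 256x256 matrix needs 256*257/2 = 32896 entries instead of 65536. One common packing, assumed here for illustration and not confirmed by this page, stores the lower triangle row by row:

```go
package main

import "fmt"

// symIdx maps (i, j) of a symmetric matrix to its offset in a packed
// lower triangle stored row by row (assumed packing order).
func symIdx(i, j int) int {
	if j > i {
		i, j = j, i // use symmetry: element (i,j) equals element (j,i)
	}
	return i*(i+1)/2 + j
}

func main() {
	const nRegions = 256
	packed := make([]float32, nRegions*(nRegions+1)/2)
	packed[symIdx(3, 7)] = 1.5 // coupling between regions 3 and 7
	fmt.Println(packed[symIdx(7, 3)], len(packed))
}
```

Swapping the two region indices yields the same entry, which is what an inter-region quantity such as an exchange coupling requires.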
