package gpu

Version: v0.0.0-...-a20b597 | Published: Feb 17, 2015 | License: GPL-3.0 | Imports: 9 | Imported by: 0

Repository

github.com/godsic/hotspin

Documentation ¶

Overview ¶

Package gpu provides multi-GPU primitives such as array allocation and copying.

3D Array indexing.

Internal dimensions are labeled (I,J,K), I being the outermost dimension, K the innermost. A typical loop reads:

for i:=0; i<N0; i++{
	for j:=0; j<N1; j++{
		for k:=0; k<N2; k++{
			...
		}
	}
}

I may be a small dimension, but K should preferably be large and alignable in CUDA memory.

The underlying contiguous storage is indexed as:

index := i*N1*N2 + j*N2 + k

Note that the "internal" (I,J,K) dimensions correspond to the "user" dimensions (Z,Y,X), in that order. Z is typically the smallest dimension, such as the thickness.
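The indexing rule above can be sketched in plain Go, without any GPU. The helper names flatIndex and userToInternal and the sizes used are hypothetical and are not part of this package.

```go
package main

import "fmt"

// flatIndex maps internal (i, j, k) coordinates to an offset in contiguous
// storage of size N0*N1*N2, with k varying fastest.
func flatIndex(i, j, k, n1, n2 int) int {
	return i*n1*n2 + j*n2 + k
}

// userToInternal reorders user-facing (x, y, z) coordinates into the
// internal (i, j, k) order, following the (I,J,K) = (Z,Y,X) correspondence.
func userToInternal(x, y, z int) (i, j, k int) {
	return z, y, x
}

func main() {
	// Hypothetical sizes: thin in Z (N0), wide in X (N2).
	const n0, n1, n2 = 2, 3, 4
	i, j, k := userToInternal(3, 2, 1) // last cell in user coordinates
	fmt.Println("cells:", n0*n1*n2, "offset:", flatIndex(i, j, k, n1, n2))
}
```

Because k is the fastest-varying index, neighbouring X cells are adjacent in memory, which is why K benefits from being large and aligned.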

Slicing the geometry over multiple GPUs ¶

The geometry is sliced across the GPUs along the J-direction.

Index ¶

Constants
Variables
func Add(dst, a, b *Array)
func AddMadd(dst, a, b, c *Array, mul float64)
func BrillouinAsync(msat0 *Array, msat0T0 *Array, T *Array, Tc *Array, S *Array, msat0Mul float64, ...)
func CMaddAsync(dst *Array, scale complex128, kern, src *Array, stream Stream)
func CopyPad3D(dst, src *Array)
func CopyPad3DAsync(dst, src *Array)
func CopyUnPad3D(dst, src *Array)
func CpAsync(cp *Array, T *Array, Td *Array, n *Array, TdMul float64, stream Stream)
func Div(dst, a, b *Array)
func DivMulPow(dst, a, b, c *Array, p float64)
func Dot(dst, a, b *Array)
func DotMask(dst, a, b *Array, aMul, bMul []float64)
func DotSign(dst, a, b, c *Array)
func EnergyFlowAsync(w *Array, m *Array, R *Array, Tc *Array, S *Array, n *Array, SMul float64, ...)
func Exchange6Async(h, m, msat0T0, lex *Array, msat0T0Mul float64, lexMul2_cellSize2 []float64, ...)
func FFTNormLogic(logicSize []int) int
func FFTOutputSize(logicSize []int) []int
func InitGPU(device int, flags uint)
func KappaAsync(kappa *Array, msat0 *Array, msat0T0 *Array, T *Array, Tc *Array, S *Array, ...)
func LLBarLocal00NC(t *Array, h *Array, msat0T0 *Array, lambda *Array, lambdaMul []float64)
func LLBarLocal02C(t *Array, m *Array, h *Array, msat0T0 *Array, mu *Array, muMul []float64)
func LLBarLocal02NC(t *Array, m *Array, h *Array, msat0T0 *Array, mu *Array, muMul []float64)
func LLBarNonlocal00NC(t *Array, h *Array, msat0T0 *Array, lambda_e *Array, lambda_eMul []float64, ...)
func LLBarTorqueAsync(t *Array, M *Array, h *Array, msat0T0 *Array)
func LinearCombination2Async(dst *Array, a *Array, mulA float64, b *Array, mulB float64, stream Stream)
func LinearCombination3(dst *Array, a *Array, mulA float64, b *Array, mulB float64, c *Array, ...)
func LinearCombination3Async(dst *Array, a *Array, mulA float64, b *Array, mulB float64, c *Array, ...)
func LongFieldAsync(hlf *Array, m *Array, msat0T0 *Array, J *Array, n *Array, Tc *Array, Ts *Array, ...)
func MAdd1Async(a, b *Array, mulB float64, stream Stream)
func MAdd2Async(a, b *Array, mulB float64, c *Array, mulC float64, stream Stream)
func Madd(dst, a, b *Array, mulB float64)
func Mul(dst, a, b *Array)
func Normalize(m *Array)
func PartialMax(in, out *Array, blocks, threadsPerBlock, N int)
func PartialMaxAbs(in, out *Array, blocks, threadsPerBlock, N int)
func PartialMaxDiff(a, b, out *Array, blocks, threadsPerBlock, N int)
func PartialMaxNorm3Sq(x, y, z, out *Array, blocks, threadsPerBlock, N int)
func PartialMaxNorm3SqDiff(x1, y1, z1, x2, y2, z2, out *Array, blocks, threadsPerBlock, N int)
func PartialMaxSum(a, b, out *Array, blocks, threadsPerBlock, N int)
func PartialMin(in, out *Array, blocks, threadsPerBlock, N int)
func PartialSDot(in1, in2, out *Array, blocks, threadsPerBlock, N int)
func PartialSum(in, out *Array, blocks, threadsPerBlock, N int)
func Qinter_async(Qi *Array, Ti *Array, Tj *Array, Gij *Array, GijMul []float64, stream Stream)
func Qspat_async(Q *Array, T *Array, k *Array, kMul []float64, cs []float64, pbc []int)
func ScaleNoiseAniz(h, mu, T, msat0T0 *Array, muMul []float64, ...)
func SetDefaultFFT(name string)
func TensSYMMVecMul(...)
func TsSync(Ts *Array, msat *Array, msat0T0 *Array, Tc *Array, S *Array, msatMul float64, ...)
func UniaxialAnisotropyAsync(h, m *Array, KuMask, MsatMask *Array, Ku2_Mu0MSat float64, anisUMask *Array, ...)
func VecMadd(dst, a, b *Array, mulB []float64)
func WeightedAverage(dst, x0, x1, w0, w1, R *Array, w0Mul, w1Mul, RMul float64)
func ZeroArrayAsync(A *Array, stream Stream)
type Array
- func NewArray(components int, size3D []int) *Array
- func NilArray(components int, size3D []int) *Array
- func (a *Array) Alloc()
- func (a *Array) Assign(other *Array)
- func (a *Array) Component(i int) *Array
- func (dst *Array) CopyFromDevice(src *Array)
- func (dst *Array) CopyFromHost(src *host.Array)
- func (src *Array) CopyToHost(dst *host.Array)
- func (a *Array) DevicePtr() cu.DevicePtr
- func (v *Array) Free()
- func (b *Array) Get(comp, x, y, z int) float64
- func (a *Array) Init(components int, size3D []int, alloc bool)
- func (a *Array) IsNil() bool
- func (a *Array) Len() int
- func (src *Array) LocalCopy() *host.Array
- func (a *Array) NComp() int
- func (a *Array) PartLen3D() int
- func (a *Array) PartLen4D() int
- func (a *Array) PartSize() []int
- func (shared *Array) PointTo(original *Array, offset int)
- func (a *Array) Pointer() cu.DevicePtr
- func (b *Array) Set(comp, x, y, z int, value float64)
- func (a *Array) Size3D() []int
- func (a *Array) Size4D() []int
- func (a *Array) String() string
- func (a *Array) Zero()
type FFTInterface
- func NewFFTPlanX(dataSize, logicSize []int) FFTInterface
type FFTPlanX
- func (fft *FFTPlanX) Forward(in, out *Array)
- func (fft *FFTPlanX) Free()
- func (fft *FFTPlanX) Inverse(in, out *Array)
type Reductor
- func NewReductor(nComp int, size []int) *Reductor
- func (r *Reductor) Dot(in1, in2 *Array) float64
- func (r *Reductor) Free()
- func (r *Reductor) Init(nComp int, size []int)
- func (r *Reductor) Max(in *Array) float64
- func (r *Reductor) MaxAbs(in *Array) float64
- func (r *Reductor) MaxDiff(a, b *Array) float64
- func (r *Reductor) MaxNorm(a *Array) float64
- func (r *Reductor) MaxNormDiff(a, b *Array) float64
- func (r *Reductor) MaxSum(a, b *Array) float64
- func (r *Reductor) Min(in *Array) float64
- func (r *Reductor) Sum(in *Array) float64
type Stream
- func NewStream() Stream
- func (s Stream) Destroy()
- func (s Stream) Ready() (ready bool)
- func (s Stream) Sync()

Constants ¶

const (
	DO_ALLOC   = true
	DONT_ALLOC = false
)

Parameters for Array.Init()

const (
	MSG_BADDEVICEID       = "Invalid device ID: "
	MSG_DEVICEUNINITIATED = "Device list not initiated"
)

Error message

const ERR_UNIFIED_ADDR = "A GPU does not support unified addressing and can not be used in a multi-GPU setup."

Error message

const MSG_ARRAY_SIZE_MISMATCH = "array size mismatch"

Error message.

Variables ¶

var NewDefaultFFT func(dataSize, logicSize []int) FFTInterface = NewFFTPlanX // this default is for tests, not sims.

The default FFT constructor. The function pointer may be changed to use a different FFT implementation globally.

Functions ¶

func Add ¶

func Add(dst, a, b *Array)

Adds 2 multi-GPU arrays: dst = a + b

func AddMadd ¶

func AddMadd(dst, a, b, c *Array, mul float64)

Multiply-add: dst = a + mul * (b + c). b may NOT contain NULL pointers!

func BrillouinAsync ¶

func BrillouinAsync(msat0 *Array, msat0T0 *Array, T *Array, Tc *Array, S *Array, msat0Mul float64, msat0T0Mul float64, TcMul float64, SMul float64, stream Stream)

func CMaddAsync ¶

func CMaddAsync(dst *Array, scale complex128, kern, src *Array, stream Stream)

Complex multiply-add. dst and src contain complex numbers (interleaved format); kern contains real numbers.

dst[i] += scale * kern[i] * src[i]
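A CPU-only sketch of the formula above, assuming that "interleaved format" means real and imaginary parts stored in adjacent float64 slots, and that kern holds one real value per complex element. The function cmadd is a hypothetical stand-in, not the CUDA kernel behind CMaddAsync.

```go
package main

import "fmt"

// cmadd performs dst[i] += scale * kern[i] * src[i] for complex dst and src
// stored as interleaved (re, im) pairs, and a real-valued kern.
func cmadd(dst []float64, scale complex128, kern, src []float64) {
	for i := range kern {
		s := complex(src[2*i], src[2*i+1])
		d := scale * complex(kern[i], 0) * s
		dst[2*i] += real(d)
		dst[2*i+1] += imag(d)
	}
}

func main() {
	dst := []float64{0, 0}     // one complex number, initially 0+0i
	kern := []float64{2}       // one real kernel value
	src := []float64{3, 4}     // 3+4i
	cmadd(dst, complex(0, 1), kern, src)
	fmt.Println(dst) // real and imaginary parts of i * 2 * (3+4i)
}
```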

func CopyPad3D ¶

func CopyPad3D(dst, src *Array)

Pads a 3D matrix; only to be used when Ndev=1. Copies from src to dst, which may have a different size3D. If dst is smaller, the src input is cropped to fit. If dst is larger, the src input is padded with zeros to fit.
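The crop-or-pad behaviour can be illustrated on the host with flat slices. This is a sketch under the assumption that both arrays use the (I,J,K) row-major layout described in the overview; copyPad3D below is a hypothetical helper, not the package function.

```go
package main

import "fmt"

// copyPad3D copies src (of size srcSize) into dst (of size dstSize),
// cropping where dst is smaller and zero-filling where dst is larger.
func copyPad3D(dst []float64, dstSize [3]int, src []float64, srcSize [3]int) {
	for i := 0; i < dstSize[0]; i++ {
		for j := 0; j < dstSize[1]; j++ {
			for k := 0; k < dstSize[2]; k++ {
				d := (i*dstSize[1]+j)*dstSize[2] + k
				if i < srcSize[0] && j < srcSize[1] && k < srcSize[2] {
					dst[d] = src[(i*srcSize[1]+j)*srcSize[2]+k]
				} else {
					dst[d] = 0 // outside the source: pad with zero
				}
			}
		}
	}
}

func main() {
	src := []float64{1, 2, 3, 4} // a 1x2x2 block
	padded := make([]float64, 9) // a 1x3x3 block
	copyPad3D(padded, [3]int{1, 3, 3}, src, [3]int{1, 2, 2})
	fmt.Println(padded)
}
```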

func CopyPad3DAsync ¶

func CopyPad3DAsync(dst, src *Array)

func CopyUnPad3D ¶

func CopyUnPad3D(dst, src *Array)

func CpAsync ¶

func CpAsync(cp *Array, T *Array, Td *Array, n *Array, TdMul float64, stream Stream)

func Div ¶

func Div(dst, a, b *Array)

Divide 2 multi-GPU arrays: dst = a / b; _if_ b = 0 _then_ dst = 0
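The zero-guarded division rule can be written out element by element. The function div below is a hypothetical host-side sketch operating on plain slices rather than on *Array.

```go
package main

import "fmt"

// div computes dst[i] = a[i] / b[i], writing 0 wherever b[i] is 0.
func div(dst, a, b []float64) {
	for i := range dst {
		if b[i] == 0 {
			dst[i] = 0
		} else {
			dst[i] = a[i] / b[i]
		}
	}
}

func main() {
	a := []float64{6, 1}
	b := []float64{3, 0}
	dst := make([]float64, 2)
	div(dst, a, b)
	fmt.Println(dst)
}
```

Guarding against b = 0 avoids Inf and NaN values in cells where the divisor is zero, for example outside the simulated geometry.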

func DivMulPow ¶

func DivMulPow(dst, a, b, c *Array, p float64)

Divide and multiply by an array raised to a power: dst = pow(c, p) * a / b; _if_ b = 0 _then_ dst = a, _if_ c = 0 _then_ dst = 0

func Dot ¶

func Dot(dst, a, b *Array)

Synchronous dot product: C = AiBi (summed over the components i). A and B should not be masks.
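Reading AiBi as a sum over vector components in each cell, the operation can be sketched on the host as follows. The [component][cell] indexing and the name dot3 are assumptions made for illustration only.

```go
package main

import "fmt"

// dot3 writes the per-cell dot product of two 3-component fields into dst.
// a and b are indexed as [component][cell].
func dot3(dst []float64, a, b [3][]float64) {
	for cell := range dst {
		sum := 0.0
		for c := 0; c < 3; c++ {
			sum += a[c][cell] * b[c][cell]
		}
		dst[cell] = sum
	}
}

func main() {
	a := [3][]float64{{1, 0}, {2, 0}, {3, 1}}
	b := [3][]float64{{4, 1}, {5, 1}, {6, 2}}
	dst := make([]float64, 2)
	dot3(dst, a, b)
	fmt.Println(dst)
}
```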

func DotMask ¶

func DotMask(dst, a, b *Array, aMul, bMul []float64)

Synchronous dot product: C = AiBi (summed over the components i). A and B may be masks.

func DotSign ¶

func DotSign(dst, a, b, c *Array)

Synchronous signed dot product: C = sign(BC) * (AB)

func EnergyFlowAsync ¶

func EnergyFlowAsync(w *Array, m *Array, R *Array, Tc *Array, S *Array, n *Array, SMul float64, stream Stream)

func Exchange6Async ¶

func Exchange6Async(h, m, msat0T0, lex *Array, msat0T0Mul float64, lexMul2_cellSize2 []float64, periodic []int, stream Stream)

6-neighbor exchange field. Aex2_mu0Msatmul: 2 * Aex / Mu0 * Msat.multiplier

func FFTNormLogic ¶

func FFTNormLogic(logicSize []int) int

Returns the normalization factor of an FFT with this logic size. (just the product of the sizes)
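Since the factor is just the product of the sizes, it can be sketched in a few lines. The helper fftNormLogic below is a hypothetical re-implementation for illustration, and the logic size used is made up.

```go
package main

import "fmt"

// fftNormLogic returns the product of the entries of logicSize.
func fftNormLogic(logicSize []int) int {
	n := 1
	for _, s := range logicSize {
		n *= s
	}
	return n
}

func main() {
	logicSize := []int{2, 4, 8} // hypothetical logic size
	norm := fftNormLogic(logicSize)
	// With an unnormalized transform, a forward FFT followed by an inverse
	// FFT scales the data by this factor, so results are divided by it.
	fmt.Println("scale results by", 1/float64(norm))
}
```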

func FFTOutputSize ¶

func FFTOutputSize(logicSize []int) []int

Returns the output size of an FFT with given logic size.

func InitGPU ¶

func InitGPU(device int, flags uint)

Sets a list of devices to use.

func KappaAsync ¶

func KappaAsync(kappa *Array, msat0 *Array, msat0T0 *Array, T *Array, Tc *Array, S *Array, n *Array, msat0Mul float64, msat0T0Mul float64, TcMul float64, SMul float64, stream Stream)

func LLBarLocal00NC ¶

func LLBarLocal00NC(t *Array, h *Array, msat0T0 *Array, lambda *Array, lambdaMul []float64)

func LLBarLocal02C ¶

func LLBarLocal02C(t *Array, m *Array, h *Array, msat0T0 *Array, mu *Array, muMul []float64)

func LLBarLocal02NC ¶

func LLBarLocal02NC(t *Array, m *Array, h *Array, msat0T0 *Array, mu *Array, muMul []float64)

func LLBarNonlocal00NC ¶

func LLBarNonlocal00NC(t *Array, h *Array, msat0T0 *Array, lambda_e *Array, lambda_eMul []float64, cellsizeX float64, cellsizeY float64, cellsizeZ float64, pbc []int)

func LLBarTorqueAsync ¶

func LLBarTorqueAsync(t *Array, M *Array, h *Array, msat0T0 *Array)

func LinearCombination2Async ¶

func LinearCombination2Async(dst *Array, a *Array, mulA float64, b *Array, mulB float64, stream Stream)

dst[i] = a[i]*mulA + b[i]*mulB

func LinearCombination3 ¶

func LinearCombination3(dst *Array, a *Array, mulA float64, b *Array, mulB float64, c *Array, mulC float64)

dst[i] = a[i]*mulA + b[i]*mulB + c[i]*mulC
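The same formula on plain slices, as a host-side sketch. The name linearCombination3 and the sample weights are hypothetical; the package function operates on *Array instead.

```go
package main

import "fmt"

// linearCombination3 computes dst[i] = a[i]*mulA + b[i]*mulB + c[i]*mulC.
func linearCombination3(dst, a []float64, mulA float64, b []float64, mulB float64, c []float64, mulC float64) {
	for i := range dst {
		dst[i] = a[i]*mulA + b[i]*mulB + c[i]*mulC
	}
}

func main() {
	a := []float64{1, 2}
	b := []float64{3, 4}
	c := []float64{5, 6}
	dst := make([]float64, 2)
	linearCombination3(dst, a, 2, b, 0.5, c, -1)
	fmt.Println(dst)
}
```

Fusing the three scaled terms into one pass reads each input once, which is the usual motivation for such a kernel in time-stepping schemes.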

func LinearCombination3Async ¶

func LinearCombination3Async(dst *Array, a *Array, mulA float64, b *Array, mulB float64, c *Array, mulC float64, stream Stream)

dst[i] = a[i]*mulA + b[i]*mulB + c[i]*mulC

func LongFieldAsync ¶

func LongFieldAsync(hlf *Array, m *Array, msat0T0 *Array, J *Array, n *Array, Tc *Array, Ts *Array, msat0T0Mul float64, JMul float64, nMul float64, TcMul float64, TsMul float64, stream Stream)

func MAdd1Async ¶

func MAdd1Async(a, b *Array, mulB float64, stream Stream)

Asynchronous multiply-add: a += mulB*b. b may contain NULL pointers, which are treated as all 1's.
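The NULL-pointer convention can be mimicked on the host with a nil slice standing in for a NULL device pointer. This is a sketch of the semantics only; madd1 is a hypothetical name.

```go
package main

import "fmt"

// madd1 computes a[i] += mulB * b[i]. A nil b is treated as an array of
// all 1's, so the update reduces to a[i] += mulB.
func madd1(a, b []float64, mulB float64) {
	for i := range a {
		bi := 1.0
		if b != nil {
			bi = b[i]
		}
		a[i] += mulB * bi
	}
}

func main() {
	a := []float64{1, 2}
	madd1(a, nil, 3) // space-independent term: adds 3 everywhere
	fmt.Println(a)
	madd1(a, []float64{2, 0}, 3) // space-dependent term
	fmt.Println(a)
}
```

This convention lets a space-independent quantity be passed as a multiplier alone, without allocating a full array of identical values.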

func MAdd2Async ¶

func MAdd2Async(a, b *Array, mulB float64, c *Array, mulC float64, stream Stream)

Asynchronous multiply-add: a += mulB*b + mulC*c. b and c may contain NULL pointers, which are treated as all 1's.

func Madd ¶

func Madd(dst, a, b *Array, mulB float64)

func Mul ¶

func Mul(dst, a, b *Array)

Multiply 2 multi-GPU arrays: dst = a * b

func Normalize ¶

func Normalize(m *Array)

Normalize

func PartialMax ¶

func PartialMax(in, out *Array, blocks, threadsPerBlock, N int)

Partial maxima (see reduce.h)

func PartialMaxAbs ¶

func PartialMaxAbs(in, out *Array, blocks, threadsPerBlock, N int)

Partial maxima of absolute values (see reduce.h)

func PartialMaxDiff ¶

func PartialMaxDiff(a, b, out *Array, blocks, threadsPerBlock, N int)

Partial maximum difference between arrays (see reduce.h)

func PartialMaxNorm3Sq ¶

func PartialMaxNorm3Sq(x, y, z, out *Array, blocks, threadsPerBlock, N int)

Partial maximum of Euclidean norm squared (see reduce.h)

func PartialMaxNorm3SqDiff ¶

func PartialMaxNorm3SqDiff(x1, y1, z1, x2, y2, z2, out *Array, blocks, threadsPerBlock, N int)

Partial maximum of Euclidean norm squared of the difference between two 3-vector arrays (see reduce.h)

func PartialMaxSum ¶

func PartialMaxSum(a, b, out *Array, blocks, threadsPerBlock, N int)

Partial maximum absolute sum of the elements of two arrays (see reduce.h)

func PartialMin ¶

func PartialMin(in, out *Array, blocks, threadsPerBlock, N int)

Partial minima (see reduce.h)

func PartialSDot ¶

func PartialSDot(in1, in2, out *Array, blocks, threadsPerBlock, N int)

Partial dot products (see reduce.h)

func PartialSum ¶

func PartialSum(in, out *Array, blocks, threadsPerBlock, N int)

Partial sums (see reduce.h)

func Qinter_async ¶

func Qinter_async(Qi *Array, Ti *Array, Tj *Array, Gij *Array, GijMul []float64, stream Stream)

func Qspat_async ¶

func Qspat_async(Q *Array, T *Array, k *Array, kMul []float64, cs []float64, pbc []int)

func ScaleNoiseAniz ¶

func ScaleNoiseAniz(h, mu, T, msat0T0 *Array,
	muMul []float64,
	KB2tempMul_mu0VgammaDtMsatMul float64)

func SetDefaultFFT ¶

func SetDefaultFFT(name string)

Sets a global default FFT

func TensSYMMVecMul ¶

func TensSYMMVecMul(dstX, dstY, dstZ, srcX, srcY, srcZ, kernXX, kernYY, kernZZ, kernYZ, kernXZ, kernXY *Array,
	srcMul float64,
	Nx, Ny, Nz int, stream Stream)

func TsSync ¶

func TsSync(Ts *Array, msat *Array, msat0T0 *Array, Tc *Array, S *Array, msatMul float64, msat0T0Mul float64, TcMul float64, SMul float64)

func UniaxialAnisotropyAsync ¶

func UniaxialAnisotropyAsync(h, m *Array, KuMask, MsatMask *Array, Ku2_Mu0MSat float64, anisUMask *Array, anisUMul []float64, stream Stream)

Computes the uniaxial anisotropy field, stores in h.

func VecMadd ¶

func VecMadd(dst, a, b *Array, mulB []float64)

3-vector multiply-add: dst_i = a_i + mulB_i*b_i. b may contain NULL pointers, which are treated as all 1's.

func WeightedAverage ¶

func WeightedAverage(dst, x0, x1, w0, w1, R *Array, w0Mul, w1Mul, RMul float64)

func ZeroArrayAsync ¶

func ZeroArrayAsync(A *Array, stream Stream)

Types ¶

type Array ¶

type Array struct {
	Stream         // GPU stream for general use with this array
	Comp   []Array // X,Y,Z components as arrays
	// contains filtered or unexported fields
}

A MuMax Array represents a 3-dimensional array of N-vectors.

Layout example for a (3,4) vsplice on 2 GPUs:

GPU0: X0 X1  Y0 Y1 Z0 Z1
GPU1: X2 X3  Y2 Y3 Z2 Z3
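The layout above can be reproduced with a short host-side sketch: each GPU holds a contiguous slice of every component, stored component after component. The function layout is hypothetical and assumes the number of elements divides evenly by the number of GPUs.

```go
package main

import "fmt"

// layout returns, for each GPU, the labels of the elements it stores when an
// array with the given components and n elements per component is split
// evenly across nDev devices. It assumes n is divisible by nDev.
func layout(comps []string, n, nDev int) [][]string {
	per := n / nDev
	out := make([][]string, nDev)
	for dev := 0; dev < nDev; dev++ {
		for _, c := range comps {
			for e := dev * per; e < (dev+1)*per; e++ {
				out[dev] = append(out[dev], fmt.Sprintf("%s%d", c, e))
			}
		}
	}
	return out
}

func main() {
	for dev, part := range layout([]string{"X", "Y", "Z"}, 4, 2) {
		fmt.Printf("GPU%d: %v\n", dev, part)
	}
}
```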

func NewArray ¶

func NewArray(components int, size3D []int) *Array

Returns an array which holds a field with the number of components and given size.

func NilArray ¶

func NilArray(components int, size3D []int) *Array

Returns an array without underlying storage. This is used for space-independent quantities. These pass a multiplier value and a null pointer for each GPU. A NilArray already has null pointers for each GPU set, so it is more convenient than just a nil pointer of type *Array. See: Alloc()

func (*Array) Alloc ¶

func (a *Array) Alloc()

If the array has no underlying storage yet (e.g., it was created by NilArray()), allocate that storage.

func (*Array) Assign ¶

func (a *Array) Assign(other *Array)

a = other (accessible from packages where Array is not assignable)

func (*Array) Component ¶

func (a *Array) Component(i int) *Array

Gets the i'th component as an array. E.g.: Component(0) is the x-component.

func (*Array) CopyFromDevice ¶

func (dst *Array) CopyFromDevice(src *Array)

Copy from device array to device array.

func (*Array) CopyFromHost ¶

func (dst *Array) CopyFromHost(src *host.Array)

Copy from host array to device array.

func (*Array) CopyToHost ¶

func (src *Array) CopyToHost(dst *host.Array)

Copy from device array to host array.

func (*Array) DevicePtr ¶

func (a *Array) DevicePtr() cu.DevicePtr

Address of part of the array on each GPU device

func (*Array) Free ¶

func (v *Array) Free()

Frees the underlying storage and sets the size to zero.

func (*Array) Get ¶

func (b *Array) Get(comp, x, y, z int) float64

Get a single value

func (*Array) Init ¶

func (a *Array) Init(components int, size3D []int, alloc bool)

Initializes the array to hold a field with the number of components and given size.

a.Init(3, []int{10, 10, 10}, DO_ALLOC) // gives an array of 1000 3-vectors
a.Init(1, []int{10, 10, 10}, DO_ALLOC) // gives an array of 1000 scalars
a.Init(6, []int{10, 10, 10}, DO_ALLOC) // gives an array of 1000 6-vectors or symmetric tensors

Storage is allocated only if alloc == true.

func (*Array) IsNil ¶

func (a *Array) IsNil() bool

True if the array has no underlying GPU storage. E.g., when created by NilArray()

func (*Array) Len ¶

func (a *Array) Len() int

Total number of elements

func (*Array) LocalCopy ¶

func (src *Array) LocalCopy() *host.Array

DEBUG: Make a freshly allocated copy on the host.

func (*Array) NComp ¶

func (a *Array) NComp() int

Number of components (1: scalar, 3: vector, ...).

func (*Array) PartLen3D ¶

func (a *Array) PartLen3D() int

Number of elements per component per GPU

func (*Array) PartLen4D ¶

func (a *Array) PartLen4D() int

Total number of elements per GPU

func (*Array) PartSize ¶

func (a *Array) PartSize() []int

Size of each part per GPU

func (*Array) PointTo ¶

func (shared *Array) PointTo(original *Array, offset int)

Lets the pointers of an already initialized, but not allocated array (shared) point to an allocated array (original) possibly with an offset.

func (*Array) Pointer ¶

func (a *Array) Pointer() cu.DevicePtr

Array of pointers to parts, one per GPU.

func (*Array) Set ¶

func (b *Array) Set(comp, x, y, z int, value float64)

Set a single value

func (*Array) Size3D ¶

func (a *Array) Size3D() []int

Size of the vector field.

func (*Array) Size4D ¶

func (a *Array) Size4D() []int

Number of components followed by the size of the vector field.

func (*Array) String ¶

func (a *Array) String() string

Human-readable string.

func (*Array) Zero ¶

func (a *Array) Zero()

Makes all elements zero.

type FFTInterface ¶

type FFTInterface interface {
	Forward(in, out *Array)
	Inverse(in, out *Array)
	Free()
}

Interface for any sparse FFT plan.

func NewFFTPlanX ¶

func NewFFTPlanX(dataSize, logicSize []int) FFTInterface

type FFTPlanX ¶

type FFTPlanX struct {
	Stream
	// contains filtered or unexported fields
}

func (*FFTPlanX) Forward ¶

func (fft *FFTPlanX) Forward(in, out *Array)

func (*FFTPlanX) Free ¶

func (fft *FFTPlanX) Free()

func (*FFTPlanX) Inverse ¶

func (fft *FFTPlanX) Inverse(in, out *Array)

type Reductor ¶

type Reductor struct {
	N int
	// contains filtered or unexported fields
}

A Reductor stores the buffers needed to reduce data across multiple GPUs. It can be used to sum data, take minima, maxima, etc.
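A common way to structure such a reduction is in two stages: each block writes one partial result into a buffer, and the partial results are then combined. The sketch below shows that pattern on the host for a sum. The strided assignment of elements to blocks is an assumption, since reduce.h is not shown here, and partialSums and sum are hypothetical names.

```go
package main

import "fmt"

// partialSums mimics the first stage: every "block" accumulates a strided
// subset of the input into its own slot of the output buffer.
func partialSums(in []float64, blocks int) []float64 {
	out := make([]float64, blocks)
	for i, v := range in {
		out[i%blocks] += v
	}
	return out
}

// sum mimics the second stage: the small buffer of partial results is
// combined into a single value.
func sum(in []float64, blocks int) float64 {
	total := 0.0
	for _, p := range partialSums(in, blocks) {
		total += p
	}
	return total
}

func main() {
	in := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	fmt.Println(partialSums(in, 3), sum(in, 3))
}
```

Keeping the partial-result buffer allocated between calls, as the Reductor does, avoids reallocating it for every reduction.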

func NewReductor ¶

func NewReductor(nComp int, size []int) *Reductor

Makes a reductor to reduce an array of the given size.

func (*Reductor) Dot ¶

func (r *Reductor) Dot(in1, in2 *Array) float64

Takes the dot product of all elements of the arrays.

func (*Reductor) Free ¶

func (r *Reductor) Free()

Frees the GPU buffer storage.

func (*Reductor) Init ¶

func (r *Reductor) Init(nComp int, size []int)

Initializes the buffers to reduce an array of the given size.

func (*Reductor) Max ¶

func (r *Reductor) Max(in *Array) float64

Takes the maximum of all elements of the array.

func (*Reductor) MaxAbs ¶

func (r *Reductor) MaxAbs(in *Array) float64

Takes the maximum of absolute values of all elements of the array.

func (*Reductor) MaxDiff ¶

func (r *Reductor) MaxDiff(a, b *Array) float64

Takes the maximum absolute difference between the elements of a and b.

func (*Reductor) MaxNorm ¶

func (r *Reductor) MaxNorm(a *Array) float64

Takes the maximum norm of a 3-component (vector) array.
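One way to compute this is to reduce the squared norms first and take a single square root at the end, which matches the "norm squared" wording of PartialMaxNorm3Sq. Whether MaxNorm itself returns the norm or its square is not stated here, so the square root in this host-side sketch is an assumption; maxNorm is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// maxNorm returns the largest Euclidean norm among the 3-vectors whose
// components are stored in x, y and z.
func maxNorm(x, y, z []float64) float64 {
	maxSq := 0.0
	for i := range x {
		sq := x[i]*x[i] + y[i]*y[i] + z[i]*z[i]
		if sq > maxSq {
			maxSq = sq
		}
	}
	// One square root for the whole array instead of one per cell.
	return math.Sqrt(maxSq)
}

func main() {
	x := []float64{3, 0}
	y := []float64{4, 0}
	z := []float64{0, 1}
	fmt.Println(maxNorm(x, y, z))
}
```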

func (*Reductor) MaxNormDiff ¶

func (r *Reductor) MaxNormDiff(a, b *Array) float64

Takes the maximum norm of the difference between two 3-component (vector) arrays.

func (*Reductor) MaxSum ¶

func (r *Reductor) MaxSum(a, b *Array) float64

Takes the maximum absolute value of the element-wise sum of a and b.

func (*Reductor) Min ¶

func (r *Reductor) Min(in *Array) float64

Takes the minimum of all elements of the array.

func (*Reductor) Sum ¶

func (r *Reductor) Sum(in *Array) float64

Takes the sum of all elements of the array.

type Stream ¶

type Stream cu.Stream

var STREAM0 Stream

Stream 0 on each GPU

func NewStream ¶

func NewStream() Stream

Creates a new multi-GPU stream. Its use is similar to cu.Stream, but it operates on all GPUs at the same time.

func (Stream) Destroy ¶

func (s Stream) Destroy()

Destroys the multi-GPU stream.

func (Stream) Ready ¶

func (s Stream) Ready() (ready bool)

Returns true if all underlying GPU streams have completed.

func (Stream) Sync ¶

func (s Stream) Sync()

Synchronizes with all underlying GPU streams.
