gocudnn

package module
v0.0.0-...-c9f06ed
Published: May 13, 2020 License: MIT Imports: 13 Imported by: 32

README

gocudnn

V0.1_75_101 compiles. It is cudnn 7.5 w/ cuda 10.1. There might be bugs. Send me a pull request.

I made a BatchNormD descriptor and a BatchNormDEx descriptor. You create each with its "Create" function and set it like the other descriptors.

I also made a deconvolution descriptor. It should work; at least I don't receive any errors when doing the operations. A deconvolution works like a convolution, except backward data acts as forward and forward acts as backward data.
The thing with a deconvolution is that the filter channels will be the output channels, and the filter neurons must match the input channels.

Convolution(Input{N,C,H,W}, Filter{P,C,R,S},Output{N,P,,})

Deconvolution(Input{N,C,H,W}, Filter{C,Q,R,S}, Output{N,Q,,})
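For a concrete instance of the shape rule above (numbers are illustrative): a convolution with 3 input channels and 64 filters maps {32,3,28,28} to {32,64,,}; the matching deconvolution flips the roles, so the filter's first dim matches the input channels and its second dim picks the output channels.

    Convolution(Input{32,3,28,28}, Filter{64,3,5,5}, Output{32,64,,})

    Deconvolution(Input{32,64,28,28}, Filter{64,3,5,5}, Output{32,3,,})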

gocu folder

The gocu folder contains interfaces that interconnect the different sub packages.
To help parallelize your code, use the type Worker. It contains the method Work, which takes a function and sends it to a dedicated host thread to be executed there. For example, if you wanted to make a new context type to handle GPU management:

    type GPUcontext struct {
        w *gocu.Worker
        a *crtutil.Allocator
    }

    func CreateGPUcontext(dev gocu.Device, s gocu.Streamer) (g *GPUcontext, err error) {
        g = new(GPUcontext)
        g.w = new(gocu.Worker)
        err = g.w.Work(func() error {
            // runs on the worker's dedicated host thread
            g.a = crtutil.CreateAllocator(s)
            return nil
        })
        return g, err
    }


    func (g *GPUcontext) AllocateMemory(size uint) (c cutil.Mem, err error) {
        err = g.w.Work(func() error {
            // the allocation also happens on the worker's host thread
            c, err = g.a.AllocateMemory(size)
            return err
        })
        return c, err
    }
    

cudart/crtutil folder

This folder has a ReadWriter in it that fulfills the io.Reader and io.Writer interfaces.

Beta

I don't foresee any code-breaking changes. Any changes will be new functions. There will be bugs. Report them or send me a pull request.

Some required packages

go get github.com/dereklstinson/half
go get github.com/dereklstinson/cutil

If I ever move to Go modules, these will be included there.

Setup

cuDNN 7.5 found at or around https://developer.nvidia.com/cudnn

CUDA 10.1 Toolkit found at or around https://developer.nvidia.com/cuda-downloads

Golang V1.13 found at or around https://golang.org/dl/

You will need to set the environment variables to something along the lines of the following.

export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64\
                         ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

export PATH=$PATH:/usr/local/go/bin

I would also like to get this working on Windows, but I am finding that Windows, Go, and CUDA don't mesh together as intuitively as Linux, Go, and CUDA do.

Warnings/Notes

Documentation For cudnn can be found at https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html

Take a good look at chapter 2 to get an idea on how the cudnn library works.

The go bindings will be very similar to how cudnn is coded.

A few exceptions, though:

  1. Most descriptors will be handled with methods after they are created.
  2. All of the "get" functions will return multiple values.

A little more on flag handling

Flags are handled through methods. You must be careful: the methods used with flags will change the flag's value. If you don't set a flag with a method, it will default to its initialized value (0), which may or may not be a valid flag option in cudnn or any of the other packages.
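As a minimal sketch of that pattern, using the ActivationMode flag documented below (the flag methods mutate the receiver and return the new value):

    package main

    import (
        "fmt"

        gocudnn "github.com/dereklstinson/gocudnn"
    )

    func main() {
        var mode gocudnn.ActivationMode // zero value: not necessarily a valid flag
        relu := mode.Relu()             // sets mode to the Relu flag and returns it
        fmt.Println(relu == mode)       // true: the method changed mode in place
    }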

Note on Handles.

CreateHandle() is not thread safe. Lock the thread using runtime.LockOSThread(). If you get this running on a Mac, your functions will need to be sent to the main thread.

CreateHandleEX() is designed for multiple-gpu use. It takes a gocu.Worker, and any function that takes the handle will be passed to that worker. This is still not thread safe, because any gpu memory that the functions use (for the most part) needs to be created on that worker. Also, the handle needs to be made before any memory is created.

To parallelize gpus you will need separate handles. Check out parallel_test.go
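A rough per-GPU sketch of that pattern (hedged: the gocu.NewWorker constructor and the exact CreateHandleEX signature are assumptions based on the description above; parallel_test.go is the authoritative reference):

    // One worker and one handle per device. Memory used with handles[i]
    // should also be created on workers[i].
    func makeHandles(devices []gocu.Device) ([]*gocu.Worker, []*gocudnn.Handle) {
        workers := make([]*gocu.Worker, len(devices))
        handles := make([]*gocudnn.Handle, len(devices))
        for i, dev := range devices {
            workers[i] = gocu.NewWorker(dev)                      // assumed constructor
            handles[i] = gocudnn.CreateHandleEX(workers[i], true) // assumed: worker + go-GC flag
        }
        return workers, handles
    }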

TensorD FilterD NHWC

I found out that cudnn always takes the dims as NCHW even if the format is NHWC (oof). To me that didn't seem intuitive, especially since it is barely mentioned in the documentation. I am going to have it so that if the format is chosen to be NHWC, then you need to put the dims in NHWC order. We will see how that works.
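Hedged sketch of what that would look like (CreateTensorDescriptor and a Set(format, dtype, dims, strides) method are assumed here, with nil strides standing in for a packed tensor):

    xD, err := gocudnn.CreateTensorDescriptor()
    if err != nil {
        panic(err)
    }
    var frmt gocudnn.TensorFormat
    var dtype gocudnn.DataType
    // Because the format is NHWC, the dims are passed in NHWC order too.
    if err = xD.Set(frmt.NHWC(), dtype.Float(), []int32{32, 28, 28, 3}, nil); err != nil {
        panic(err)
    }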

CUBLAS and CUDA additions

Other Notes

  1. I took errors.go from unixpickle/cuda. I really didn't want to rewrite the error handling for the cuda runtime API.

Documentation

Index

Examples

Constants

const BnMinEpsilon = (float64)(C.CUDNN_BN_MIN_EPSILON)

BnMinEpsilon is the min epsilon for batchnorm. It used to be 1e-5, but it is now 0.

const CudnnSeqDataDimCount = C.CUDNN_SEQDATA_DIM_COUNT

CudnnSeqDataDimCount is a flag for the number of dims.

const DimMax = int32(C.CUDNN_DIM_MAX)

DimMax is the max dims for tensors

Variables

This section is empty.

Functions

func AddTensor

func AddTensor(h *Handle, alpha float64, aD *TensorD, A cutil.Mem, beta float64, cD *TensorD, c cutil.Mem) error

AddTensor - Tensor bias addition: C = alpha * A + beta * C (c is both an input and the output).

From the documentation: This function adds the scaled values of a bias tensor to another tensor. Each dimension of the bias tensor A must match the corresponding dimension of the destination tensor C or be equal to 1. In the latter case, the same value from the bias tensor for those dimensions will be used to blend into the C tensor.

Note: Up to dimension 5, all tensor formats are supported. Beyond those dimensions, this routine is not supported.
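A minimal bias-add sketch using the signature above (h, aD, A, cD, and c are assumed to be already created; aD would describe a 1xCx1x1 bias so it broadcasts over C):

    // C = 1.0*A + 1.0*C : blend the bias tensor A into c in place.
    if err := gocudnn.AddTensor(h, 1.0, aD, A, 1.0, cD, c); err != nil {
        panic(err)
    }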

func AddTensorUS

func AddTensorUS(h *Handle, alpha float64, aD *TensorD, A unsafe.Pointer, beta float64, cD *TensorD, c unsafe.Pointer) error

AddTensorUS is like AddTensor but uses unsafe.Pointer instead of cutil.Mem

func DebugMode

func DebugMode()

DebugMode is for debugging code solely for these bindings.

func FindLength

func FindLength(s uint, dtype DataType) uint32

FindLength returns the length of the array considering the number of bytes and the DataType.

func FindSizeTfromVol

func FindSizeTfromVol(volume []int32, dtype DataType) uint

FindSizeTfromVol takes a volume of dims and returns the size in bytes as a uint.
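For instance, sizing an allocation for a packed float32 NCHW volume (the DataType flag method is assumed, following the flag pattern described in the README):

    var dtype gocudnn.DataType
    vol := []int32{32, 3, 28, 28} // N, C, H, W
    sib := gocudnn.FindSizeTfromVol(vol, dtype.Float())
    // sib = 32*3*28*28 elements * 4 bytes = 301056 bytes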

func GetBindingVersion

func GetBindingVersion() (major, minor, patch int32)

GetBindingVersion will return the library version this binding was made for.

func GetCudaartVersion

func GetCudaartVersion() uint

GetCudaartVersion returns the cuda runtime version.

func GetFoldedConvBackwardDataDescriptors

func GetFoldedConvBackwardDataDescriptors(h *Handle,
	filter *FilterD,
	diff *TensorD,
	conv *ConvolutionD,
	grad *TensorD,
	transform TensorFormat) (
	foldedfilter *FilterD,
	paddeddiff *TensorD,
	foldedConv *ConvolutionD,
	foldedgrad *TensorD,
	filterfold *TransformD,
	diffpad *TransformD,
	gradfold *TransformD,
	gradunfold *TransformD,
	err error)

GetFoldedConvBackwardDataDescriptors - Hidden Helper function to calculate folding descriptors for dgrad

func GetLibraryVersion

func GetLibraryVersion() (major, minor, patch int32, err error)

GetLibraryVersion will return the library version you have installed

func GetStringer

func GetStringer(tD *TensorD, t cutil.Pointer) (fmt.Stringer, error)

GetStringer returns a stringer that will print cuda-allocated memory formatted in NHWC or NCHW. It only works for 4d tensors with float or half datatype. It will only print the data.

func GetVersion

func GetVersion() uint

GetVersion returns the version

func ScaleTensor

func ScaleTensor(h *Handle, yD *TensorD, y cutil.Mem, alpha float64) error

ScaleTensor - Scale all values of a tensor by a given factor : y[i] = alpha * y[i]

func ScaleTensorUS

func ScaleTensorUS(h *Handle, yD *TensorD, y unsafe.Pointer, alpha float64) error

ScaleTensorUS is like ScaleTensor but it uses unsafe.Pointer instead of cutil.Mem

func SetCallBack

func SetCallBack(udata fmt.Stringer, w io.Writer) error

SetCallBack sets the debug callback function. Callback data will be written to the writer. udata is custom user data that will be passed to the callback; udata can be nil. Note: the callback is currently not functional.

func SetTensor

func SetTensor(h *Handle, yD *TensorD, y cutil.Mem, v float64) error

SetTensor - Set all values of a tensor to a given value : y[i] = value[0]
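Combined with ScaleTensor above, a fill-then-scale sketch looks like this (h, yD, and y assumed already created):

    // y[i] = 1.0 everywhere, then y[i] *= 0.5.
    if err := gocudnn.SetTensor(h, yD, y, 1.0); err != nil {
        panic(err)
    }
    if err := gocudnn.ScaleTensor(h, yD, y, 0.5); err != nil {
        panic(err)
    }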

func SetTensorUS

func SetTensorUS(h *Handle, yD *TensorD, y unsafe.Pointer, v float64) error

SetTensorUS is like SetTensor but it uses unsafe.Pointer instead of cutil.Mem

func TransformTensor

func TransformTensor(h *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem) error

TransformTensor see below

From the SDK Documentation: This function copies the scaled data from one tensor to another tensor with a different layout. Those descriptors need to have the same dimensions but not necessarily the same strides. The input and output tensors must not overlap in any way (i.e., tensors cannot be transformed in place). This function can be used to convert a tensor with an unsupported format to a supported one.

cudnnStatus_t cudnnTransformTensor(

    cudnnHandle_t                  handle,
    const void                    *alpha,
    const cudnnTensorDescriptor_t  xDesc,
    const void                    *x,
    const void                    *beta,
    const cudnnTensorDescriptor_t  yDesc,
	void                          *y)

y = Transform((alpha * x), (beta * y)). This will change the layout of a tensor stride-wise.
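A layout-conversion sketch using the signature above (xD and yD assumed set with identical dims but different strides/formats; x and y must not overlap):

    // y = Transform(1.0*x) + 0.0*y : copy x into y's layout (e.g. NCHW -> NHWC).
    if err := gocudnn.TransformTensor(h, 1.0, xD, x, 0.0, yD, y); err != nil {
        panic(err)
    }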

func TransformTensorUS

func TransformTensorUS(h *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, yD *TensorD, y unsafe.Pointer) error

TransformTensorUS is like TransformTensor but it uses unsafe.Pointer instead of cutil.Mem

Types

type ActivationD

type ActivationD struct {
	// contains filtered or unexported fields
}

ActivationD is an opaque struct that holds the description of an activation operation.

Example

ExampleActivationD shows how to perform the activation function.

package main

import (
	"runtime"

	"github.com/dereklstinson/gocudnn/gocu"

	gocudnn "github.com/dereklstinson/gocudnn"
)

func main() {
	runtime.LockOSThread()
	check := func(e error) {
		if e != nil {
			panic(e)
		}
	}

	h := gocudnn.CreateHandle(true) //Using go garbage collector

	ActOp, err := gocudnn.CreateActivationDescriptor()
	check(err)

	var AMode gocudnn.ActivationMode //Activation Mode Flag
	var NanMode gocudnn.NANProp      //Nan Propagation Flag

	err = ActOp.Set(AMode.Relu(), NanMode.Propigate(), 20)
	check(err)
	am, nm, coef, err := ActOp.Get() //Gets the values that were set
	check(err)
	if am != AMode.Relu() || nm != NanMode.Propigate() || coef != 20 {
		panic("am!=Amode.Relu()||nm !=NanMode.Propigate()||coef!=20")
	}

	//Dummy Variables
	//Check TensorD to find out how to make xD,yD and x and y
	var x, y *gocu.CudaPtr
	var xD, yD *gocudnn.TensorD

	err = ActOp.Forward(h, 1, xD, x, 0, yD, y)
	check(err)
}
Output:

func CreateActivationDescriptor

func CreateActivationDescriptor() (*ActivationD, error)

CreateActivationDescriptor creates an activation descriptor

func (*ActivationD) Backward

func (a *ActivationD) Backward(
	handle *Handle,
	alpha float64,
	yD *TensorD, y cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	xD *TensorD, x cutil.Mem,
	beta float64,
	dxD *TensorD, dx cutil.Mem) error

Backward does the activation backward method

From deep learning sdk documentation (slightly modified for gocudnn):

This routine computes the gradient of a neuron activation function.

Note: In-place operation is allowed for this routine; i.e., dx and dy cutil.Mem may be equal. However, this requires dxD and dyD descriptors to be identical (particularly, the strides of the input and output must match for in-place operation to be allowed).

Note: All tensor formats are supported for 4 and 5 dimensions, however best performance is obtained when the strides of dxD and dyD are equal and HW-packed. For more than 5 dimensions the tensors must have their spatial dimensions packed.

Parameters:

---
handle(input):

previously created Handle
---
----
alpha, beta(input):

Pointers to scaling factors (in host memory) used to blend the computation result with prior
value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
----
---
xD(input):

Handle to the previously initialized input tensor descriptor.
---
----
x(input):

Data pointer to GPU memory associated with the tensor descriptor xD.
----
---
dxD(input):

Handle to the previously initialized input tensor descriptor.
---
----
dx(output):

Data pointer to GPU memory associated with the tensor descriptor dxD.
----
---
yD(input):

Handle to the previously initialized output tensor descriptor.
---
----
y(input):

Data pointer to GPU memory associated with the output tensor descriptor yD.
----
---
dyD(input):

Handle to the previously initialized output tensor descriptor.
---
----
dy(input):

Data pointer to GPU memory associated with the output tensor descriptor dyD.
----

Possible Error Returns

	nil:

	The function launched successfully.

	CUDNN_STATUS_NOT_SUPPORTED:

	1) The dimensions n,c,h,w of the input tensor and output tensors differ.
	2) The datatype of the input tensor and output tensors differs.
	3) The strides nStride, cStride, hStride, wStride of the input tensor and the input differential tensor differ.
	4) The strides nStride, cStride, hStride, wStride of the output tensor and the output differential tensor differ.

	CUDNN_STATUS_BAD_PARAM:

	At least one of the following conditions are met:

	The strides nStride, cStride, hStride, wStride of the input differential tensor and output
	differential tensors differ and in-place operation is used.

	CUDNN_STATUS_EXECUTION_FAILED:

	The function failed to launch on the GPU.

func (*ActivationD) BackwardUS

func (a *ActivationD) BackwardUS(
	handle *Handle,
	alpha float64,
	yD *TensorD, y unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer) error

BackwardUS is just like Backward but it takes unsafe.Pointers instead of cutil.Mem

func (*ActivationD) Destroy

func (a *ActivationD) Destroy() error

Destroy destroys the activation descriptor if the GC flag is not set; if the GC flag is set, the method will only return nil. Currently, GC is always set with no way of turning it off.

func (*ActivationD) Forward

func (a *ActivationD) Forward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem) error

Forward does the forward activation function

From deep learning sdk documentation (slightly modified for gocudnn):

This routine applies a specified neuron activation function element-wise over each input value.

Note: In-place operation is allowed for this routine; i.e., x and y cutil.Mem may be equal. However, this requires xD and yD descriptors to be identical (particularly, the strides of the input and output must match for in-place operation to be allowed).

Note: All tensor formats are supported for 4 and 5 dimensions, however best performance is obtained when the strides of xD and yD are equal and HW-packed. For more than 5 dimensions the tensors must have their spatial dimensions packed.

Parameters:

---
handle(input):

previously created Handle
---
----
alpha, beta(input):

Pointers to scaling factors (in host memory) used to blend the computation result with prior
value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
----
---
xD(input):

Handle to the previously initialized input tensor descriptor.
---
----
x(input):

Data pointer to GPU memory associated with the tensor descriptor xD.

----
---
yD(input):

Handle to the previously initialized output tensor descriptor.
---
----
y(output):

Data pointer to GPU memory associated with the output tensor descriptor yDesc.
----

Possible Error Returns

nil:

The function launched successfully.

CUDNN_STATUS_NOT_SUPPORTED:

The function does not support the provided configuration.

CUDNN_STATUS_BAD_PARAM:

At least one of the following conditions are met:

1)The parameter mode has an invalid enumerant value.
2)The dimensions n,c,h,w of the input tensor and output tensors differ.
3)The datatype of the input tensor and output tensors differs.
4)The strides nStride,cStride,hStride,wStride of the input tensor and output tensors differ and in-place operation is used (i.e., x and y pointers are equal).

CUDNN_STATUS_EXECUTION_FAILED:

The function failed to launch on the GPU.

func (*ActivationD) ForwardUS

func (a *ActivationD) ForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	yD *TensorD, y unsafe.Pointer) error

ForwardUS is just like Forward but it takes unsafe.Pointers instead of cutil.Mem

func (*ActivationD) Get

func (a *ActivationD) Get() (mode ActivationMode, nan NANProp, coef float64, err error)

Get gets the descriptor's values.

func (*ActivationD) Set

func (a *ActivationD) Set(mode ActivationMode, nan NANProp, coef float64) error

Set sets the activation operation according to the settings passed

func (*ActivationD) String

func (a *ActivationD) String() string

type ActivationMode

type ActivationMode C.cudnnActivationMode_t

ActivationMode is used for activation descriptor flags. Flags are obtained through the type's methods.

func (*ActivationMode) ClippedRelu

func (a *ActivationMode) ClippedRelu() ActivationMode

ClippedRelu sets a to ActivationMode(C.CUDNN_ACTIVATION_CLIPPED_RELU) and returns that value.

Selects the clipped rectified linear function.

func (*ActivationMode) Elu

func (a *ActivationMode) Elu() ActivationMode

Elu sets a to ActivationMode(C.CUDNN_ACTIVATION_ELU) and returns that value.

Selects the exponential linear function.

func (*ActivationMode) Identity

func (a *ActivationMode) Identity() ActivationMode

Identity returns ActivationMode(C.CUDNN_ACTIVATION_IDENTITY) (new for 7.1)

Selects the identity function, intended for bypassing the activation step in (*Convolution)BiasActivationForward(). (The Identity flag must be used with CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM, and only with (*Convolution)BiasActivationForward().) It does not work with cudnnActivationForward() or cudnnActivationBackward().

func (*ActivationMode) Relu

func (a *ActivationMode) Relu() ActivationMode

Relu sets a to ActivationMode(C.CUDNN_ACTIVATION_RELU) and returns that value.

Selects the rectified linear function.

func (*ActivationMode) Sigmoid

func (a *ActivationMode) Sigmoid() ActivationMode

Sigmoid sets a to ActivationMode(C.CUDNN_ACTIVATION_SIGMOID) and returns that value.

Selects the sigmoid function.

func (ActivationMode) String

func (a ActivationMode) String() string

func (*ActivationMode) Tanh

func (a *ActivationMode) Tanh() ActivationMode

Tanh sets a to ActivationMode(C.CUDNN_ACTIVATION_TANH) and returns that value.

Selects the hyperbolic tangent function.

type Algorithm

type Algorithm C.cudnnAlgorithm_t

Algorithm is used to pass generic algorithm information.

type AlgorithmD

type AlgorithmD struct {
	// contains filtered or unexported fields
}

AlgorithmD holds the C.cudnnAlgorithmDescriptor_t

func CreateAlgorithmDescriptor

func CreateAlgorithmDescriptor() (*AlgorithmD, error)

CreateAlgorithmDescriptor creates an AlgorithmD that needs to be set

func (*AlgorithmD) Copy

func (a *AlgorithmD) Copy() (*AlgorithmD, error)

Copy returns a copy of AlgorithmD

func (*AlgorithmD) Destroy

func (a *AlgorithmD) Destroy() error

Destroy destroys the descriptor. Right now, since gocudnn is on Go's gc, this won't do anything.

func (*AlgorithmD) Get

func (a *AlgorithmD) Get() (Algorithm, error)

Get returns the AlgorithmD's value as an Algorithm.

func (*AlgorithmD) GetAlgorithmSpaceSize

func (a *AlgorithmD) GetAlgorithmSpaceSize(handle *Handle) (uint, error)

GetAlgorithmSpaceSize gets the size in bytes of the algorithm

func (*AlgorithmD) RestoreAlgorithm

func (a *AlgorithmD) RestoreAlgorithm(handle *Handle, algoSpace cutil.Mem, sizeinbytes uint) error

RestoreAlgorithm from host

func (*AlgorithmD) SaveAlgorithm

func (a *AlgorithmD) SaveAlgorithm(handle *Handle, algoSpace cutil.Mem, sizeinbytes uint) error

SaveAlgorithm saves the algorithm to host

func (*AlgorithmD) Set

func (a *AlgorithmD) Set(algo Algorithm) error

Set sets the Algorithm into the AlgorithmD.
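A hedged sketch of the save/restore round trip built from the methods above (allocateHostMem is a hypothetical helper that returns a cutil.Mem of the requested size):

    sib, err := aD.GetAlgorithmSpaceSize(handle)
    if err != nil {
        panic(err)
    }
    algoSpace := allocateHostMem(sib) // hypothetical cutil.Mem allocation
    if err = aD.SaveAlgorithm(handle, algoSpace, sib); err != nil {
        panic(err)
    }
    // ... later, on the same handle ...
    if err = aD.RestoreAlgorithm(handle, algoSpace, sib); err != nil {
        panic(err)
    }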

type AlgorithmPerformance

type AlgorithmPerformance struct {
	// contains filtered or unexported fields
}

AlgorithmPerformance is the Go-typed C.cudnnAlgorithmPerformance_t.

func CreateAlgorithmPerformance

func CreateAlgorithmPerformance(numberToCreate int32) ([]AlgorithmPerformance, error)

CreateAlgorithmPerformance creates and returns a slice of AlgorithmPerformance values.

returns

nil = Success
CUDNN_STATUS_ALLOC_FAILED - The resources could not be allocated

func (*AlgorithmPerformance) Destroy

func (a *AlgorithmPerformance) Destroy() error

Destroy destroys the performance.

func (*AlgorithmPerformance) Get

Get gets the algorithm performance. It returns AlgorithmD, Status, float32 (time), and uint (memory size in bytes). I didn't include the setalgorithmperformance func, but it might need to be made.

func (*AlgorithmPerformance) Set

func (a *AlgorithmPerformance) Set(aD *AlgorithmD, s Status, time float32, memory uint) error

Set sets the algo performance

type AttentionD

type AttentionD struct {
	// contains filtered or unexported fields
}

AttentionD holds opaque values used for attention operations

func CreateAttnDescriptor

func CreateAttnDescriptor() (*AttentionD, error)

CreateAttnDescriptor creates an Attention Descriptor

func (*AttentionD) BackwardData

func (a *AttentionD) BackwardData(
	h *Handle,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayDQDO []int32,
	seqLengthArrayDKDV []int32,
	doDesc *SeqDataD, dout cutil.Mem,
	dqDesc *SeqDataD, dqueries, queries cutil.Mem,
	dkDesc *SeqDataD, dkeys, keys cutil.Mem,
	dvDesc *SeqDataD, dvalues, values cutil.Mem,
	wbuffSIB uint, wbuff cutil.Mem, wspaceSIB uint, wspace cutil.Mem, rspaceSIB uint, rspace cutil.Mem) error

BackwardData does the backward propagation for data.

func (*AttentionD) BackwardDataUS

func (a *AttentionD) BackwardDataUS(
	h *Handle,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayDQDO []int32,
	seqLengthArrayDKDV []int32,
	doDesc *SeqDataD, dout unsafe.Pointer,
	dqDesc *SeqDataD, dqueries, queries unsafe.Pointer,
	dkDesc *SeqDataD, dkeys, keys unsafe.Pointer,
	dvDesc *SeqDataD, dvalues, values unsafe.Pointer,
	wbuffSIB uint, wbuff unsafe.Pointer, wspaceSIB uint, wspace unsafe.Pointer, rspaceSIB uint, rspace unsafe.Pointer) error

BackwardDataUS is like BackwardData but uses unsafe.Pointer instead of cutil.Mem

func (*AttentionD) BackwardWeights

func (a *AttentionD) BackwardWeights(
	h *Handle,
	wgmode WgradMode,
	qDesc *SeqDataD, queries cutil.Mem,
	keyDesc *SeqDataD, keys cutil.Mem,
	vDesc *SeqDataD, values cutil.Mem,
	doDesc *SeqDataD, dout cutil.Mem,
	wbuffSIB uint, wbuff, dwbuff cutil.Mem,
	wspaceSIB uint, wspace cutil.Mem, rspaceSIB uint, rspace cutil.Mem) error

BackwardWeights does the backward propagation for weights.

func (*AttentionD) BackwardWeightsUS

func (a *AttentionD) BackwardWeightsUS(
	h *Handle,
	wgmode WgradMode,
	qDesc *SeqDataD, queries unsafe.Pointer,
	keyDesc *SeqDataD, keys unsafe.Pointer,
	vDesc *SeqDataD, values unsafe.Pointer,
	doDesc *SeqDataD, dout unsafe.Pointer,
	wbuffSIB uint, wbuff, dwbuff unsafe.Pointer,
	wspaceSIB uint, wspace unsafe.Pointer, rspaceSIB uint, rspace unsafe.Pointer) error

BackwardWeightsUS is like BackwardWeights but uses unsafe.Pointer instead of cutil.Mem.

func (*AttentionD) Destroy

func (a *AttentionD) Destroy() error

Destroy will destroy the descriptor if not on the GC. If it is on the gc, it will do nothing but return nil. Currently, gocudnn is always on Go's gc.

func (*AttentionD) Forward

func (a *AttentionD) Forward(
	h *Handle,
	currIdx int32,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayQRO []int32,
	seqLengthArrayKV []int32,
	qrDesc *SeqDataD, queries, residuals cutil.Mem,
	keyDesc *SeqDataD, keys cutil.Mem,
	vDesc *SeqDataD, values cutil.Mem,
	oDesc *SeqDataD, out cutil.Mem,
	wbuffSIB uint, wbuff cutil.Mem,
	wspaceSIB uint, wspace cutil.Mem,
	rspaceSIB uint, rspace cutil.Mem) error

Forward performs the attention forward operation. Look at the documentation; it is kind of more confusing than normal. If currIdx < 0: training mode. If currIdx >= 0: inference mode.

func (*AttentionD) ForwardUS

func (a *AttentionD) ForwardUS(
	h *Handle,
	currIdx int32,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayQRO []int32,
	seqLengthArrayKV []int32,
	qrDesc *SeqDataD, queries, residuals unsafe.Pointer,
	keyDesc *SeqDataD, keys unsafe.Pointer,
	vDesc *SeqDataD, values unsafe.Pointer,
	oDesc *SeqDataD, out unsafe.Pointer,
	wbuffSIB uint, wbuff unsafe.Pointer,
	wspaceSIB uint, wspace unsafe.Pointer,
	rspaceSIB uint, rspace unsafe.Pointer) error

ForwardUS is like Forward but takes unsafe.Pointer's instead of cutil.Mem

func (*AttentionD) Get

func (a *AttentionD) Get() (
	qMap AttnQueryMap,
	nHead int32,
	smScaler float64,
	dtype DataType,
	computePrecision DataType,
	mtype MathType,
	attn *DropOutD,
	post *DropOutD,
	qSize, keySize, vSize int32,
	qProjSize, keyProjSize, vProjSize, oProjSize int32,
	qoMaxSeqLen, kvMaxSeqLen int32,
	maxBatchSize, maxBeamSize int32,
	err error)

Get gets all the values for the AttentionD - there are a lot.

func (*AttentionD) GetMultiHeadAttnWeights

func (a *AttentionD) GetMultiHeadAttnWeights(h *Handle, wkind MultiHeadAttnWeightKind, wbuffSIB uint, wbuff cutil.Mem) (wD *TensorD, w cutil.Mem, err error)

GetMultiHeadAttnWeights returns a descriptor for w and its cutil.Mem.

func (*AttentionD) GetMultiHeadBuffers

func (a *AttentionD) GetMultiHeadBuffers(h *Handle) (weightbuffSIB, wspaceSIB, rspaceSIB uint, err error)

GetMultiHeadBuffers returns the Size In Bytes (SIB) needed for allocation for operation.

func (*AttentionD) Set

func (a *AttentionD) Set(
	qMap AttnQueryMap,
	nHead int32,
	smScaler float64,
	dtype DataType,
	computePrecision DataType,
	mtype MathType,
	attn *DropOutD,
	post *DropOutD,
	qSize, keySize, vSize int32,
	qProjSize, keyProjSize, vProjSize, oProjSize int32,
	qoMaxSeqLen, kvMaxSeqLen int32,
	maxBatchSize, maxBeamSize int32,
) error

Set sets an AttentionD previously made with CreateAttnDescriptor.

type AttnQueryMap

type AttnQueryMap C.cudnnAttnQueryMap_t

AttnQueryMap type is a flag for multihead attention. Flags are exposed through type methods.

func (*AttnQueryMap) AllToOne

func (a *AttnQueryMap) AllToOne() AttnQueryMap

AllToOne - multiple Q-s when beam width > 1 map to a single (K,V) set. The method changes a to AllToOne and returns that value.

func (*AttnQueryMap) OneToOne

func (a *AttnQueryMap) OneToOne() AttnQueryMap

OneToOne - multiple Q-s when beam width > 1 map to corresponding (K,V) sets. The method changes a to OneToOne and returns that value.

func (AttnQueryMap) String

func (a AttnQueryMap) String() string

type BatchNormD

type BatchNormD struct {
	// contains filtered or unexported fields
}

BatchNormD is a gocudnn original. It makes the batchnorm operation behave like the majority of cudnn's descriptor-based operations.

func CreateBatchNormDescriptor

func CreateBatchNormDescriptor() *BatchNormD

CreateBatchNormDescriptor creates a new BatchNormD

func (*BatchNormD) Backward

func (b *BatchNormD) Backward(
	handle *Handle,
	alphadata, betadata, alphaparam, betaparam float64,
	xD *TensorD, x cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	dxD *TensorD, dx cutil.Mem,
	dBnScaleBiasDesc *TensorD, scale, dscale, dbias cutil.Mem,
	epsilon float64,

	savedMean, savedInvVariance cutil.Mem,
) error

Backward - Performs the backward pass of the Batch Normalization layer.

Outputs: dx (backprop data), dscale (training scale), dbias (training bias)

Scalars: alphadata, betadata, alphaparam, betaparam are blending factors: y = alpha * operation + beta * y

Note: savedMean, savedInvVariance are the results cached by the layer during the forward pass. Both can be nil, but only at the same time.

func (*BatchNormD) BackwardUS

func (b *BatchNormD) BackwardUS(
	handle *Handle,
	alphadata, betadata, alphaparam, betaparam float64,
	xD *TensorD, x unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	dxD *TensorD, dx unsafe.Pointer,
	dBnScaleBiasDesc *TensorD, scale, dscale, dbias unsafe.Pointer,
	epsilon float64,

	savedMean, savedInvVariance unsafe.Pointer,
) error

BackwardUS is like Backward but uses unsafe.Pointers instead of cutil.Mem

func (*BatchNormD) DeriveBNTensorDescriptor

func (b *BatchNormD) DeriveBNTensorDescriptor(xDesc *TensorD) (bndesc *TensorD, err error)

DeriveBNTensorDescriptor Derives a BN Tensor Descriptor from the one passed.

Derives a tensor descriptor from a layer data descriptor for the BatchNormalization scale, invVariance, bnBias, and bnScale tensors. Use this tensor descriptor for bnScaleBiasMeanVarDesc and bnScaleBiasDiffDesc in the Batch Normalization forward and backward functions.
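Sketch of the intended flow (xD assumed to be a previously set 4D layer descriptor):

    bnD, err := b.DeriveBNTensorDescriptor(xD)
    if err != nil {
        panic(err)
    }
    // bnD can now be used as the scale/bias/mean/variance descriptor in
    // ForwardTraining, ForwardInference, and Backward.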

func (*BatchNormD) ForwardInference

func (b *BatchNormD) ForwardInference(
	handle *Handle,
	alpha, beta float64,
	xD *TensorD, x cutil.Mem,
	yD *TensorD, y cutil.Mem,
	ScaleBiasMeanVarDesc *TensorD, scale, bias, estimatedMean, estimatedVariance cutil.Mem,
	epsilon float64,

) error

ForwardInference (info pulled from the cudnn documentation): This function performs the forward BatchNormalization layer computation for the inference phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.

Notes:

1)Only 4D and 5D tensors are supported.

2)The input transformation performed by this function is defined as: y := alpha*y + beta *(bnScale * (x-estimatedMean)/sqrt(epsilon + estimatedVariance)+bnBias)

3)The epsilon value has to be the same during training, backpropagation and inference.

4)For training phase use cudnnBatchNormalizationForwardTraining.

5)Much higher performance when HW-packed tensors are used for all of x, dy, dx.

Parameters:

	----
	handle(input):

	Handle to a previously created cuDNN library descriptor.
	----
	---
	mode(input):

	Mode of operation (spatial or per-activation). BatchNormMode
	---
	----
	alpha, beta (input):

	Scaling factors in host mem y = alpha *result + beta *y
	----
	---
	xDesc (input), yDesc (input), x (input), y (output):

  Descriptors and pointers to mem
	---
	----
	bnScaleBiasMeanVarDesc, bnScaleData, bnBiasData(inputs):

	Tensor descriptor and pointers in device memory for
	the batch normalization scale and bias parameters
	----
	---
	estimatedMean, estimatedVariance (inputs):

	Mean and variance tensors (these have the same descriptor as the bias and scale).
	It is suggested that resultRunningMean, resultRunningVariance from the cudnnBatchNormalizationForwardTraining
	call accumulated during the training phase are passed as inputs here.
	---
	----
	epsilon(input):

	Epsilon value used in the batch normalization formula.
	Minimum allowed value is found in  MinEpsilon() method. (It is now zero)
	----

Returns:

nil - The computation was performed successfully.

CUDNN_STATUS_NOT_SUPPORTED - The function does not support the provided configuration.

CUDNN_STATUS_BAD_PARAM - At least one of the following conditions are met:

	1)One of the pointers alpha, beta, x, y, bnScaleData, bnBiasData, estimatedMean, estimatedInvVariance is NULL.
	2)Number of xDesc or yDesc tensor descriptor dimensions is not within the [4,5] range.
	3)bnScaleBiasMeanVarDesc dimensions are not 1xC(x1)x1x1 for spatial or 1xC(xD)xHxW for per-activation mode (parenthesis for 5D).
	4)epsilon value is less than CUDNN_BN_MIN_EPSILON
	5)Dimensions or data types mismatch for xDesc, yDesc

func (*BatchNormD) ForwardInferenceUS

func (b *BatchNormD) ForwardInferenceUS(
	handle *Handle,
	alpha, beta float64,
	xD *TensorD, x unsafe.Pointer,
	yD *TensorD, y unsafe.Pointer,
	ScaleBiasMeanVarDesc *TensorD, scale, bias, estimatedMean, estimatedVariance unsafe.Pointer,
	epsilon float64,

) error

ForwardInferenceUS is like ForwardInference but uses unsafe.Pointers instead of cutil.Mems

func (*BatchNormD) ForwardTraining

func (b *BatchNormD) ForwardTraining(
	handle *Handle,
	alpha float64,
	beta float64,
	xD *TensorD,
	x cutil.Mem,
	yD *TensorD,
	y cutil.Mem,

	bnScaleBiasMeanVar *TensorD,

	scale cutil.Mem,
	bias cutil.Mem,

	expAveFactor float64,

	resultrunningmean cutil.Mem,

	resultRunningVariance cutil.Mem,
	epsilon float64,
	resultSaveMean cutil.Mem,
	resultSaveInvVariance cutil.Mem,

) error

ForwardTraining (from the documentation): This function performs the forward BatchNormalization layer computation for the training phase.

Notes:

1)Only 4D and 5D tensors are supported.

2)The epsilon value has to be the same during training, backpropagation and inference.

3)For inference phase use cudnnBatchNormalizationForwardInference.

4)Much higher performance for HW-packed tensors for both x and y.

Parameters:

----
handle:

Handle to a previously created cuDNN library descriptor.
----
---
alpha, beta (Inputs):

Scaling Factors y= alpha*opresult + beta*y
---
----
xD, yD, x, y:

Tensor descriptors and pointers in device memory for the layer's x and y data.
----
---
bnScaleBiasMeanVar:

Shared tensor descriptor desc for all the 6 tensors below in the argument list.
The dimensions for this tensor descriptor are dependent on the normalization mode.
---
----
scale, bias(Inputs):

Pointers in device memory for the batch normalization scale and bias parameters.
Note: since bias isn't used during the backward pass, you can reuse bias for other batchnorm layers.
----
---
expAveFactor (input):

Factor used in the moving average computation runningMean = newMean*factor + runningMean*(1-factor).
Use factor = 1/(1+n) at the N-th call to the function to get Cumulative Moving Average (CMA) behavior: CMA[n] = (x[1]+...+x[n])/n,
since CMA[n+1] = (n*CMA[n]+x[n+1])/(n+1) = ((n+1)*CMA[n]-CMA[n])/(n+1) + x[n+1]/(n+1) = CMA[n]*(1-1/(n+1)) + x[n+1]*1/(n+1).
(A host-side sketch of this update follows the Returns list below.)
---
----
resultRunningMean,resultRunningVariance (input/output):

Running mean and variance tensors (these have the same descriptor as the bias and scale).
Both of these pointers can be NULL but only at the same time.
The value stored in resultRunningVariance (or passed as an input in inference mode) is the moving average of variance[x]
where variance is computed either over batch or spatial+batch dimensions depending on the mode.
If these pointers are not NULL, the tensors should be initialized to some reasonable values or to 0.
----
---
epsilon:

Epsilon value used in the batch normalization formula. Minimum allowed value is CUDNN_BN_MIN_EPSILON defined in cudnn.h.
Same epsilon value should be used in forward and backward functions.
---
----
resultSaveMean, resultSaveInvVariance (outputs):

Optional cache to save intermediate results computed during the forward pass
these can then be reused to speed up the backward pass.
For this to work correctly, the bottom layer data has to remain unchanged until the backward function is called.
Note that both of these parameters can be NULL but only at the same time.
It is recommended to use this cache since memory overhead is relatively small because these tensors have a much lower product of dimensions than the data tensors.
----

Returns:

nil - The computation was performed successfully.

CUDNN_STATUS_NOT_SUPPORTED - The function does not support the provided configuration.

CUDNN_STATUS_BAD_PARAM - At least one of the following conditions are met:

	1)One of the pointers alpha, beta, x, y, bnScaleData, bnBiasData is NULL.
	2)Number of xDesc or yDesc tensor descriptor dimensions is not within the [4,5] range.
	3)bnScaleBiasMeanVarDesc dimensions are not 1xC(x1)x1x1 for spatial or 1xC(xD)xHxW for per-activation mode (parens for 5D).
	4)Exactly one of resultSaveMean, resultSaveInvVariance pointers is NULL.
	5)Exactly one of resultRunningMean, resultRunningInvVariance pointers is NULL.
	6)epsilon value is less than MinEpsilon()
	7)Dimensions or data types mismatch for xDesc, yDesc
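Host-side sketch of the expAveFactor arithmetic referenced in the Parameters list above (pure Go, no cudnn calls; it just mirrors the formula):

    // runningUpdate mirrors runningMean = newMean*factor + runningMean*(1-factor).
    func runningUpdate(runningMean, newMean, factor float64) float64 {
        return newMean*factor + runningMean*(1-factor)
    }

    // With factor = 1/(1+n) on the n-th call, the update reproduces the
    // cumulative moving average CMA[n] = (x[1]+...+x[n])/n.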

func (*BatchNormD) ForwardTrainingUS

func (b *BatchNormD) ForwardTrainingUS(
	handle *Handle,
	alpha float64,
	beta float64,
	xD *TensorD,
	x unsafe.Pointer,
	yD *TensorD,
	y unsafe.Pointer,

	bnScaleBiasMeanVar *TensorD,

	scale unsafe.Pointer,
	bias unsafe.Pointer,

	expAveFactor float64,

	resultrunningmean unsafe.Pointer,

	resultRunningVariance unsafe.Pointer,
	epsilon float64,
	resultSaveMean unsafe.Pointer,
	resultSaveInvVariance unsafe.Pointer,

) error

ForwardTrainingUS is just like ForwardTraining but uses unsafe.Pointers.

func (*BatchNormD) Get

func (b *BatchNormD) Get() (mode BatchNormMode, err error)

Get gets the BatchNormMode stored in the descriptor.

func (*BatchNormD) MinEpsilon

func (b *BatchNormD) MinEpsilon() float64

MinEpsilon is the Minimum Epsilon required. It is now zero, but it used to be 1e-5

func (*BatchNormD) Set

func (b *BatchNormD) Set(mode BatchNormMode) error

Set sets the values used in the batchnorm descriptor

func (*BatchNormD) String

func (b *BatchNormD) String() string

type BatchNormDEx

type BatchNormDEx struct {
	// contains filtered or unexported fields
}

BatchNormDEx is a gocudnn original. It makes the batchnorm operation behave like the majority of cudnn's descriptor-based operations.

func CreateBatchNormDescriptorEx

func CreateBatchNormDescriptorEx() *BatchNormDEx

CreateBatchNormDescriptorEx creates a new BatchNormDEx

func (*BatchNormDEx) Backward

func (b *BatchNormDEx) Backward(
	h *Handle,
	alphadata, betadata, alphaparam, betaparam float64,
	xD *TensorD,
	x cutil.Mem,
	yD *TensorD,
	y cutil.Mem,
	dyD *TensorD,
	dy cutil.Mem,
	dzD *TensorD,
	dz cutil.Mem,
	dxD *TensorD,
	dx cutil.Mem,
	dbnScaleBiasMeanVarDesc *TensorD,
	scale cutil.Mem,
	bias cutil.Mem,
	dscale cutil.Mem,
	dbias cutil.Mem,
	epsilon float64,
	fromresultSaveMean cutil.Mem,
	fromreslutSaveInVariance cutil.Mem,
	actD *ActivationD,
	wspace cutil.Mem,
	wspacesib uint,
	rspace cutil.Mem,
	rspacesib uint,
) error

Backward does the backward ex algorithm.

func (*BatchNormDEx) BackwardUS

func (b *BatchNormDEx) BackwardUS(
	h *Handle,
	alphadata, betadata, alphaparam, betaparam float64,
	xD *TensorD,
	x unsafe.Pointer,
	yD *TensorD,
	y unsafe.Pointer,
	dyD *TensorD,
	dy unsafe.Pointer,
	dzD *TensorD,
	dz unsafe.Pointer,
	dxD *TensorD,
	dx unsafe.Pointer,
	dbnScaleBiasMeanVarDesc *TensorD,
	scale unsafe.Pointer,
	bias unsafe.Pointer,
	dscale unsafe.Pointer,
	dbias unsafe.Pointer,
	epsilon float64,
	fromresultSaveMean unsafe.Pointer,
	fromreslutSaveInVariance unsafe.Pointer,
	actD *ActivationD,
	wspace unsafe.Pointer,
	wspacesib uint,
	rspace unsafe.Pointer,
	rspacesib uint,
) error

BackwardUS is just like Backward but uses unsafe.Pointers instead of cutil.Mem.

func (*BatchNormDEx) DeriveBNTensorDescriptor

func (b *BatchNormDEx) DeriveBNTensorDescriptor(xDesc *TensorD) (bndesc *TensorD, err error)

DeriveBNTensorDescriptor derives a tensor used for the batch norm operation

func (*BatchNormDEx) ForwardInference

func (b *BatchNormDEx) ForwardInference(
	handle *Handle,
	alpha, beta float64,
	xD *TensorD,
	x cutil.Mem,
	yD *TensorD,
	y cutil.Mem,
	ScaleBiasMeanVarDesc *TensorD,
	scale, bias, estimatedMean, estimatedVariance cutil.Mem,
	epsilon float64,

) error

ForwardInference info was pulled from cudnn documentation

This function performs the forward BatchNormalization layer computation for the inference phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.

Notes:

1)Only 4D and 5D tensors are supported.

2)The input transformation performed by this function is defined as: y := alpha*y + beta *(bnScale * (x-estimatedMean)/sqrt(epsilon + estimatedVariance)+bnBias)

3)The epsilon value has to be the same during training, backpropagation and inference.

4)For the training phase use cudnnBatchNormalizationForwardTraining.

5)Much higher performance when HW-packed tensors are used for all of x, dy, dx.

Parameters:

handle(input): Handle to a previously created cuDNN library descriptor.

mode(input): Mode of operation (spatial or per-activation). BatchNormMode

alpha, beta (input): Pointers to scaling factors (in host memory) used to blend the layer output value with prior value in the destination tensor as follows:

dstValue = alpha[0]*resultValue + beta[0]*priorDstValue. Please refer to this section for additional details.

xDesc, yDesc, x, y: Tensor descriptors and pointers in device memory for the layer's x and y data.

bnScaleBiasMeanVarDesc, bnScaleData, bnBiasData(inputs): Tensor descriptor and pointers in device memory for the batch normalization scale and bias parameters

(in the original paper bias is referred to as beta and scale as gamma).

estimatedMean, estimatedVariance (inputs): Mean and variance tensors (these have the same descriptor as the bias and scale).

It is suggested that resultRunningMean, resultRunningVariance from the cudnnBatchNormalizationForwardTraining
call accumulated during the training phase are passed as inputs here.

epsilon(input): Epsilon value used in the batch normalization formula. Minimum allowed value is CUDNN_BN_MIN_EPSILON defined in cudnn.h.

Possible error values returned by this function and their meanings are listed below.

Returns

CUDNN_STATUS_SUCCESS

The computation was performed successfully.

CUDNN_STATUS_NOT_SUPPORTED

The function does not support the provided configuration.

CUDNN_STATUS_BAD_PARAM

At least one of the following conditions are met:

    One of the pointers alpha, beta, x, y, bnScaleData, bnBiasData, estimatedMean, estimatedInvVariance is NULL.
    Number of xDesc or yDesc tensor descriptor dimensions is not within the [4,5] range.
    bnScaleBiasMeanVarDesc dimensions are not 1xC(x1)x1x1 for spatial or 1xC(xD)xHxW for per-activation mode (parenthesis for 5D).
    epsilon value is less than CUDNN_BN_MIN_EPSILON
    Dimensions or data types mismatch for xDesc, yDesc

Performs Batch Normalization during inference: y[i] = bnScale[k]*(x[i]-estimatedMean[k])/sqrt(epsilon+estimatedVariance[k]) + bnBias[k], with bnScale, bnBias, runningMean, runningInvVariance tensors indexed according to spatial or per-activation mode. Refer to cudnnBatchNormalizationForwardTraining above for notes on function arguments.

func (*BatchNormDEx) ForwardInferenceUS

func (b *BatchNormDEx) ForwardInferenceUS(
	handle *Handle,
	alpha, beta float64,
	xD *TensorD,
	x unsafe.Pointer,
	yD *TensorD,
	y unsafe.Pointer,
	ScaleBiasMeanVarDesc *TensorD,
	scale, bias, estimatedMean, estimatedVariance unsafe.Pointer,
	epsilon float64,

) error

ForwardInferenceUS is just like ForwardInference but uses unsafe.Pointers instead of cutil.Mem

func (*BatchNormDEx) ForwardTraining

func (b *BatchNormDEx) ForwardTraining(
	h *Handle,
	alpha, beta float64,
	xD *TensorD,
	x cutil.Mem,
	zD *TensorD,
	z cutil.Mem,
	yD *TensorD,
	y cutil.Mem,
	bnScaleBiasMeanVarDesc *TensorD,
	scale cutil.Mem,
	bias cutil.Mem,
	expoAverageFactor float64,
	resultRunningMean cutil.Mem,
	resultRunningVariance cutil.Mem,
	epsilon float64,
	resultSaveMean cutil.Mem,
	reslutSaveInVariance cutil.Mem,
	actD *ActivationD,
	wspace cutil.Mem,
	wspacesib uint,
	rspace cutil.Mem,
	rspacesib uint,
) error

ForwardTraining does the forward training ex algorithm.

func (*BatchNormDEx) ForwardTrainingUS

func (b *BatchNormDEx) ForwardTrainingUS(
	h *Handle,
	alpha, beta float64,
	xD *TensorD,
	x unsafe.Pointer,
	zD *TensorD,
	z unsafe.Pointer,
	yD *TensorD,
	y unsafe.Pointer,
	bnScaleBiasMeanVarDesc *TensorD,
	scale unsafe.Pointer,
	bias unsafe.Pointer,
	expoAverageFactor float64,
	resultRunningMean unsafe.Pointer,
	resultRunningVariance unsafe.Pointer,
	epsilon float64,
	resultSaveMean unsafe.Pointer,
	reslutSaveInVariance unsafe.Pointer,
	actD *ActivationD,
	wspace unsafe.Pointer,
	wspacesib uint,
	rspace unsafe.Pointer,
	rspacesib uint,
) error

ForwardTrainingUS is like ForwardTraining but uses unsafe.Pointers instead of cutil.Mems.

func (*BatchNormDEx) GeBackwardWorkspaceSize

func (b *BatchNormDEx) GeBackwardWorkspaceSize(
	h *Handle,
	xD, yD, dyD, dzD, dxD, dbnScaleBiasMeanVarDesc *TensorD,
	actD *ActivationD,
) (wspaceSIB uint, err error)

GeBackwardWorkspaceSize gets the workspace size in bytes for the backward operation

func (*BatchNormDEx) Get

func (b *BatchNormDEx) Get() (mode BatchNormMode, op BatchNormOps, err error)

Get gets the BatchNormMode and BatchNormOps held in the descriptor

func (*BatchNormDEx) GetForwardTrainingWorkspaceSize

func (b *BatchNormDEx) GetForwardTrainingWorkspaceSize(h *Handle,
	mode BatchNormMode,
	op BatchNormOps,
	xD, zD, yD, bnScaleBiasMeanVarDesc *TensorD,
	actD *ActivationD) (wspaceSIB uint, err error)

GetForwardTrainingWorkspaceSize gets the forward training ex workspacesize

func (*BatchNormDEx) GetTrainingReserveSpaceSize

func (b *BatchNormDEx) GetTrainingReserveSpaceSize(h *Handle,
	actD *ActivationD,
	xD *TensorD,
) (rspaceSIB uint, err error)

GetTrainingReserveSpaceSize gets the reserve space size for ex operation

func (*BatchNormDEx) MinEpsilon

func (b *BatchNormDEx) MinEpsilon() float64

MinEpsilon is the Minimum Epsilon required. It is now zero, but it used to be 1e-5

func (*BatchNormDEx) Set

func (b *BatchNormDEx) Set(mode BatchNormMode, op BatchNormOps) error

Set sets the BatchNormMode and BatchNormOps held in the descriptor

func (*BatchNormDEx) String

func (b *BatchNormDEx) String() string

type BatchNormMode

type BatchNormMode C.cudnnBatchNormMode_t

BatchNormMode used for BatchNormMode Flags

func (*BatchNormMode) PerActivation

func (b *BatchNormMode) PerActivation() BatchNormMode

PerActivation sets b to BatchNormMode(C.CUDNN_BATCHNORM_PER_ACTIVATION) and returns that new value Normalization is performed per-activation. This mode is intended to be used after the non-convolutional network layers. In this mode, the tensor dimensions of bnBias and bnScale and the parameters used in the cudnnBatchNormalization* functions, are 1xCxHxW.

func (*BatchNormMode) Spatial

func (b *BatchNormMode) Spatial() BatchNormMode

Spatial sets b to BatchNormMode(C.CUDNN_BATCHNORM_SPATIAL) and returns that new value. Normalization is performed over N+spatial dimensions. This mode is intended for use after convolutional layers (where spatial invariance is desired). In this mode the bnBias and bnScale tensor dimensions are 1xCx1x1.

func (*BatchNormMode) SpatialPersistent

func (b *BatchNormMode) SpatialPersistent() BatchNormMode

SpatialPersistent sets b to BatchNormMode(C.CUDNN_BATCHNORM_SPATIAL_PERSISTENT) and returns that new value This mode is similar to CUDNN_BATCHNORM_SPATIAL but it can be faster for some tasks.

func (BatchNormMode) String

func (b BatchNormMode) String() string

type BatchNormOps

type BatchNormOps C.cudnnBatchNormOps_t

BatchNormOps are flags for BatchNormOps when needed

func (*BatchNormOps) Activation

func (b *BatchNormOps) Activation() BatchNormOps

Activation sets b to BatchNormOps(C.CUDNN_BATCHNORM_OPS_BN_ACTIVATION) /* do batchNorm, then activation */

func (*BatchNormOps) AddActivation

func (b *BatchNormOps) AddActivation() BatchNormOps

AddActivation sets b to BatchNormOps(C.CUDNN_BATCHNORM_OPS_BN_ADD_ACTIVATION) /* do batchNorm, then elemWiseAdd, then activation */

func (*BatchNormOps) Normal

func (b *BatchNormOps) Normal() BatchNormOps

Normal sets b to BatchNormOps(C.CUDNN_BATCHNORM_OPS_BN) and returns that new value /* do batch normalization only */

func (BatchNormOps) String

func (b BatchNormOps) String() string

type CTCLossAlgo

type CTCLossAlgo C.cudnnCTCLossAlgo_t

CTCLossAlgo used to hold flags

func (CTCLossAlgo) Algo

func (c CTCLossAlgo) Algo() Algorithm

Algo returns an algo

func (*CTCLossAlgo) Deterministic

func (c *CTCLossAlgo) Deterministic() CTCLossAlgo

Deterministic sets c to and returns CTCLossAlgo(C.CUDNN_CTC_LOSS_ALGO_DETERMINISTIC)

func (*CTCLossAlgo) NonDeterministic

func (c *CTCLossAlgo) NonDeterministic() CTCLossAlgo

NonDeterministic sets c to and returns CTCLossAlgo(C.CUDNN_CTC_LOSS_ALGO_NON_DETERMINISTIC) Flag

func (CTCLossAlgo) String

func (c CTCLossAlgo) String() string

type CTCLossD

type CTCLossD struct {
	// contains filtered or unexported fields
}

CTCLossD holds the C.cudnnCTCLossDescriptor_t.

func CreateCTCLossDescriptor

func CreateCTCLossDescriptor() (*CTCLossD, error)

CreateCTCLossDescriptor creates a CTCLossD.

func (*CTCLossD) CTCLoss

func (c *CTCLossD) CTCLoss(
	handle *Handle,
	probsD *TensorD,
	probs cutil.Mem,
	labels []int32,
	labelLengths []int32,
	inputLengths []int32,
	costs cutil.Mem,
	gradientsD *TensorD,
	gradients cutil.Mem,
	algo CTCLossAlgo,
	wspace cutil.Mem,
	wspacesize uint,
) error

CTCLoss calculates loss

func (*CTCLossD) CTCLossUS

func (c *CTCLossD) CTCLossUS(
	handle *Handle,
	probsD *TensorD, probs unsafe.Pointer,
	labels []int32,
	labelLengths []int32,
	inputLengths []int32,
	costs unsafe.Pointer,
	gradientsD *TensorD, gradients unsafe.Pointer,
	algo CTCLossAlgo,
	wspace unsafe.Pointer, wspacesize uint,
) error

CTCLossUS is like CTCLoss but uses unsafe.Pointer instead of cutil.Mem

func (*CTCLossD) Destroy

func (c *CTCLossD) Destroy() error

Destroy destroys the descriptor inside CTCLossD if Go's gc is not in use. If gc is being used, Destroy will just return nil.

func (*CTCLossD) Get

func (c *CTCLossD) Get() (DataType, error)

Get returns the datatype and error

func (*CTCLossD) GetWorkspaceSize

func (c *CTCLossD) GetWorkspaceSize(
	handle *Handle,
	probsD *TensorD,
	gradientsD *TensorD,
	labels []int32,
	labelLengths []int32,
	inputLengths []int32,
	algo CTCLossAlgo,
) (uint, error)

GetWorkspaceSize calculates workspace size

func (*CTCLossD) Set

func (c *CTCLossD) Set(data DataType) error

Set sets the CTCLossD

type ConvBwdDataAlgo

type ConvBwdDataAlgo C.cudnnConvolutionBwdDataAlgo_t

ConvBwdDataAlgo is used for flags in the backward data algorithms, exposing them through methods.

func (ConvBwdDataAlgo) Algo

func (c ConvBwdDataAlgo) Algo() Algorithm

Algo returns an Algorithm struct

func (*ConvBwdDataAlgo) Algo0

func (c *ConvBwdDataAlgo) Algo0() ConvBwdDataAlgo

Algo0 sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_0) and returns value of c /* non-deterministic */

func (*ConvBwdDataAlgo) Algo1

func (c *ConvBwdDataAlgo) Algo1() ConvBwdDataAlgo

Algo1 sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_1) and returns value of c

func (*ConvBwdDataAlgo) Count

func (c *ConvBwdDataAlgo) Count() ConvBwdDataAlgo

Count sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_COUNT) and returns value of c

func (*ConvBwdDataAlgo) FFT

FFT sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT) and returns value of c

func (*ConvBwdDataAlgo) FFTTiling

func (c *ConvBwdDataAlgo) FFTTiling() ConvBwdDataAlgo

FFTTiling sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING) and returns value of c

func (ConvBwdDataAlgo) String

func (c ConvBwdDataAlgo) String() string

func (*ConvBwdDataAlgo) Winograd

func (c *ConvBwdDataAlgo) Winograd() ConvBwdDataAlgo

Winograd sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD) and returns value of c

func (*ConvBwdDataAlgo) WinogradNonFused

func (c *ConvBwdDataAlgo) WinogradNonFused() ConvBwdDataAlgo

WinogradNonFused sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED) and returns value of c

type ConvBwdDataAlgoPerformance

type ConvBwdDataAlgoPerformance struct {
	Algo        ConvBwdDataAlgo `json:"algo,omitempty"`
	Status      Status          `json:"status,omitempty"`
	Time        float32         `json:"time,omitempty"`
	Memory      uint            `json:"memory,omitempty"`
	Determinism Determinism     `json:"determinism,omitempty"`
	MathType    MathType        `json:"math_type,omitempty"`
}

ConvBwdDataAlgoPerformance is the return struct in the finding algorithm funcs

func (ConvBwdDataAlgoPerformance) String

func (cb ConvBwdDataAlgoPerformance) String() string

type ConvBwdDataPref

type ConvBwdDataPref C.cudnnConvolutionBwdDataPreference_t

ConvBwdDataPref used for flags on bwddatapref exposing them through methods

func (*ConvBwdDataPref) NoWorkSpace

func (c *ConvBwdDataPref) NoWorkSpace() ConvBwdDataPref

NoWorkSpace sets c to ConvBwdDataPref(C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns value of c

func (*ConvBwdDataPref) PreferFastest

func (c *ConvBwdDataPref) PreferFastest() ConvBwdDataPref

PreferFastest sets c to ConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns value of c

func (*ConvBwdDataPref) SpecifyWorkSpaceLimit

func (c *ConvBwdDataPref) SpecifyWorkSpaceLimit() ConvBwdDataPref

SpecifyWorkSpaceLimit sets c to ConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT)and returns value of c

type ConvBwdFiltAlgo

type ConvBwdFiltAlgo C.cudnnConvolutionBwdFilterAlgo_t

ConvBwdFiltAlgo Used for ConvBwdFiltAlgo flags exposing them through methods

func (ConvBwdFiltAlgo) Algo

func (c ConvBwdFiltAlgo) Algo() Algorithm

Algo returns an Algorithm Struct

func (*ConvBwdFiltAlgo) Algo0

func (c *ConvBwdFiltAlgo) Algo0() ConvBwdFiltAlgo

Algo0 sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0) and returns value of c /* non-deterministic */

func (*ConvBwdFiltAlgo) Algo1

func (c *ConvBwdFiltAlgo) Algo1() ConvBwdFiltAlgo

Algo1 sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1) and returns value of c

func (*ConvBwdFiltAlgo) Algo3

func (c *ConvBwdFiltAlgo) Algo3() ConvBwdFiltAlgo

Algo3 sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3) and returns value of c

func (*ConvBwdFiltAlgo) Count

func (c *ConvBwdFiltAlgo) Count() ConvBwdFiltAlgo

Count sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_COUNT) and returns value of c

func (*ConvBwdFiltAlgo) FFT

FFT sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT) and returns value of c

func (*ConvBwdFiltAlgo) FFTTiling

func (c *ConvBwdFiltAlgo) FFTTiling() ConvBwdFiltAlgo

FFTTiling sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT_TILING) and returns value of c

func (ConvBwdFiltAlgo) String

func (c ConvBwdFiltAlgo) String() string

func (*ConvBwdFiltAlgo) Winograd

func (c *ConvBwdFiltAlgo) Winograd() ConvBwdFiltAlgo

Winograd sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD) and returns value of c

func (*ConvBwdFiltAlgo) WinogradNonFused

func (c *ConvBwdFiltAlgo) WinogradNonFused() ConvBwdFiltAlgo

WinogradNonFused sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD_NONFUSED) and returns value of c

type ConvBwdFiltAlgoPerformance

type ConvBwdFiltAlgoPerformance struct {
	Algo        ConvBwdFiltAlgo `json:"algo,omitempty"`
	Status      Status          `json:"status,omitempty"`
	Time        float32         `json:"time,omitempty"`
	Memory      uint            `json:"memory,omitempty"`
	Determinism Determinism     `json:"determinism,omitempty"`
	MathType    MathType        `json:"math_type,omitempty"`
}

ConvBwdFiltAlgoPerformance is the return struct in the finding algorithm funcs

func (ConvBwdFiltAlgoPerformance) String

func (cb ConvBwdFiltAlgoPerformance) String() string

type ConvBwdFilterPref

type ConvBwdFilterPref C.cudnnConvolutionBwdFilterPreference_t

ConvBwdFilterPref is used for flags for the backward filters, exposing them through methods.

func (*ConvBwdFilterPref) NoWorkSpace

func (c *ConvBwdFilterPref) NoWorkSpace() ConvBwdFilterPref

NoWorkSpace sets c to ConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE) and returns value of c

func (*ConvBwdFilterPref) PreferFastest

func (c *ConvBwdFilterPref) PreferFastest() ConvBwdFilterPref

PreferFastest sets c to ConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST) and returns value of c

func (*ConvBwdFilterPref) SpecifyWorkSpaceLimit

func (c *ConvBwdFilterPref) SpecifyWorkSpaceLimit() ConvBwdFilterPref

SpecifyWorkSpaceLimit sets c to ConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT) and returns value of c

type ConvFwdAlgo

type ConvFwdAlgo C.cudnnConvolutionFwdAlgo_t

ConvFwdAlgo flags for cudnnConvFwdAlgo_t exposing them through methods

func (ConvFwdAlgo) Algo

func (c ConvFwdAlgo) Algo() Algorithm

Algo returns an Algorithm Struct

func (*ConvFwdAlgo) Count

func (c *ConvFwdAlgo) Count() ConvFwdAlgo

Count sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_COUNT) and returns value of c

func (*ConvFwdAlgo) Direct

func (c *ConvFwdAlgo) Direct() ConvFwdAlgo

Direct sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_DIRECT) and returns value of c

func (*ConvFwdAlgo) FFT

func (c *ConvFwdAlgo) FFT() ConvFwdAlgo

FFT sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT) and returns value of c

func (*ConvFwdAlgo) FFTTiling

func (c *ConvFwdAlgo) FFTTiling() ConvFwdAlgo

FFTTiling sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING) and returns value of c

func (*ConvFwdAlgo) Gemm

func (c *ConvFwdAlgo) Gemm() ConvFwdAlgo

Gemm sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_GEMM) and returns value of c

func (*ConvFwdAlgo) ImplicitGemm

func (c *ConvFwdAlgo) ImplicitGemm() ConvFwdAlgo

ImplicitGemm sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM) and returns value of c

func (*ConvFwdAlgo) ImplicitPrecompGemm

func (c *ConvFwdAlgo) ImplicitPrecompGemm() ConvFwdAlgo

ImplicitPrecompGemm sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM) and returns value of c

func (ConvFwdAlgo) String

func (c ConvFwdAlgo) String() string

func (*ConvFwdAlgo) WinoGrad

func (c *ConvFwdAlgo) WinoGrad() ConvFwdAlgo

WinoGrad sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD) and returns value of c

func (*ConvFwdAlgo) WinoGradNonFused

func (c *ConvFwdAlgo) WinoGradNonFused() ConvFwdAlgo

WinoGradNonFused sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED) and returns value of c

type ConvFwdAlgoPerformance

type ConvFwdAlgoPerformance struct {
	Algo        ConvFwdAlgo `json:"algo,omitempty"`
	Status      Status      `json:"status,omitempty"`
	Time        float32     `json:"time,omitempty"`
	Memory      uint        `json:"memory,omitempty"`
	Determinism Determinism `json:"determinism,omitempty"`
	MathType    MathType    `json:"math_type,omitempty"`
}

ConvFwdAlgoPerformance is a struct that holds the performance of the algorithm

func (ConvFwdAlgoPerformance) String

func (cb ConvFwdAlgoPerformance) String() string

type ConvolutionD

type ConvolutionD struct {
	// contains filtered or unexported fields
}

ConvolutionD sets all the convolution info

func CreateConvolutionDescriptor

func CreateConvolutionDescriptor() (*ConvolutionD, error)

CreateConvolutionDescriptor creates a convolution descriptor

func (*ConvolutionD) BackwardBias

func (c *ConvolutionD) BackwardBias(
	handle *Handle,
	alpha float64,
	dyD *TensorD,
	dy cutil.Mem,
	beta float64,
	dbD *TensorD,
	db cutil.Mem) error

BackwardBias is used to compute the bias gradient for batch convolution. db is returned.

func (*ConvolutionD) BackwardBiasUS

func (c *ConvolutionD) BackwardBiasUS(
	handle *Handle,
	alpha float64,
	dyD *TensorD, dy unsafe.Pointer,
	beta float64,
	dbD *TensorD, db unsafe.Pointer) error

BackwardBiasUS is like BackwardBias but using unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) BackwardData

func (c *ConvolutionD) BackwardData(
	handle *Handle,
	alpha float64,
	wD *FilterD, w cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	algo ConvBwdDataAlgo,
	wspace cutil.Mem, wspaceSIB uint,
	beta float64,
	dxD *TensorD, dx cutil.Mem,
) error

BackwardData does the backwards convolution on data

This function computes the convolution data gradient of the tensor dy, where y is the output of the forward convolution in (*ConvolutionD)Forward(). It uses the specified algo, and returns the results in the output tensor dx. Scaling factors alpha and beta can be used to scale the computed result or accumulate with the current dx.

Parameters:

---
handle(input):

Previously created Handle.
---
alpha, beta(input):

Scaling factors (in host memory) used to blend the computation result with the prior
value in the output layer as follows: dstValue = alpha*result + beta*priorDstValue.
---
wD(input):

Previously set filter descriptor.
---
w(input):

Data pointer to GPU memory associated with the filter descriptor wD.
---
dyD(input):

Previously set input tensor descriptor of dy.
---
dy(input):

Data pointer to GPU memory associated with the input tensor descriptor. (Holds the back propagated errors.)
---
algo(input):

Enumerant that specifies which backward data convolution algorithm should be used to compute the results.
---
wspace, wspaceSIB(inputs):

Data pointer and size in bytes of the workspace needed for algo. If no workspace is needed, nil can be passed.
---
dxD(input):

Previously set output tensor descriptor of dx.
---
dx(input/output):

Data pointer to GPU memory associated with the output tensor descriptor. (Holds the back propagated errors for the layer it received its forward inputs from.)
---

Supported Configurations

---
Config: "TRUE_HALF_CONFIG (only compute capability 5.3 and later)."
TensorD (wD,dyD,dxD): (*DataType)Half()
ConvolutionD: (*DataType)Half()
---
Config: "PSEUDO_HALF_CONFIG"
TensorD (wD,dyD,dxD): (*DataType)Half()
ConvolutionD: (*DataType)Float()
---
Config: "FLOAT_CONFIG"
TensorD (wD,dyD,dxD): (*DataType)Float()
ConvolutionD: (*DataType)Float()
---
Config: "DOUBLE_CONFIG"
TensorD (wD,dyD,dxD): (*DataType)Double()
ConvolutionD: (*DataType)Double()
---

Note: Specifying a separate algorithm can cause changes in performance, support and computation determinism.

A table of algorithms with their supported configs can be found at the link below. (gocudnn flag names are similar to cudnn's.)

https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnConvolutionBackwardData

Possible Error Returns:

nil:

The function launched successfully.

CUDNN_STATUS_NOT_SUPPORTED:

At least one of the following conditions is met:
1)	dyD or dxD have negative tensor striding
2)	dyD, wD or dxD has a number of dimensions that is not 4 or 5
3)	The chosen algo does not support the parameters provided; see above for exhaustive list of parameter support for each algo
4)	dyD or wD indicate an output channel count that isn't a multiple of group count (if group count has been set in ConvolutionD).

CUDNN_STATUS_BAD_PARAM:

At least one of the following conditions is met:
1)	At least one of the following is NULL: handle, dyD, wD, ConvolutionD, dxD, dy, w, dx, alpha, beta
2)	wD and dyD have a non-matching number of dimensions
3)	wD and dxD have a non-matching number of dimensions
4)	wD has fewer than three dimensions
5)	wD, dxD and dyD have non-matching data types.
6)	wD and dxD have a non-matching number of input feature maps per image (or group, in the case of grouped convolutions).
7)	dyD's spatial sizes do not match the expected size as determined by (*ConvolutionD)GetOutputDims().

CUDNN_STATUS_MAPPING_ERROR:

An error occurs during the texture binding of the filter data or the input differential tensor data

CUDNN_STATUS_EXECUTION_FAILED:

The function failed to launch on the GPU.
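
As a sketch of how these pieces fit together (not from the upstream docs): pick an algo, size the workspace, then launch. The helper below is hypothetical; it assumes gocudnn, cutil, and cudart/crtutil are imported, that the descriptors and device buffers already exist, and that ConvBwdDataPref exposes a PreferFastest method like the other preference flag types on this page.

	func backwardDataSketch(handle *gocudnn.Handle, convD *gocudnn.ConvolutionD,
		wD *gocudnn.FilterD, w cutil.Mem,
		dyD, dxD *gocudnn.TensorD, dy, dx cutil.Mem,
		alloc *crtutil.Allocator) error {
		var pref gocudnn.ConvBwdDataPref // assumed flag type; see GetBackwardDataAlgorithm below
		algo, err := convD.GetBackwardDataAlgorithm(handle, wD, dyD, dxD, pref.PreferFastest(), 0)
		if err != nil {
			return err
		}
		wspaceSIB, err := convD.GetBackwardDataWorkspaceSize(handle, wD, dyD, dxD, algo)
		if err != nil {
			return err
		}
		wspace, err := alloc.AllocateMemory(wspaceSIB) // hypothetical allocation step
		if err != nil {
			return err
		}
		// dx = 1*convBwdData(w, dy) + 0*dx
		return convD.BackwardData(handle, 1, wD, w, dyD, dy, algo, wspace, wspaceSIB, 0, dxD, dx)
	}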

func (*ConvolutionD) BackwardDataUS

func (c *ConvolutionD) BackwardDataUS(
	handle *Handle,
	alpha float64,
	wD *FilterD, w unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	algo ConvBwdDataAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer,
) error

BackwardDataUS is like BackwardData but uses unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) BackwardFilter

func (c *ConvolutionD) BackwardFilter(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	algo ConvBwdFiltAlgo,
	wspace cutil.Mem, wspacesize uint,
	beta float64,
	dwD *FilterD, dw cutil.Mem,
) error

BackwardFilter does the backwards convolution

func (*ConvolutionD) BackwardFilterUS

func (c *ConvolutionD) BackwardFilterUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	algo ConvBwdFiltAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	beta float64,
	dwD *FilterD, dw unsafe.Pointer,
) error

BackwardFilterUS is like BackwardFilter but using unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) BiasActivationForward

func (c *ConvolutionD) BiasActivationForward(
	handle *Handle,
	alpha1 float64,
	xD *TensorD, x cutil.Mem,
	wD *FilterD, w cutil.Mem,
	algo ConvFwdAlgo,
	wspace cutil.Mem,
	wspacesize uint,
	alpha2 float64,
	zD *TensorD, z cutil.Mem,
	biasD *TensorD, bias cutil.Mem,
	aD *ActivationD,
	yD *TensorD, y cutil.Mem,
) error

BiasActivationForward info can be found at:

https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnConvolutionBiasActivationForward

Fused conv/bias/activation operation : y = Act( alpha1 * conv(x) + alpha2 * z + bias )
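
A minimal call sketch, assuming the descriptors, buffers, algo, and workspace are already set up as for the plain Forward path; with alpha2 = 0 the z term drops out of the blend:

	// y = Act( 1*conv(x,w) + 0*z + bias )
	err := convD.BiasActivationForward(handle,
		1, xD, x,
		wD, w,
		algo, wspace, wspacesize,
		0, zD, z,
		biasD, bias,
		aD,
		yD, y)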

func (*ConvolutionD) BiasActivationForwardUS

func (c *ConvolutionD) BiasActivationForwardUS(
	handle *Handle,
	alpha1 float64,
	xD *TensorD, x unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	algo ConvFwdAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	alpha2 float64,
	zD *TensorD, z unsafe.Pointer,
	biasD *TensorD, bias unsafe.Pointer,
	aD *ActivationD,
	yD *TensorD, y unsafe.Pointer,
) error

BiasActivationForwardUS is like BiasActivationForward but using unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) Destroy

func (c *ConvolutionD) Destroy() error

Destroy destroys the ConvolutionDescriptor. If GC is set then it only returns nil. Currently GC is set with no option to turn it off.

func (*ConvolutionD) FindBackwardDataAlgorithm

func (c *ConvolutionD) FindBackwardDataAlgorithm(
	handle *Handle,
	w *FilterD,
	dy *TensorD,
	dx *TensorD,
) ([]ConvBwdDataAlgoPerformance, error)

FindBackwardDataAlgorithm will find the top performing algorithms and return them in ascending order (fastest first).

func (*ConvolutionD) FindBackwardDataAlgorithmEx

func (c *ConvolutionD) FindBackwardDataAlgorithmEx(
	handle *Handle,
	wD *FilterD, w cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	dxD *TensorD, dx cutil.Mem,
	wspace cutil.Mem, wspacesize uint) ([]ConvBwdDataAlgoPerformance, error)

FindBackwardDataAlgorithmEx finds the top performing algorithms by running them with the actual memory passed in.

func (*ConvolutionD) FindBackwardDataAlgorithmExUS

func (c *ConvolutionD) FindBackwardDataAlgorithmExUS(
	handle *Handle,
	wD *FilterD, w unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	dxD *TensorD, dx unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint) ([]ConvBwdDataAlgoPerformance, error)

FindBackwardDataAlgorithmExUS is just like FindBackwardDataAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) FindBackwardFilterAlgorithm

func (c *ConvolutionD) FindBackwardFilterAlgorithm(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
) ([]ConvBwdFiltAlgoPerformance, error)

FindBackwardFilterAlgorithm will find the top performing algorithms and return them in ascending order (fastest first) as ConvBwdFiltAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*ConvolutionD) FindBackwardFilterAlgorithmEx

func (c *ConvolutionD) FindBackwardFilterAlgorithmEx(
	handle *Handle,
	xD *TensorD, x cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	dwD *FilterD, dw cutil.Mem,
	wspace cutil.Mem, wspacesize uint) ([]ConvBwdFiltAlgoPerformance, error)

FindBackwardFilterAlgorithmEx finds the top performing algorithms by running them with the actual memory passed in.

func (*ConvolutionD) FindBackwardFilterAlgorithmExUS

func (c *ConvolutionD) FindBackwardFilterAlgorithmExUS(
	handle *Handle,
	xD *TensorD, x unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	dwD *FilterD, dw unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint) ([]ConvBwdFiltAlgoPerformance, error)

FindBackwardFilterAlgorithmExUS is just like FindBackwardFilterAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) FindForwardAlgorithm

func (c *ConvolutionD) FindForwardAlgorithm(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
) ([]ConvFwdAlgoPerformance, error)

FindForwardAlgorithm will find the top performing algorithms and return them in ascending order (fastest first) as ConvFwdAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)
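
A sketch of the usual selection pattern (handle and the descriptors are assumed to already exist); since the results come back fastest first, index 0 is the candidate to try:

	perfs, err := convD.FindForwardAlgorithm(handle, xD, wD, yD)
	if err != nil {
		return err
	}
	// fastest entry; inspect Status, Memory, and Determinism before committing
	best := perfs[0]
	algo := best.Algo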

func (*ConvolutionD) FindForwardAlgorithmEx

func (c *ConvolutionD) FindForwardAlgorithmEx(
	handle *Handle,
	xD *TensorD, x cutil.Mem,
	wD *FilterD, w cutil.Mem,
	yD *TensorD, y cutil.Mem,
	wspace cutil.Mem, wspacesize uint) ([]ConvFwdAlgoPerformance, error)

FindForwardAlgorithmEx finds the top performing algorithms by running them with the actual memory passed in.

func (*ConvolutionD) FindForwardAlgorithmExUS

func (c *ConvolutionD) FindForwardAlgorithmExUS(
	handle *Handle,
	xD *TensorD,
	x unsafe.Pointer,
	wD *FilterD,
	w unsafe.Pointer,
	yD *TensorD,
	y unsafe.Pointer,
	wspace unsafe.Pointer,
	wspacesize uint) ([]ConvFwdAlgoPerformance, error)

FindForwardAlgorithmExUS is like FindForwardAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) Forward

func (c *ConvolutionD) Forward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	wD *FilterD, w cutil.Mem,
	algo ConvFwdAlgo,
	wspace cutil.Mem, wspacesize uint,
	beta float64,
	yD *TensorD, y cutil.Mem) error

Forward performs the forward pass for batch convolution.

func (*ConvolutionD) ForwardUS

func (c *ConvolutionD) ForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	algo ConvFwdAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	beta float64,
	yD *TensorD, y unsafe.Pointer) error

ForwardUS is like Forward but using unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) Get

func (c *ConvolutionD) Get() (mode ConvolutionMode, data DataType, pad []int32, stride []int32, dilation []int32, err error)

Get returns the values used to make the convolution descriptor

func (*ConvolutionD) GetBackwardDataAlgorithm

func (c *ConvolutionD) GetBackwardDataAlgorithm(
	handle *Handle,
	wD *FilterD,
	dyD *TensorD,
	dxD *TensorD,
	pref ConvBwdDataPref, wspaceSIBlimit uint) (ConvBwdDataAlgo, error)

GetBackwardDataAlgorithm - This function serves as a heuristic for obtaining the best suited algorithm for (*ConvolutionD)BackwardData() for the given layer specifications. Based on the input preference, this function will either return the fastest algorithm or the fastest algorithm within a given memory limit. For an exhaustive search for the fastest algorithm, please use (*ConvolutionD)FindBackwardDataAlgorithm().

Parameters:

---
handle(input):

Handle to a previously created cuDNN context.
---
wD(input):

Handle to a previously initialized filter descriptor.
---
dyD(input):

Handle to the previously initialized input differential tensor descriptor.
---
dxD(input):

Handle to the previously initialized output tensor descriptor.
---
pref(input):

Enumerant to express the preference criteria in terms of memory requirement and speed.
---
wspaceSIBlimit(input):

Specifies the maximum amount of GPU memory the user is willing to use as a workspace.
This is currently a placeholder and is not used.
---
returns:

ConvBwdDataAlgo and error.
---

Possible Error Returns:

nil:

The function launched successfully.

CUDNN_STATUS_BAD_PARAM:

At least one of these conditions is met:
1) The numbers of feature maps of the input tensor and output tensor differ.
2) The DataType of the tensor descriptors or the filter are different.

func (*ConvolutionD) GetBackwardDataAlgorithmV7

func (c *ConvolutionD) GetBackwardDataAlgorithmV7(
	handle *Handle,
	wD *FilterD,
	dyD *TensorD,
	dxD *TensorD,
) ([]ConvBwdDataAlgoPerformance, error)

GetBackwardDataAlgorithmV7 - This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionBackwardData for the given layer specifications. This function will return all algorithms (including MathType where available) sorted by expected relative performance (based on an internal heuristic), with the fastest being index 0 of perfResults. For an exhaustive search for the fastest algorithm, please use (*ConvolutionD)FindBackwardDataAlgorithm().

func (*ConvolutionD) GetBackwardDataWorkspaceSize

func (c *ConvolutionD) GetBackwardDataWorkspaceSize(
	handle *Handle,
	wD *FilterD,
	dyD *TensorD,
	dxD *TensorD,
	algo ConvBwdDataAlgo) (uint, error)

GetBackwardDataWorkspaceSize is a helper function that will return the minimum size of the workspace that needs to be passed to the convolution for the given algo.

func (*ConvolutionD) GetBackwardFilterAlgorithm

func (c *ConvolutionD) GetBackwardFilterAlgorithm(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
	pref ConvBwdFilterPref, wsmemlimit uint) (ConvBwdFiltAlgo, error)

GetBackwardFilterAlgorithm gives a good algo with the limits given to it

func (*ConvolutionD) GetBackwardFilterAlgorithmV7

func (c *ConvolutionD) GetBackwardFilterAlgorithmV7(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
) ([]ConvBwdFiltAlgoPerformance, error)

GetBackwardFilterAlgorithmV7 will find the top performing algorithms and return them in ascending order (fastest first) as ConvBwdFiltAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*ConvolutionD) GetBackwardFilterWorkspaceSize

func (c *ConvolutionD) GetBackwardFilterWorkspaceSize(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
	algo ConvBwdFiltAlgo) (uint, error)

GetBackwardFilterWorkspaceSize is a helper function that will return the minimum size of the workspace that needs to be passed to the convolution for the given algo.

func (*ConvolutionD) GetForwardAlgorithm

func (c *ConvolutionD) GetForwardAlgorithm(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
	pref ConvolutionForwardPref,
	wsmemlimit uint) (ConvFwdAlgo, error)

GetForwardAlgorithm gives a good algo with the limits given to it

func (*ConvolutionD) GetForwardAlgorithmV7

func (c *ConvolutionD) GetForwardAlgorithmV7(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
) ([]ConvFwdAlgoPerformance, error)

GetForwardAlgorithmV7 will find the top performing algorithms and return them in ascending order (fastest first) as ConvFwdAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*ConvolutionD) GetForwardWorkspaceSize

func (c *ConvolutionD) GetForwardWorkspaceSize(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
	algo ConvFwdAlgo) (uint, error)

GetForwardWorkspaceSize is a helper function that will return the minimum size of the workspace that needs to be passed to the convolution for the given algo.
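
Putting the helper together with Forward in a hypothetical sketch; alloc stands in for an allocator like the one in cudart/crtutil:

	func forwardSketch(handle *gocudnn.Handle, convD *gocudnn.ConvolutionD,
		xD, yD *gocudnn.TensorD, x, y cutil.Mem,
		wD *gocudnn.FilterD, w cutil.Mem,
		algo gocudnn.ConvFwdAlgo, alloc *crtutil.Allocator) error {
		wspacesize, err := convD.GetForwardWorkspaceSize(handle, xD, wD, yD, algo)
		if err != nil {
			return err
		}
		var wspace cutil.Mem // stays nil when no workspace is needed
		if wspacesize > 0 {
			if wspace, err = alloc.AllocateMemory(wspacesize); err != nil {
				return err
			}
		}
		// y = 1*conv(x,w) + 0*y
		return convD.Forward(handle, 1, xD, x, wD, w, algo, wspace, wspacesize, 0, yD, y)
	}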

func (*ConvolutionD) GetOutputDims

func (c *ConvolutionD) GetOutputDims(input *TensorD, filter *FilterD) ([]int32, error)

GetOutputDims is a helper function that gives the size of the output of a ConvolutionNdForward. Each dimension of the (nbDims-2)-D images of the output tensor is computed as follows:

   outputDim = 1 + ( inputDim + 2*pad - (((filterDim-1)*dilation)+1) )/convolutionStride;

Note: if the input and filter are NHWC, cudnn takes the formats as NCHW and outputs NCHW dims; gocudnn will take those NCHW dims and reorder them into actual NHWC.
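
For example, a 3x3 filter with pad 1, dilation 1, and stride 1 keeps the spatial size (a "same" convolution):

   outputDim = 1 + ( 32 + 2*1 - (((3-1)*1)+1) )/1 = 1 + (32 + 2 - 3) = 32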

func (*ConvolutionD) GetReorderType

func (c *ConvolutionD) GetReorderType() (r Reorder, err error)

GetReorderType gets the reorder type

func (*ConvolutionD) Im2Col

func (c *ConvolutionD) Im2Col(
	handle *Handle,
	xD *TensorD,
	x cutil.Mem,
	wD *FilterD,
	buffer cutil.Mem,
) error

Im2Col transforms the multi-dimensional tensors into 2D tensors to speed up calculation at the cost of memory.

func (*ConvolutionD) Im2ColUS

func (c *ConvolutionD) Im2ColUS(
	handle *Handle,
	xD *TensorD, x unsafe.Pointer,
	wD *FilterD,
	buffer unsafe.Pointer,
) error

Im2ColUS is like Im2Col but uses unsafe.Pointer instead of cutil.Mem

func (*ConvolutionD) Set

func (c *ConvolutionD) Set(mode ConvolutionMode, data DataType, pad, stride, dilation []int32) error

Set sets the convolution descriptor.

The filter layout format works as follows (from the cudnn documentation): if the format input is set to CUDNN_TENSOR_NCHW, which is one of the enumerated values allowed by cudnnTensorFormat_t, then the layout of the filter is as follows:

	For N=4, i.e., for a 4D filter descriptor, the filter layout is in the form of KCRS (K represents the number of output feature maps, C the number of input feature maps, R the number of rows per filter, and S the number of columns per filter.)

	For N=3, i.e., for a 3D filter descriptor, the number S (number of columns per filter) is omitted.

	For N=5 and greater, the layout of the higher dimensions immediately follows RS.

	On the other hand, if this input is set to CUDNN_TENSOR_NHWC, then the layout of the filter is as follows:

	For N=4, i.e., for a 4D filter descriptor, the filter layout is in the form of KRSC.

	For N=3, i.e., for a 3D filter descriptor, the number S (number of columns per filter) is omitted, and the layout of C immediately follows R.

	For N=5 and greater, the layout of the higher dimensions is inserted between S and C. See also the description for cudnnTensorFormat_t.

Note: the length of stride, pad, and dilation needs to be len(tensordims) - 2.
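
A minimal setup sketch for a 4D case (the values are illustrative): a 3x3 cross-correlation in float with pad 1, stride 1, and dilation 1, so each slice has length 4-2 = 2:

	var mode gocudnn.ConvolutionMode
	var dtype gocudnn.DataType
	convD, err := gocudnn.CreateConvolutionDescriptor()
	if err != nil {
		return err
	}
	err = convD.Set(mode.CrossCorrelation(), dtype.Float(),
		[]int32{1, 1}, // pad
		[]int32{1, 1}, // stride
		[]int32{1, 1}) // dilation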

func (*ConvolutionD) SetGroupCount

func (c *ConvolutionD) SetGroupCount(groupCount int32) error

SetGroupCount sets the Group Count

func (*ConvolutionD) SetMathType

func (c *ConvolutionD) SetMathType(mathtype MathType) error

SetMathType sets the mathtype

func (*ConvolutionD) SetReorderType

func (c *ConvolutionD) SetReorderType(r Reorder) error

SetReorderType sets the reorder type

func (*ConvolutionD) String

func (c *ConvolutionD) String() string

type ConvolutionForwardPref

type ConvolutionForwardPref C.cudnnConvolutionFwdPreference_t

ConvolutionForwardPref used for flags exposing them through methods

func (*ConvolutionForwardPref) NoWorkSpace

func (c *ConvolutionForwardPref) NoWorkSpace() ConvolutionForwardPref

NoWorkSpace sets c to ConvolutionForwardPref( C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns value of c

func (*ConvolutionForwardPref) PreferFastest

func (c *ConvolutionForwardPref) PreferFastest() ConvolutionForwardPref

PreferFastest sets c to ConvolutionForwardPref( C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns value of c

func (*ConvolutionForwardPref) SpecifyWorkSpaceLimit

func (c *ConvolutionForwardPref) SpecifyWorkSpaceLimit() ConvolutionForwardPref

SpecifyWorkSpaceLimit sets c to ConvolutionForwardPref( C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT) and returns value of c

type ConvolutionMode

type ConvolutionMode C.cudnnConvolutionMode_t

ConvolutionMode is the type to describe the convolution mode flags

func (*ConvolutionMode) Convolution

func (c *ConvolutionMode) Convolution() ConvolutionMode

Convolution sets and returns value of c to ConvolutionMode(C.CUDNN_CONVOLUTION)

func (*ConvolutionMode) CrossCorrelation

func (c *ConvolutionMode) CrossCorrelation() ConvolutionMode

CrossCorrelation sets and returns value of c to ConvolutionMode(C.CUDNN_CROSS_CORRELATION)

func (ConvolutionMode) String

func (c ConvolutionMode) String() string

type DataType

type DataType C.cudnnDataType_t

DataType is used for flags for the tensor layer structs
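
The zero value of a flag type is not necessarily a meaningful option, so set it through a method before use; a tiny sketch:

	var dtype gocudnn.DataType
	dtype.Float()       // dtype now holds the value for C.CUDNN_DATA_FLOAT
	s := dtype.String() // human readable name for debugging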

func (*DataType) Double

func (d *DataType) Double() DataType

Double sets d to DataType(C.CUDNN_DATA_DOUBLE) and returns the changed value

func (*DataType) Float

func (d *DataType) Float() DataType

Float sets d to DataType(C.CUDNN_DATA_FLOAT) and returns the changed value

func (*DataType) Half

func (d *DataType) Half() DataType

Half sets d to DataType(C.CUDNN_DATA_HALF) and returns the changed value

func (*DataType) Int32

func (d *DataType) Int32() DataType

Int32 sets d to DataType(C.CUDNN_DATA_INT32) and returns the changed value

func (*DataType) Int8

func (d *DataType) Int8() DataType

Int8 sets d to DataType(C.CUDNN_DATA_INT8) and returns the changed value

func (*DataType) Int8x32

func (d *DataType) Int8x32() DataType

Int8x32 sets d to DataType(C.CUDNN_DATA_INT8x32) and returns the changed value -- only supported by sm_72?.

func (*DataType) Int8x4

func (d *DataType) Int8x4() DataType

Int8x4 sets d to DataType(C.CUDNN_DATA_INT8x4) and returns the changed value -- only supported by sm_72?.

func (DataType) String

func (d DataType) String() string

String will return a human readable string that can be printed for debugging.

func (*DataType) UInt8

func (d *DataType) UInt8() DataType

UInt8 sets d to DataType(C.CUDNN_DATA_UINT8) and returns the changed value

func (*DataType) UInt8x4

func (d *DataType) UInt8x4() DataType

UInt8x4 sets d to DataType(C.CUDNN_DATA_UINT8x4) and returns the changed value -- only supported by sm_72?.

type DeConvBwdDataAlgo

type DeConvBwdDataAlgo C.cudnnConvolutionFwdAlgo_t

DeConvBwdDataAlgo flags for cudnnConvFwdAlgo_t exposing them through methods. Deconvolution uses the forward pass for backward data

func (DeConvBwdDataAlgo) Algo

func (c DeConvBwdDataAlgo) Algo() Algorithm

Algo returns an Algorithm struct

func (*DeConvBwdDataAlgo) Count

func (c *DeConvBwdDataAlgo) Count() DeConvBwdDataAlgo

Count sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_COUNT) and returns value of c

func (*DeConvBwdDataAlgo) Direct

func (c *DeConvBwdDataAlgo) Direct() DeConvBwdDataAlgo

Direct sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_DIRECT) and returns value of c

func (*DeConvBwdDataAlgo) FFT

func (c *DeConvBwdDataAlgo) FFT() DeConvBwdDataAlgo

FFT sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT) and returns value of c

func (*DeConvBwdDataAlgo) FFTTiling

func (c *DeConvBwdDataAlgo) FFTTiling() DeConvBwdDataAlgo

FFTTiling sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING) and returns value of c

func (*DeConvBwdDataAlgo) Gemm

func (c *DeConvBwdDataAlgo) Gemm() DeConvBwdDataAlgo

Gemm sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_GEMM) and returns value of c

func (*DeConvBwdDataAlgo) ImplicitGemm

func (c *DeConvBwdDataAlgo) ImplicitGemm() DeConvBwdDataAlgo

ImplicitGemm sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM) and returns value of c

func (*DeConvBwdDataAlgo) ImplicitPrecompGemm

func (c *DeConvBwdDataAlgo) ImplicitPrecompGemm() DeConvBwdDataAlgo

ImplicitPrecompGemm sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM) and returns value of c

func (DeConvBwdDataAlgo) String

func (c DeConvBwdDataAlgo) String() string

func (*DeConvBwdDataAlgo) WinoGrad

func (c *DeConvBwdDataAlgo) WinoGrad() DeConvBwdDataAlgo

WinoGrad sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD) and returns value of c

func (*DeConvBwdDataAlgo) WinoGradNonFused

func (c *DeConvBwdDataAlgo) WinoGradNonFused() DeConvBwdDataAlgo

WinoGradNonFused sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED) and returns value of c

type DeConvBwdDataAlgoPerformance

type DeConvBwdDataAlgoPerformance struct {
	Algo        DeConvBwdDataAlgo `json:"algo,omitempty"`
	Status      Status            `json:"status,omitempty"`
	Time        float32           `json:"time,omitempty"`
	Memory      uint              `json:"memory,omitempty"`
	Determinism Determinism       `json:"determinism,omitempty"`
	MathType    MathType          `json:"math_type,omitempty"`
}

DeConvBwdDataAlgoPerformance is a new struct that is made for deconvolution performance

func (DeConvBwdDataAlgoPerformance) String

func (cb DeConvBwdDataAlgoPerformance) String() string

type DeConvBwdDataPref

type DeConvBwdDataPref C.cudnnConvolutionFwdPreference_t

DeConvBwdDataPref is used for backward data preference flags, exposing them through methods

func (*DeConvBwdDataPref) NoWorkSpace

func (c *DeConvBwdDataPref) NoWorkSpace() DeConvBwdDataPref

NoWorkSpace sets c to DeConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns value of c

func (*DeConvBwdDataPref) PreferFastest

func (c *DeConvBwdDataPref) PreferFastest() DeConvBwdDataPref

PreferFastest sets c to DeConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns value of c

func (*DeConvBwdDataPref) SpecifyWorkSpaceLimit

func (c *DeConvBwdDataPref) SpecifyWorkSpaceLimit() DeConvBwdDataPref

SpecifyWorkSpaceLimit sets c to DeConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT) and returns value of c

type DeConvBwdFiltAlgo

type DeConvBwdFiltAlgo C.cudnnConvolutionBwdFilterAlgo_t

DeConvBwdFiltAlgo is used for backward filter algorithm flags, exposing them through methods

func (DeConvBwdFiltAlgo) Algo

func (c DeConvBwdFiltAlgo) Algo() Algorithm

Algo returns an Algorithm Struct

func (*DeConvBwdFiltAlgo) Algo0

func (c *DeConvBwdFiltAlgo) Algo0() DeConvBwdFiltAlgo

Algo0 sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0) and returns value of c /* non-deterministic */

func (*DeConvBwdFiltAlgo) Algo1

func (c *DeConvBwdFiltAlgo) Algo1() DeConvBwdFiltAlgo

Algo1 sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1) and returns value of c

func (*DeConvBwdFiltAlgo) Algo3

func (c *DeConvBwdFiltAlgo) Algo3() DeConvBwdFiltAlgo

Algo3 sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3) and returns value of c

func (*DeConvBwdFiltAlgo) Count

func (c *DeConvBwdFiltAlgo) Count() DeConvBwdFiltAlgo

Count sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_COUNT) and returns value of c

func (*DeConvBwdFiltAlgo) FFT

func (c *DeConvBwdFiltAlgo) FFT() DeConvBwdFiltAlgo

FFT sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT) and returns value of c

func (*DeConvBwdFiltAlgo) FFTTiling

func (c *DeConvBwdFiltAlgo) FFTTiling() DeConvBwdFiltAlgo

FFTTiling sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT_TILING) and returns value of c

func (DeConvBwdFiltAlgo) String

func (c DeConvBwdFiltAlgo) String() string

func (*DeConvBwdFiltAlgo) Winograd

func (c *DeConvBwdFiltAlgo) Winograd() DeConvBwdFiltAlgo

Winograd sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD) and returns value of c

func (*DeConvBwdFiltAlgo) WinogradNonFused

func (c *DeConvBwdFiltAlgo) WinogradNonFused() DeConvBwdFiltAlgo

WinogradNonFused sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD_NONFUSED) and returns value of c

type DeConvBwdFiltAlgoPerformance

type DeConvBwdFiltAlgoPerformance struct {
	Algo        DeConvBwdFiltAlgo `json:"algo,omitempty"`
	Status      Status            `json:"status,omitempty"`
	Time        float32           `json:"time,omitempty"`
	Memory      uint              `json:"memory,omitempty"`
	Determinism Determinism       `json:"determinism,omitempty"`
	MathType    MathType          `json:"math_type,omitempty"`
}

DeConvBwdFiltAlgoPerformance is the return struct in the finding algorithm funcs

func (DeConvBwdFiltAlgoPerformance) String

func (cb DeConvBwdFiltAlgoPerformance) String() string

type DeConvBwdFilterPref

type DeConvBwdFilterPref C.cudnnConvolutionBwdFilterPreference_t

DeConvBwdFilterPref is used for backwards filter preference flags, exposing them through methods

func (*DeConvBwdFilterPref) NoWorkSpace

func (c *DeConvBwdFilterPref) NoWorkSpace() DeConvBwdFilterPref

NoWorkSpace sets c to DeConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE) and returns value of c

func (*DeConvBwdFilterPref) PreferFastest

func (c *DeConvBwdFilterPref) PreferFastest() DeConvBwdFilterPref

PreferFastest sets c to DeConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST) and returns value of c

func (*DeConvBwdFilterPref) SpecifyWorkSpaceLimit

func (c *DeConvBwdFilterPref) SpecifyWorkSpaceLimit() DeConvBwdFilterPref

SpecifyWorkSpaceLimit sets c to DeConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT) and returns value of c

type DeConvFwdAlgo

type DeConvFwdAlgo C.cudnnConvolutionBwdDataAlgo_t

DeConvFwdAlgo is used for forward algorithm flags, exposing them through methods. DeConvolution does the backward data pass as its forward.

func (DeConvFwdAlgo) Algo

func (c DeConvFwdAlgo) Algo() Algorithm

Algo returns an Algorithm Struct

func (*DeConvFwdAlgo) Algo0

func (c *DeConvFwdAlgo) Algo0() DeConvFwdAlgo

Algo0 sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_0) and returns value of c /* non-deterministic */

func (*DeConvFwdAlgo) Algo1

func (c *DeConvFwdAlgo) Algo1() DeConvFwdAlgo

Algo1 sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_1) and returns value of c

func (*DeConvFwdAlgo) Count

func (c *DeConvFwdAlgo) Count() DeConvFwdAlgo

Count sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_COUNT) and returns value of c

func (*DeConvFwdAlgo) FFT

func (c *DeConvFwdAlgo) FFT() DeConvFwdAlgo

FFT sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT) and returns value of c

func (*DeConvFwdAlgo) FFTTiling

func (c *DeConvFwdAlgo) FFTTiling() DeConvFwdAlgo

FFTTiling sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING) and returns value of c

func (DeConvFwdAlgo) String

func (c DeConvFwdAlgo) String() string

func (*DeConvFwdAlgo) Winograd

func (c *DeConvFwdAlgo) Winograd() DeConvFwdAlgo

Winograd sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD) and returns value of c

func (*DeConvFwdAlgo) WinogradNonFused

func (c *DeConvFwdAlgo) WinogradNonFused() DeConvFwdAlgo

WinogradNonFused sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED) and returns value of c

type DeConvFwdAlgoPerformance

type DeConvFwdAlgoPerformance struct {
	Algo        DeConvFwdAlgo `json:"algo,omitempty"`
	Status      Status        `json:"status,omitempty"`
	Time        float32       `json:"time,omitempty"`
	Memory      uint          `json:"memory,omitempty"`
	Determinism Determinism   `json:"determinism,omitempty"`
	MathType    MathType      `json:"math_type,omitempty"`
}

DeConvFwdAlgoPerformance is a struct that holds the performance of the algorithm

func (DeConvFwdAlgoPerformance) String

func (cb DeConvFwdAlgoPerformance) String() string

type DeConvolutionD

type DeConvolutionD struct {
	// contains filtered or unexported fields
}

DeConvolutionD sets all the convolution info

func CreateDeConvolutionDescriptor

func CreateDeConvolutionDescriptor() (*DeConvolutionD, error)

CreateDeConvolutionDescriptor creates a deconvolution descriptor

func (*DeConvolutionD) BackwardBias

func (c *DeConvolutionD) BackwardBias(
	handle *Handle,
	alpha float64,
	dyD *TensorD,
	dy cutil.Mem,
	beta float64,
	dbD *TensorD,
	db cutil.Mem) error

BackwardBias is used to compute the bias gradient for batch convolution. db is returned.

func (*DeConvolutionD) BackwardBiasUS

func (c *DeConvolutionD) BackwardBiasUS(
	handle *Handle,
	alpha float64,
	dyD *TensorD, dy unsafe.Pointer,
	beta float64,
	dbD *TensorD, db unsafe.Pointer) error

BackwardBiasUS is like BackwardBias but using unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) BackwardData

func (c *DeConvolutionD) BackwardData(
	handle *Handle,
	alpha float64,
	wD *FilterD, w cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	algo DeConvBwdDataAlgo,
	wspace cutil.Mem, wspaceSIB uint,
	beta float64,
	dxD *TensorD, dx cutil.Mem) error

BackwardData performs the backward pass for batch deconvolution.

func (*DeConvolutionD) BackwardDataUS

func (c *DeConvolutionD) BackwardDataUS(
	handle *Handle,
	alpha float64,
	wD *FilterD, w unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	algo DeConvBwdDataAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer) error

BackwardDataUS is like BackwardData but using unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) BackwardFilter

func (c *DeConvolutionD) BackwardFilter(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	algo DeConvBwdFiltAlgo,
	wspace cutil.Mem, wspacesize uint,
	beta float64,
	dwD *FilterD, dw cutil.Mem,
) error

BackwardFilter does the backwards deconvolution filter

func (*DeConvolutionD) BackwardFilterUS

func (c *DeConvolutionD) BackwardFilterUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	algo DeConvBwdFiltAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	beta float64,
	dwD *FilterD, dw unsafe.Pointer,
) error

BackwardFilterUS is like BackwardFilter but using unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) Destroy

func (c *DeConvolutionD) Destroy() error

Destroy destroys the DeConvolutionDescriptor. If GC is set then it only returns nil. Currently GC is set with no option to turn it off.

func (*DeConvolutionD) FindBackwardDataAlgorithm

func (c *DeConvolutionD) FindBackwardDataAlgorithm(
	handle *Handle,
	w *FilterD,
	dy *TensorD,
	dx *TensorD,
) ([]DeConvBwdDataAlgoPerformance, error)

FindBackwardDataAlgorithm will find the top performing algorithms and return them in ascending order (fastest first).

func (*DeConvolutionD) FindBackwardDataAlgorithmEx

func (c *DeConvolutionD) FindBackwardDataAlgorithmEx(
	handle *Handle,
	wD *FilterD, w cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	dxD *TensorD, dx cutil.Mem,
	wspace cutil.Mem, wspacesize uint) ([]DeConvBwdDataAlgoPerformance, error)

FindBackwardDataAlgorithmEx finds the top performing algorithms by running them with the actual memory passed in.

func (*DeConvolutionD) FindBackwardDataAlgorithmExUS

func (c *DeConvolutionD) FindBackwardDataAlgorithmExUS(
	handle *Handle,
	wD *FilterD, w unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	dxD *TensorD, dx unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint) ([]DeConvBwdDataAlgoPerformance, error)

FindBackwardDataAlgorithmExUS is just like FindBackwardDataAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) FindBackwardFilterAlgorithm

func (c *DeConvolutionD) FindBackwardFilterAlgorithm(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
) ([]DeConvBwdFiltAlgoPerformance, error)

FindBackwardFilterAlgorithm will find the top performing algorithms and return them in ascending order (fastest first) as DeConvBwdFiltAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*DeConvolutionD) FindBackwardFilterAlgorithmEx

func (c *DeConvolutionD) FindBackwardFilterAlgorithmEx(
	handle *Handle,
	xD *TensorD, x cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	dwD *FilterD, dw cutil.Mem,
	wspace cutil.Mem, wspacesize uint) ([]DeConvBwdFiltAlgoPerformance, error)

FindBackwardFilterAlgorithmEx finds the top performing algorithms by running them with the actual memory passed in.

func (*DeConvolutionD) FindBackwardFilterAlgorithmExUS

func (c *DeConvolutionD) FindBackwardFilterAlgorithmExUS(
	handle *Handle,
	xD *TensorD, x unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	dwD *FilterD, dw unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint) ([]DeConvBwdFiltAlgoPerformance, error)

FindBackwardFilterAlgorithmExUS is just like FindBackwardFilterAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) FindForwardAlgorithm

func (c *DeConvolutionD) FindForwardAlgorithm(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
) ([]DeConvFwdAlgoPerformance, error)

FindForwardAlgorithm will find the top performing algorithms and return them in ascending order (fastest first) as DeConvFwdAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*DeConvolutionD) FindForwardAlgorithmEx

func (c *DeConvolutionD) FindForwardAlgorithmEx(
	handle *Handle,
	xD *TensorD,
	x cutil.Mem,
	wD *FilterD,
	w cutil.Mem,
	yD *TensorD,
	y cutil.Mem,
	wspace cutil.Mem,
	wspaceSIBlimit uint) ([]DeConvFwdAlgoPerformance, error)

FindForwardAlgorithmEx finds the top performing algorithms by running them with the actual memory passed in.

func (*DeConvolutionD) FindForwardAlgorithmExUS

func (c *DeConvolutionD) FindForwardAlgorithmExUS(
	handle *Handle,
	xD *TensorD,
	x unsafe.Pointer,
	wD *FilterD,
	w unsafe.Pointer,
	yD *TensorD,
	y unsafe.Pointer,
	wspace unsafe.Pointer,
	wspaceSIBlimit uint) ([]DeConvFwdAlgoPerformance, error)

FindForwardAlgorithmExUS is like FindForwardAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) Forward

func (c *DeConvolutionD) Forward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	wD *FilterD, w cutil.Mem,
	algo DeConvFwdAlgo,
	wspace cutil.Mem, wspaceSIB uint,
	beta float64,
	yD *TensorD, y cutil.Mem) error

Forward does the forward deconvolution.

Deconvolution runs cudnn's backward data path as its forward pass: this call computes what would be the data gradient of a convolution, which for a deconvolution is the forward result written to the output tensor y. It uses the specified algo, and scaling factors alpha and beta can be used to scale the computed result or accumulate with the current y.

func (*DeConvolutionD) ForwardUS

func (c *DeConvolutionD) ForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	algo DeConvFwdAlgo,
	wspace unsafe.Pointer, wspacesize uint,
	beta float64,
	yD *TensorD, y unsafe.Pointer) error

ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem

func (*DeConvolutionD) Get

func (c *DeConvolutionD) Get() (mode ConvolutionMode, data DataType, pad []int32, stride []int32, dilation []int32, err error)

Get returns the values used to make the deconvolution descriptor

func (*DeConvolutionD) GetBackwardDataAlgorithm

func (c *DeConvolutionD) GetBackwardDataAlgorithm(
	handle *Handle,
	wD *FilterD,
	dyD *TensorD,
	dxD *TensorD,
	pref DeConvBwdDataPref, wspaceSIBlimit uint) (DeConvBwdDataAlgo, error)

GetBackwardDataAlgorithm gets the fastest backwards data algorithm with parameters that are passed.

func (*DeConvolutionD) GetBackwardDataAlgorithmV7

func (c *DeConvolutionD) GetBackwardDataAlgorithmV7(
	handle *Handle,
	wD *FilterD,
	dyD *TensorD,
	dxD *TensorD,
) ([]DeConvBwdDataAlgoPerformance, error)

GetBackwardDataAlgorithmV7 - This function serves as a heuristic for obtaining the best suited algorithm for the given layer specifications. This function will return all algorithms (including MathType where available) sorted by expected relative performance (based on an internal heuristic), with the fastest being index 0 of perfResults.

func (*DeConvolutionD) GetBackwardDataWorkspaceSize

func (c *DeConvolutionD) GetBackwardDataWorkspaceSize(
	handle *Handle,
	wD *FilterD,
	dyD *TensorD,
	dxD *TensorD,
	algo DeConvBwdDataAlgo) (uint, error)

GetBackwardDataWorkspaceSize is a helper function that will return the minimum size of the workspace that needs to be passed to the deconvolution for the given algo.

func (*DeConvolutionD) GetBackwardFilterAlgorithm

func (c *DeConvolutionD) GetBackwardFilterAlgorithm(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
	pref DeConvBwdFilterPref, wsmemlimit uint) (DeConvBwdFiltAlgo, error)

GetBackwardFilterAlgorithm gives a good algo with the limits given to it

func (*DeConvolutionD) GetBackwardFilterAlgorithmV7

func (c *DeConvolutionD) GetBackwardFilterAlgorithmV7(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
) ([]DeConvBwdFiltAlgoPerformance, error)

GetBackwardFilterAlgorithmV7 will find the top performing algorithms and return them in ascending order (fastest first) as DeConvBwdFiltAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*DeConvolutionD) GetBackwardFilterWorkspaceSize

func (c *DeConvolutionD) GetBackwardFilterWorkspaceSize(
	handle *Handle,
	xD *TensorD,
	dyD *TensorD,
	dwD *FilterD,
	algo DeConvBwdFiltAlgo) (uint, error)

GetBackwardFilterWorkspaceSize is a helper function that will return the minimum size of the workspace that needs to be passed to the deconvolution for the given algo.

func (*DeConvolutionD) GetBiasDims

func (c *DeConvolutionD) GetBiasDims(w *FilterD) ([]int32, error)

GetBiasDims will return bias dims for the deconvolution. Only supports NCHW and NHWC formats

func (*DeConvolutionD) GetForwardAlgorithm

func (c *DeConvolutionD) GetForwardAlgorithm(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
	pref DeConvolutionForwardPref,
	wspaceSIBlimit uint) (DeConvFwdAlgo, error)

GetForwardAlgorithm gives a good algo with the limits given to it

func (*DeConvolutionD) GetForwardAlgorithmV7

func (c *DeConvolutionD) GetForwardAlgorithmV7(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
) ([]DeConvFwdAlgoPerformance, error)

GetForwardAlgorithmV7 will find the top performing algorithms and return them in ascending order (fastest first) as DeConvFwdAlgoPerformance structs. Using this could possibly give the user cheat level performance :-)

func (*DeConvolutionD) GetForwardWorkspaceSize

func (c *DeConvolutionD) GetForwardWorkspaceSize(
	handle *Handle,
	xD *TensorD,
	wD *FilterD,
	yD *TensorD,
	algo DeConvFwdAlgo) (uint, error)

GetForwardWorkspaceSize is a helper function that will return the minimum size of the workspace that needs to be passed to the deconvolution for the given algo.

func (*DeConvolutionD) GetOutputDims

func (c *DeConvolutionD) GetOutputDims(input *TensorD, filter *FilterD) ([]int32, error)

GetOutputDims is a helper function that gives the size of the output of a DeConvolutionNdForward. Each dimension of the (nbDims-2)-D images of the output tensor is computed as follows:

outputDim = (inputDim-1)*convolutionStride -2*pad + (((filterDim-1)*dilation)+1)

DeConvolution works differently than a convolution.

In a normal convolution, the number of output channels equals the number of neurons the filter has, and each neuron's channel size equals the input channel count.

For a deconvolution, the number of neurons equals the input channel count, and each neuron's channel size equals the output channel count.
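
For example, a stride-2 deconvolution with a 4-wide filter, pad 1, and dilation 1 doubles the spatial size, mirroring a stride-2 convolution:

	outputDim = (16-1)*2 - 2*1 + (((4-1)*1)+1) = 30 - 2 + 4 = 32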

func (*DeConvolutionD) GetReorderType

func (c *DeConvolutionD) GetReorderType() (r Reorder, err error)

GetReorderType gets the reorder type

func (*DeConvolutionD) Set

func (c *DeConvolutionD) Set(mode ConvolutionMode, data DataType, pad, stride, dilation []int32) error

Set sets the deconvolution descriptor.

The filter layout format works as follows (from the cudnn documentation): if the format input is set to CUDNN_TENSOR_NCHW, which is one of the enumerated values allowed by cudnnTensorFormat_t, then the layout of the filter is as follows:

	For N=4, i.e., for a 4D filter descriptor, the filter layout is in the form of KCRS (K represents the number of output feature maps, C the number of input feature maps, R the number of rows per filter, and S the number of columns per filter.)

	For N=3, i.e., for a 3D filter descriptor, the number S (number of columns per filter) is omitted.

	For N=5 and greater, the layout of the higher dimensions immediately follows RS.

	On the other hand, if this input is set to CUDNN_TENSOR_NHWC, then the layout of the filter is as follows:

	For N=4, i.e., for a 4D filter descriptor, the filter layout is in the form of KRSC.

	For N=3, i.e., for a 3D filter descriptor, the number S (number of columns per filter) is omitted, and the layout of C immediately follows R.

	For N=5 and greater, the layout of the higher dimensions is inserted between S and C. See also the description for cudnnTensorFormat_t.

Note: the length of stride, pad, and dilation needs to be len(tensordims) - 2.

func (*DeConvolutionD) SetGroupCount

func (c *DeConvolutionD) SetGroupCount(groupCount int32) error

SetGroupCount sets the Group Count

func (*DeConvolutionD) SetMathType

func (c *DeConvolutionD) SetMathType(mathtype MathType) error

SetMathType sets the mathtype

func (*DeConvolutionD) SetReorderType

func (c *DeConvolutionD) SetReorderType(r Reorder) error

SetReorderType sets the reorder type

func (*DeConvolutionD) String

func (c *DeConvolutionD) String() string

String satisfies fmt Stringer interface.

type DeConvolutionForwardPref

type DeConvolutionForwardPref C.cudnnConvolutionBwdDataPreference_t

DeConvolutionForwardPref used for flags on deconvolution forward exposing them through methods

func (*DeConvolutionForwardPref) NoWorkSpace

func (c *DeConvolutionForwardPref) NoWorkSpace() DeConvolutionForwardPref

NoWorkSpace sets c to DeConvolutionForwardPref( C.CUDNN_CONVOLUTION_BWD_DATA_NO_WORKSPACE) and returns value of c

func (*DeConvolutionForwardPref) PreferFastest

func (c *DeConvolutionForwardPref) PreferFastest() DeConvolutionForwardPref

PreferFastest sets c to DeConvolutionForwardPref( C.CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST) and returns value of c

func (*DeConvolutionForwardPref) SpecifyWorkSpaceLimit

func (c *DeConvolutionForwardPref) SpecifyWorkSpaceLimit() DeConvolutionForwardPref

SpecifyWorkSpaceLimit sets c to DeConvolutionForwardPref( C.CUDNN_CONVOLUTION_BWD_DATA_SPECIFY_WORKSPACE_LIMIT) and returns value of c

type Debug

type Debug C.cudnnDebug_t

Debug is Debug type

func (*Debug) String

func (d *Debug) String() string

type Determinism

type Determinism C.cudnnDeterminism_t

Determinism is the type for flags that set Determinism and are called and changed through type's methods

func (*Determinism) Deterministic

func (d *Determinism) Deterministic() Determinism

Deterministic sets d to Determinism(C.CUDNN_DETERMINISTIC) and returns the value

func (*Determinism) Non

func (d *Determinism) Non() Determinism

Non sets d to Determinism(C.CUDNN_NON_DETERMINISTIC) and returns the value

func (Determinism) String

func (d Determinism) String() string

String outputs a string of the type

type DirectionMode

type DirectionMode C.cudnnDirectionMode_t

DirectionMode is used for flags and exposes the flags of the type through the type's methods

func (*DirectionMode) Bi

func (r *DirectionMode) Bi() DirectionMode

Bi sets r to and returns DirectionMode(C.CUDNN_BIDIRECTIONAL)

func (DirectionMode) String

func (r DirectionMode) String() string

func (*DirectionMode) Uni

func (r *DirectionMode) Uni() DirectionMode

Uni sets r to and returns DirectionMode(C.CUDNN_UNIDIRECTIONAL)

type DivNormMode

type DivNormMode C.cudnnDivNormMode_t

DivNormMode is used for C.cudnnDivNormMode_t flags

func (*DivNormMode) PrecomputedMeans

func (d *DivNormMode) PrecomputedMeans() DivNormMode

PrecomputedMeans sets d to and returns DivNormMode(C.CUDNN_DIVNORM_PRECOMPUTED_MEANS)

func (DivNormMode) String

func (d DivNormMode) String() string

type DropOutD

type DropOutD struct {
	// contains filtered or unexported fields
}

DropOutD holds the dropout descriptor

func CreateDropOutDescriptor

func CreateDropOutDescriptor() (*DropOutD, error)

CreateDropOutDescriptor creates a drop out descriptor to be set

func (*DropOutD) Backward

func (d *DropOutD) Backward(
	handle *Handle,
	dyD *TensorD,
	dy cutil.Mem,
	dxD *TensorD,
	dx cutil.Mem,
	reserveSpace cutil.Mem,
	reservesize uint,
) error

Backward performs the dropout backward pass

Input/Output: dx,reserveSpace

func (*DropOutD) BackwardUS

func (d *DropOutD) BackwardUS(
	handle *Handle,
	dyD *TensorD,
	dy unsafe.Pointer,
	dxD *TensorD,
	dx unsafe.Pointer,
	reserveSpace unsafe.Pointer,
	reservesize uint,
) error

BackwardUS is like Backward but uses unsafe.Pointer instead of cutil.Mem

func (*DropOutD) Destroy

func (d *DropOutD) Destroy() error

Destroy destroys the dropout descriptor unless the finalizer flag was set.

func (*DropOutD) Forward

func (d *DropOutD) Forward(
	handle *Handle,
	xD *TensorD,
	x cutil.Mem,
	yD *TensorD,
	y cutil.Mem,
	reserveSpace cutil.Mem,
	reservesize uint,
) error

Forward performs the dropout forward pass

Input/Output: y,reserveSpace

func (*DropOutD) ForwardUS

func (d *DropOutD) ForwardUS(
	handle *Handle,
	xD *TensorD, x unsafe.Pointer,
	yD *TensorD, y unsafe.Pointer,
	reserveSpace unsafe.Pointer, reservesize uint,
) error

ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem

func (*DropOutD) Get

func (d *DropOutD) Get(
	handle *Handle,
) (float32, cutil.Mem, uint64, error)

Get returns the dropout probability, the states memory, and the seed the descriptor was set or restored with

func (*DropOutD) GetReserveSpaceSize

func (d *DropOutD) GetReserveSpaceSize(t *TensorD) (uint, error)

GetReserveSpaceSize returns the size of the reserve space in bytes. The method calls a function that doesn't use the DropOutD, but the function is relevant to the dropout operation

func (*DropOutD) GetStateSize

func (d *DropOutD) GetStateSize(handle *Handle) (uint, error)

GetStateSize returns the state size in bytes. The method calls a function that doesn't use DropOutD, but it is a dropout-type function, and is used to get the size that the states cutil.Mem or unsafe.Pointer needs to be.

func (*DropOutD) GetUS

func (d *DropOutD) GetUS(handle *Handle) (float32, unsafe.Pointer, uint64, error)

GetUS is like Get but uses unsafe.Pointer instead of cutil.Mem

func (*DropOutD) Restore

func (d *DropOutD) Restore(
	handle *Handle,
	dropout float32,
	states cutil.Mem,
	bytes uint,
	seed uint64,
) error

Restore restores the descriptor to a previously saved-off state

func (*DropOutD) RestoreUS

func (d *DropOutD) RestoreUS(
	handle *Handle,
	dropout float32,
	states unsafe.Pointer,
	bytes uint,
	seed uint64,
) error

RestoreUS is like Restore but uses unsafe.Pointer instead of cutil.Mem

func (*DropOutD) Set

func (d *DropOutD) Set(handle *Handle, dropout float32, states cutil.Mem, bytes uint, seed uint64) error

Set sets the drop out descriptor
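
The usual setup order is to size and allocate the states buffer first, then Set. A hypothetical sketch, with alloc standing in for an allocator like the one in cudart/crtutil and the probability/seed values being illustrative:

	func dropoutSetupSketch(handle *gocudnn.Handle, alloc *crtutil.Allocator) (*gocudnn.DropOutD, error) {
		dropD, err := gocudnn.CreateDropOutDescriptor()
		if err != nil {
			return nil, err
		}
		statesize, err := dropD.GetStateSize(handle)
		if err != nil {
			return nil, err
		}
		states, err := alloc.AllocateMemory(statesize)
		if err != nil {
			return nil, err
		}
		// 50% dropout with a fixed seed
		if err = dropD.Set(handle, 0.5, states, statesize, 1337); err != nil {
			return nil, err
		}
		return dropD, nil
	}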

func (*DropOutD) SetUS

func (d *DropOutD) SetUS(handle *Handle, dropout float32, states unsafe.Pointer, bytes uint, seed uint64) error

SetUS is like Set but uses unsafe.Pointer instead of cutil.Mem

type ErrQueryMode

type ErrQueryMode C.cudnnErrQueryMode_t

ErrQueryMode flags select the different error query modes, which are exposed through the type's methods

func (*ErrQueryMode) Blocking

func (e *ErrQueryMode) Blocking() ErrQueryMode

Blocking sets e to and returns ErrQueryMode(C.CUDNN_ERRQUERY_BLOCKING)

func (*ErrQueryMode) NonBlocking

func (e *ErrQueryMode) NonBlocking() ErrQueryMode

NonBlocking sets e to and returns ErrQueryMode(C.CUDNN_ERRQUERY_NONBLOCKING)

func (*ErrQueryMode) RawCode

func (e *ErrQueryMode) RawCode() ErrQueryMode

RawCode sets e to and returns ErrQueryMode(C.CUDNN_ERRQUERY_RAWCODE)

type FilterD

type FilterD struct {
	// contains filtered or unexported fields
}

FilterD is the struct holding descriptor information for cudnnFilterDescriptor_t

func CreateFilterDescriptor

func CreateFilterDescriptor() (*FilterD, error)

CreateFilterDescriptor creates a filter descriptor

func (*FilterD) Destroy

func (f *FilterD) Destroy() error

Destroy destroys the filter descriptor if GC is not set. If GC is set then it won't do anything

func (*FilterD) Get

func (f *FilterD) Get() (dtype DataType, frmt TensorFormat, shape []int32, err error)

Get returns the values used to set the FilterD

func (*FilterD) GetSizeInBytes

func (f *FilterD) GetSizeInBytes() (uint, error)

GetSizeInBytes returns the size in bytes for the filter

func (*FilterD) ReorderFilterBias

func (f *FilterD) ReorderFilterBias(h *Handle,
	r Reorder,
	filtersrc, reorderfilterdest cutil.Mem,
	reorderbias bool,
	biassrc, reorderbiasdest cutil.Mem) error

ReorderFilterBias reorders the filter and bias values. It can be used to enhance the inference time by separating the reordering operation from convolution.

For example, convolutions in a neural network of multiple layers can require reordering of kernels at every layer, which can take up a significant fraction of the total inference time. Using this function, the reordering can be done one time on the filter and bias data followed by the convolution operations at the multiple layers, thereby enhancing the inference time.

func (*FilterD) Set

func (f *FilterD) Set(dtype DataType, format TensorFormat, shape []int32) error

Set sets the filter descriptor. As with TensorD, the shape passed in is not the shape cudnn sees: cudnn always takes the shape as NCHW and switches the dims and tensor strides in order to change the format. If the format is NHWC, gocudnn will convert the dims to the layout cudnn expects.

Basic 4D filter

The basic NCHW shape is:

	shape[0] = # of output feature maps
	shape[1] = # of input feature maps
	shape[2] = height of each filter
	shape[3] = width of each filter

The basic NHWC shape is:

	shape[0] = # of output feature maps
	shape[1] = height of each filter
	shape[2] = width of each filter
	shape[3] = # of input feature maps

Basic ND filter

The basic NCHW shape is:

	shape[0]   = # of output feature maps
	shape[1]   = # of input feature maps
	shape[...] = feature dims

The basic NHWC shape is:

	shape[0]   = # of output feature maps
	shape[...] = feature dims
	shape[N-1] = # of input feature maps
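
For example, 64 output feature maps over 3 input feature maps with 5x5 filters in NCHW (a sketch; TensorFormat is assumed to expose an NCHW method like the other flag types):

	var dtype gocudnn.DataType
	var frmt gocudnn.TensorFormat // assumed flag methods, e.g. frmt.NCHW()
	filtD, err := gocudnn.CreateFilterDescriptor()
	if err != nil {
		return err
	}
	err = filtD.Set(dtype.Float(), frmt.NCHW(), []int32{64, 3, 5, 5})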

func (*FilterD) String

func (f *FilterD) String() string

type FoldingDirection

type FoldingDirection C.cudnnFoldingDirection_t

FoldingDirection is used as a flag for TransformDescriptor, which is revealed through the type's methods.

func (*FoldingDirection) Fold

func (f *FoldingDirection) Fold() FoldingDirection

Fold sets variable to Fold and returns Fold value

func (FoldingDirection) String

func (f FoldingDirection) String() string

String satisfies the stringer interface

func (*FoldingDirection) UnFold

func (f *FoldingDirection) UnFold() FoldingDirection

UnFold sets variable to UnFold and returns UnFold value

type Handle

type Handle struct {
	// contains filtered or unexported fields
}

Handle is a struct containing a cudnnHandle_t which is basically a Pointer to a CUContext

func CreateHandle

func CreateHandle(usegogc bool) *Handle

CreateHandle creates a handle; it is basically a context. usegogc is for future use. Right now the handle is always managed by the Go GC.

This function initializes the cuDNN library and creates a handle to an opaque structure holding the cuDNN library context. It allocates hardware resources on the host and device and must be called prior to making any other cuDNN library calls.

The cuDNN library handle is tied to the current CUDA device (context). To use the library on multiple devices, one cuDNN handle needs to be created for each device.

For a given device, multiple cuDNN handles with different configurations (e.g., different current CUDA streams) may be created. Because cudnnCreate allocates some internal resources, the release of those resources by calling cudnnDestroy will implicitly call cudaDeviceSynchronize; therefore, the recommended best practice is to call cudnnCreate/cudnnDestroy outside of performance-critical code paths.

For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.
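A minimal sketch following that advice, assuming s is a gocu.Streamer created elsewhere (for example through the cudart package); the helper name is illustrative:

    func setupHandle(s gocu.Streamer) (*gocudnn.Handle, error) {
        h := gocudnn.CreateHandle(true) // the handle is currently always GC-managed
        if err := h.SetStream(s); err != nil {
            return nil, err
        }
        return h, nil // use h on one thread/goroutine for its whole life
    }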

func CreateHandleEX

func CreateHandleEX(w *gocu.Worker, usegogc bool) *Handle

CreateHandleEX creates a handle like CreateHandle, but gocudnn functions that take the handle will pass their operations to the worker. If w is nil, the handle will function just like a handle created with CreateHandle().
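A sketch of the worker-backed variant, assuming a zero-value gocu.Worker is usable as in the gocu examples:

    w := new(gocu.Worker) // assumed usable as a zero value; see the gocu package
    h := gocudnn.CreateHandleEX(w, true)
    // cudnn calls that take h are now funneled to w's dedicated host thread.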

func (*Handle) Destroy

func (handle *Handle) Destroy() error

Destroy destroys the handle. If the GC is being used it won't do anything.

func (*Handle) GetStream

func (handle *Handle) GetStream() (gocu.Streamer, error)

GetStream will return a stream that the handle is using

func (*Handle) Pointer

func (handle *Handle) Pointer() unsafe.Pointer

Pointer returns an unsafe.Pointer to the handle

func (*Handle) QueryRuntimeError

func (handle *Handle) QueryRuntimeError(mode ErrQueryMode, tag *RuntimeTag) (Status, error)

QueryRuntimeError checks the handle's runtime error state; see cudnnQueryRuntimeError in the cuDNN documentation. tag should be nil.
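For example, a non-blocking query might look like this sketch (the helper name is illustrative):

    func checkAsyncErr(h *gocudnn.Handle) (gocudnn.Status, error) {
        var mode gocudnn.ErrQueryMode
        // tag should be nil; inspect the returned Status for the error state.
        return h.QueryRuntimeError(mode.NonBlocking(), nil)
    }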

func (*Handle) SetStream

func (handle *Handle) SetStream(s gocu.Streamer) error

SetStream sets the stream to be used by the handle

type IndiciesType

type IndiciesType C.cudnnIndicesType_t

IndiciesType holds flags for index types, exposed through the type's methods

func (IndiciesType) String

func (i IndiciesType) String() string

String satisfies stringer interface

func (*IndiciesType) Type16Bit

func (i *IndiciesType) Type16Bit() IndiciesType

Type16Bit sets i to and returns IndiciesType( C.CUDNN_16BIT_INDICES) flag

func (*IndiciesType) Type32Bit

func (i *IndiciesType) Type32Bit() IndiciesType

Type32Bit sets i to and returns IndiciesType( C.CUDNN_32BIT_INDICES) flag

func (*IndiciesType) Type64Bit

func (i *IndiciesType) Type64Bit() IndiciesType

Type64Bit sets i to and returns IndiciesType( C.CUDNN_64BIT_INDICES) flag

func (*IndiciesType) Type8Bit

func (i *IndiciesType) Type8Bit() IndiciesType

Type8Bit sets i to and returns IndiciesType( C.CUDNN_8BIT_INDICES) flag

type LRND

type LRND struct {
	// contains filtered or unexported fields
}

LRND holds the LRN Descriptor

func CreateLRNDescriptor

func CreateLRNDescriptor() (*LRND, error)

CreateLRNDescriptor creates an LRN descriptor

func (*LRND) Destroy

func (l *LRND) Destroy() error

Destroy destroys the descriptor if the GC is not being used; if the GC is on it will just return nil. Currently the GC is always on.

func (*LRND) DivisiveNormalizationBackward

func (l *LRND) DivisiveNormalizationBackward(
	handle *Handle,
	mode DivNormMode,
	alpha float64,
	xD *TensorD, x, means, dy, temp, temp2 cutil.Mem,
	beta float64,
	dXdMeansDesc *TensorD, dx, dMeans cutil.Mem,
) error

DivisiveNormalizationBackward performs the LCN/divisive normalization backward computation. Double parameters are cast to the tensor data type

func (*LRND) DivisiveNormalizationBackwardUS

func (l *LRND) DivisiveNormalizationBackwardUS(
	handle *Handle,
	mode DivNormMode,
	alpha float64,
	xD *TensorD, x, means, dy, temp, temp2 unsafe.Pointer,
	beta float64,
	dXdMeansDesc *TensorD, dx, dMeans unsafe.Pointer,
) error

DivisiveNormalizationBackwardUS is like DivisiveNormalizationBackward but using unsafe.Pointer instead of cutil.Mem

func (*LRND) DivisiveNormalizationForward

func (l *LRND) DivisiveNormalizationForward(
	handle *Handle,
	mode DivNormMode,
	alpha float64,
	xD TensorD, x, means, temp, temp2 cutil.Mem,
	beta float64,
	yD TensorD, y cutil.Mem,
) error

DivisiveNormalizationForward performs the LCN/divisive normalization forward computation: y = alpha * normalize(x) + beta * y

func (*LRND) DivisiveNormalizationForwardUS

func (l *LRND) DivisiveNormalizationForwardUS(
	handle *Handle,
	mode DivNormMode,
	alpha float64,
	xD TensorD, x, means, temp, temp2 unsafe.Pointer,
	beta float64,
	yD TensorD, y unsafe.Pointer,
) error

DivisiveNormalizationForwardUS is like DivisiveNormalizationForward but using unsafe.Pointer instead of cutil.Mem

func (*LRND) Get

func (l *LRND) Get() (lrnN uint32, lrnAlpha float64, lrnBeta float64, lrnK float64, err error)

Get returns the descriptor values that were set with Set

func (*LRND) LRNCrossChannelBackward

func (l *LRND) LRNCrossChannelBackward(
	handle *Handle,
	mode LRNmode,
	alpha float64,
	yD *TensorD, y cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	xD *TensorD, x cutil.Mem,
	beta float64,
	dxD *TensorD, dx cutil.Mem,
) error

LRNCrossChannelBackward LRN cross-channel backward computation. Double parameters cast to tensor data type

func (*LRND) LRNCrossChannelBackwardUS

func (l *LRND) LRNCrossChannelBackwardUS(
	handle *Handle,
	mode LRNmode,
	alpha float64,
	yD *TensorD, y unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer,
) error

LRNCrossChannelBackwardUS is like LRNCrossChannelBackward but using unsafe.Pointer instead of cutil.Mem

func (*LRND) LRNCrossChannelForward

func (l *LRND) LRNCrossChannelForward(
	handle *Handle,
	mode LRNmode,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem,
) error

LRNCrossChannelForward LRN cross-channel forward computation. Double parameters cast to tensor data type

func (*LRND) LRNCrossChannelForwardUS

func (l *LRND) LRNCrossChannelForwardUS(
	handle *Handle,
	mode LRNmode,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	yD *TensorD, y unsafe.Pointer,
) error

LRNCrossChannelForwardUS is like LRNCrossChannelForward but using unsafe.Pointer instead of cutil.Mem

func (LRND) MaxN

func (l LRND) MaxN() uint32

MaxN returns the constant lrnmaxN

func (LRND) MinBeta

func (l LRND) MinBeta() float64

MinBeta returns lrnminBeta constant

func (LRND) MinK

func (l LRND) MinK() float64

MinK returns lrnminK constant

func (LRND) MinN

func (l LRND) MinN() uint32

MinN returns the constant lrnminN

func (*LRND) Set

func (l *LRND) Set(lrnN uint32,
	lrnAlpha,
	lrnBeta,
	lrnK float64) error

Set sets the LRND
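A minimal sketch, with illustrative values chosen inside the advertised bounds (MinN/MaxN, MinBeta, MinK); the helper name is illustrative:

    func setupLRN() (*gocudnn.LRND, error) {
        l, err := gocudnn.CreateLRNDescriptor()
        if err != nil {
            return nil, err
        }
        // Illustrative values: lrnN must lie in [l.MinN(), l.MaxN()];
        // lrnBeta must be at least l.MinBeta() and lrnK at least l.MinK().
        if err = l.Set(5, 1e-4, 0.75, 2.0); err != nil {
            return nil, err
        }
        return l, nil
    }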

func (*LRND) String

func (l *LRND) String() string

type LRNmode

type LRNmode C.cudnnLRNMode_t

LRNmode is used for the LRN mode flags

func (*LRNmode) CrossChanelDim1

func (l *LRNmode) CrossChanelDim1() LRNmode

CrossChanelDim1 sets l to and returns LRNmode( C.CUDNN_LRN_CROSS_CHANNEL_DIM1)

func (LRNmode) String

func (l LRNmode) String() string

type MathType

type MathType C.cudnnMathType_t

MathType holds flags for cudnnMathType_t, set and returned through the type's methods

func (*MathType) AllowConversion

func (m *MathType) AllowConversion() MathType

AllowConversion sets m to and returns MathType(C.CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION)

func (*MathType) Default

func (m *MathType) Default() MathType

Default sets m to MathType(C.CUDNN_DEFAULT_MATH) and returns changed value

func (MathType) String

func (m MathType) String() string

String satisfies the stringer interface

func (*MathType) TensorOpMath

func (m *MathType) TensorOpMath() MathType

TensorOpMath sets m to and returns MathType(C.CUDNN_TENSOR_OP_MATH)

type MultiHeadAttnWeightKind

type MultiHeadAttnWeightKind C.cudnnMultiHeadAttnWeightKind_t

MultiHeadAttnWeightKind is a flag for the kind of weights used. Flags are exposed through the type's methods.

func (*MultiHeadAttnWeightKind) Keys

func (m *MultiHeadAttnWeightKind) Keys() MultiHeadAttnWeightKind

Keys sets the value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_K_WEIGHTS) and returns that value. From cudnn.h: input projection weights for 'keys'

func (*MultiHeadAttnWeightKind) Output

func (m *MultiHeadAttnWeightKind) Output() MultiHeadAttnWeightKind

Output sets the value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_O_WEIGHTS) and returns that value. From cudnn.h: output projection weights

func (*MultiHeadAttnWeightKind) Queries

func (m *MultiHeadAttnWeightKind) Queries() MultiHeadAttnWeightKind

Queries sets the value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_Q_WEIGHTS) and returns that value. From cudnn.h: input projection weights for 'queries'

func (MultiHeadAttnWeightKind) String

func (m MultiHeadAttnWeightKind) String() string

func (*MultiHeadAttnWeightKind) Values

func (m *MultiHeadAttnWeightKind) Values() MultiHeadAttnWeightKind

Values sets the value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_V_WEIGHTS) and returns that value. From cudnn.h: input projection weights for 'values'

type NANProp

type NANProp C.cudnnNanPropagation_t

NANProp is the type for C.cudnnNanPropagation_t. It is used for flags, which are set and returned through the type's methods.

func (*NANProp) NotPropigate

func (p *NANProp) NotPropigate() NANProp

NotPropigate sets p to NANProp(C.CUDNN_NOT_PROPAGATE_NAN) and returns that value

func (*NANProp) Propigate

func (p *NANProp) Propigate() NANProp

Propigate sets p to NANProp(C.CUDNN_PROPAGATE_NAN) and returns that value

func (NANProp) String

func (p NANProp) String() string

String satisfies stringer interface.

type OPTensorD

type OPTensorD struct {
	// contains filtered or unexported fields
}

OPTensorD holds OP Tensor information

func CreateOpTensorDescriptor

func CreateOpTensorDescriptor() (*OPTensorD, error)

CreateOpTensorDescriptor creates an OPTensorD; set it with Set before use

func (*OPTensorD) Destroy

func (t *OPTensorD) Destroy() error

Destroy destroys the descriptor

func (*OPTensorD) Get

func (t *OPTensorD) Get() (op OpTensorOp, dtype DataType, nan NANProp, err error)

Get returns the descriptor information with error

func (*OPTensorD) OpTensor

func (t *OPTensorD) OpTensor(
	handle *Handle,
	alpha1 float64,
	aD *TensorD, A cutil.Mem,
	alpha2 float64,
	bD *TensorD, B cutil.Mem,
	beta float64,
	cD *TensorD, cmem cutil.Mem) error

OpTensor performs an operation on some tensors: C = op((alpha1 * A), (alpha2 * B)) + (beta * C)
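For example, an element-wise add C = (1*A) + (1*B) could be sketched like this, assuming aD, bD, and cD are previously set *TensorD, A, B, and C are device buffers implementing cutil.Mem (github.com/dereklstinson/cutil), and a Float() DataType flag method exists; the helper name is illustrative:

    func addTensors(h *gocudnn.Handle, aD, bD, cD *gocudnn.TensorD, A, B, C cutil.Mem) error {
        op, err := gocudnn.CreateOpTensorDescriptor()
        if err != nil {
            return err
        }
        var add gocudnn.OpTensorOp
        var dtype gocudnn.DataType
        var nan gocudnn.NANProp
        if err = op.Set(add.Add(), dtype.Float(), nan.NotPropigate()); err != nil {
            return err
        }
        // beta = 0 overwrites C rather than blending with its old values.
        return op.OpTensor(h, 1, aD, A, 1, bD, B, 0, cD, C)
    }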

func (*OPTensorD) OpTensorUS

func (t *OPTensorD) OpTensorUS(
	handle *Handle,
	alpha1 float64,
	aD *TensorD, A unsafe.Pointer,
	alpha2 float64,
	bD *TensorD, B unsafe.Pointer,
	beta float64,
	cD *TensorD, cmem unsafe.Pointer) error

OpTensorUS is like OpTensor but uses unsafe.Pointer instead of cutil.Mem

func (*OPTensorD) Set

func (t *OPTensorD) Set(op OpTensorOp, dtype DataType, nan NANProp) error

Set sets the OPTensorD.

func (*OPTensorD) String

func (t *OPTensorD) String() string

type OpTensorOp

type OpTensorOp C.cudnnOpTensorOp_t

OpTensorOp is used for flags for the OpTensor functions

func (*OpTensorOp) Add

func (o *OpTensorOp) Add() OpTensorOp

Add sets o to OpTensorOp(C.CUDNN_OP_TENSOR_ADD) and returns the new value

func (*OpTensorOp) Max

func (o *OpTensorOp) Max() OpTensorOp

Max sets o to OpTensorOp(C.CUDNN_OP_TENSOR_MAX) and returns the new value

func (*OpTensorOp) Min

func (o *OpTensorOp) Min() OpTensorOp

Min sets o to OpTensorOp(C.CUDNN_OP_TENSOR_MIN) and returns the new value

func (*OpTensorOp) Mul

func (o *OpTensorOp) Mul() OpTensorOp

Mul sets o to OpTensorOp(C.CUDNN_OP_TENSOR_MUL) and returns the new value

func (*OpTensorOp) Not

func (o *OpTensorOp) Not() OpTensorOp

Not sets o to OpTensorOp(C.CUDNN_OP_TENSOR_NOT) and returns the new value

func (*OpTensorOp) Sqrt

func (o *OpTensorOp) Sqrt() OpTensorOp

Sqrt sets o to OpTensorOp(C.CUDNN_OP_TENSOR_SQRT) and returns the new value

func (OpTensorOp) String

func (o OpTensorOp) String() string

type PersistentRNNPlan

type PersistentRNNPlan struct {
	// contains filtered or unexported fields
}

PersistentRNNPlan holds C.cudnnPersistentRNNPlan_t

func (*PersistentRNNPlan) DestroyPersistentRNNPlan

func (p *PersistentRNNPlan) DestroyPersistentRNNPlan() error

DestroyPersistentRNNPlan destroys the C.cudnnPersistentRNNPlan_t in the PersistentRNNPlan struct

type PoolingD

type PoolingD struct {
	// contains filtered or unexported fields
}

PoolingD handles the pooling descriptor

func CreatePoolingDescriptor

func CreatePoolingDescriptor() (*PoolingD, error)

CreatePoolingDescriptor creates a pooling descriptor.

func (*PoolingD) Backward

func (p *PoolingD) Backward(
	handle *Handle,
	alpha float64,
	yD *TensorD, y cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	xD *TensorD, x cutil.Mem,
	beta float64,
	dxD *TensorD, dx cutil.Mem,
) error

Backward does the backward pooling operation

func (*PoolingD) BackwardUS

func (p *PoolingD) BackwardUS(
	handle *Handle,
	alpha float64,
	yD *TensorD, y unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer,
) error

BackwardUS is like Backward but uses unsafe.Pointer instead of cutil.Mem

func (*PoolingD) Destroy

func (p *PoolingD) Destroy() error

Destroy destroys the pooling descriptor.

Right now gocudnn is handled by the Go GC exclusively, but at some point in the future users of the package will be able to toggle it.

func (*PoolingD) Forward

func (p *PoolingD) Forward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem,
) error

Forward does the poolingForward operation

func (*PoolingD) ForwardUS

func (p *PoolingD) ForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	yD *TensorD, y unsafe.Pointer,
) error

ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem

func (*PoolingD) Get

func (p *PoolingD) Get() (mode PoolingMode, nan NANProp, window, padding, stride []int32, err error)

Get gets the descriptor values for pooling

func (*PoolingD) GetOutputDims

func (p *PoolingD) GetOutputDims(
	input *TensorD,
) ([]int32, error)

GetOutputDims will return the forward output dims from the pooling descriptor and the tensor passed. For NHWC, gocudnn will take the cudnn dims (which are in NCHW) and convert them to NHWC.

func (*PoolingD) Set

func (p *PoolingD) Set(mode PoolingMode, nan NANProp, window, padding, stride []int32) error

Set sets the pooling descriptor to the values passed
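A sketch of a 2x2 max pool with stride 2 and no padding, followed by querying the output dims for a previously set input descriptor xD; the helper name is illustrative:

    func setupPool(xD *gocudnn.TensorD) (*gocudnn.PoolingD, []int32, error) {
        p, err := gocudnn.CreatePoolingDescriptor()
        if err != nil {
            return nil, nil, err
        }
        var mode gocudnn.PoolingMode
        var nan gocudnn.NANProp
        window, padding, stride := []int32{2, 2}, []int32{0, 0}, []int32{2, 2}
        if err = p.Set(mode.Max(), nan.NotPropigate(), window, padding, stride); err != nil {
            return nil, nil, err
        }
        outDims, err := p.GetOutputDims(xD) // e.g. {N, C, H/2, W/2} for NCHW input
        return p, outDims, err
    }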

func (*PoolingD) String

func (p *PoolingD) String() string

type PoolingMode

type PoolingMode C.cudnnPoolingMode_t

PoolingMode is used for flags in pooling

func (*PoolingMode) AverageCountExcludePadding

func (p *PoolingMode) AverageCountExcludePadding() PoolingMode

AverageCountExcludePadding returns PoolingMode(C.CUDNN_POOLING_AVERAGE_COUNT_EXCLUDE_PADDING) flag

Values inside the pooling window are averaged. The number of elements used to calculate the average excludes spatial locations falling in the padding region.

func (*PoolingMode) AverageCountIncludePadding

func (p *PoolingMode) AverageCountIncludePadding() PoolingMode

AverageCountIncludePadding returns PoolingMode(C.CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING) flag

Values inside the pooling window are averaged. The number of elements used to calculate the average includes spatial locations falling in the padding region.

func (*PoolingMode) Max

func (p *PoolingMode) Max() PoolingMode

Max returns PoolingMode(C.CUDNN_POOLING_MAX) flag

The maximum value inside the pooling window is used.

func (*PoolingMode) MaxDeterministic

func (p *PoolingMode) MaxDeterministic() PoolingMode

MaxDeterministic returns PoolingMode(C.CUDNN_POOLING_MAX_DETERMINISTIC) flag

The maximum value inside the pooling window is used. The algorithm used is deterministic.

func (PoolingMode) String

func (p PoolingMode) String() string

type RNNAlgo

type RNNAlgo C.cudnnRNNAlgo_t

RNNAlgo is used for flags and exposes the different flags through its methods

func (RNNAlgo) Algo

func (r RNNAlgo) Algo() Algorithm

Algo returns an Algorithm for the RNNAlgo

func (*RNNAlgo) Count

func (r *RNNAlgo) Count() RNNAlgo

Count sets r to and returns RNNAlgo( C.CUDNN_RNN_ALGO_COUNT) flag

func (*RNNAlgo) PersistDynamic

func (r *RNNAlgo) PersistDynamic() RNNAlgo

PersistDynamic sets r to and returns RNNAlgo( C.CUDNN_RNN_ALGO_PERSIST_DYNAMIC) flag

func (*RNNAlgo) PersistStatic

func (r *RNNAlgo) PersistStatic() RNNAlgo

PersistStatic sets r to and returns RNNAlgo( C.CUDNN_RNN_ALGO_PERSIST_STATIC) flag

func (*RNNAlgo) Standard

func (r *RNNAlgo) Standard() RNNAlgo

Standard sets r to and returns RNNAlgo( C.CUDNN_RNN_ALGO_STANDARD) flag

func (RNNAlgo) String

func (r RNNAlgo) String() string

type RNNBiasMode

type RNNBiasMode C.cudnnRNNBiasMode_t

RNNBiasMode handles bias flags for RNN. Flags are exposed through the type's methods

func (*RNNBiasMode) Double

func (b *RNNBiasMode) Double() RNNBiasMode

Double sets b to and returns RNNBiasMode(C.CUDNN_RNN_DOUBLE_BIAS)

func (*RNNBiasMode) NoBias

func (b *RNNBiasMode) NoBias() RNNBiasMode

NoBias sets b to and returns RNNBiasMode(C.CUDNN_RNN_NO_BIAS)

func (*RNNBiasMode) SingleINP

func (b *RNNBiasMode) SingleINP() RNNBiasMode

SingleINP sets b to and returns RNNBiasMode(C.CUDNN_RNN_SINGLE_INP_BIAS)

func (*RNNBiasMode) SingleREC

func (b *RNNBiasMode) SingleREC() RNNBiasMode

SingleREC sets b to and returns RNNBiasMode(C.CUDNN_RNN_SINGLE_REC_BIAS)

func (RNNBiasMode) String

func (b RNNBiasMode) String() string

String satisfies the stringer interface

type RNNClipMode

type RNNClipMode C.cudnnRNNClipMode_t

RNNClipMode is a flag for the clipmode for an RNN

func (*RNNClipMode) MinMax

func (r *RNNClipMode) MinMax() RNNClipMode

MinMax sets r to and returns RNNClipMode(C.CUDNN_RNN_CLIP_MINMAX)

func (*RNNClipMode) None

func (r *RNNClipMode) None() RNNClipMode

None sets r to and returns RNNClipMode(C.CUDNN_RNN_CLIP_NONE)

func (RNNClipMode) String

func (r RNNClipMode) String() string

type RNND

type RNND struct {
	// contains filtered or unexported fields
}

RNND holds the RNN descriptor

func CreateRNNDescriptor

func CreateRNNDescriptor() (desc *RNND, err error)

CreateRNNDescriptor creates an RNND descriptor

func (*RNND) BackwardDataEx

func (r *RNND) BackwardDataEx(h *Handle,
	yD *RNNDataD, y cutil.Mem,
	dyD *RNNDataD, dy cutil.Mem,
	dhyD *TensorD, dhy cutil.Mem,
	dcyD *TensorD, dcy cutil.Mem,
	wD *FilterD, w cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	dxD *RNNDataD, dx cutil.Mem,
	dhxD *TensorD, dhx cutil.Mem,
	dcxD *TensorD, dcx cutil.Mem,
	wspace cutil.Mem, wspacesib uint,
	rspace cutil.Mem, rspacesib uint) error

BackwardDataEx - taken from cudnn documentation: This routine is the extended version of the function cudnnRNNBackwardData. This function cudnnRNNBackwardDataEx allows the user to use unpacked (padded) layout for input y and output dx. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example, the nth sequence in the mini-batch, is composed of a valid segment specified by the seqLengthArray[n] in its corresponding RNNDataDescriptor; and a padding segment to make the combined sequence length equal to maxSeqLength.

With the unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNBackwardData, the sequences in the mini-batch need to be sorted in descending order according to length.

Parameters:

handle - Input. The handle passed to all cudnn funcs; it needs to be initialized before use.

yD -Input. A previously initialized RNN data descriptor.

Must match or be the exact same descriptor previously passed into ForwardTrainingEx.

y -Input. Data pointer to the GPU memory associated with the RNN data descriptor yD.

The vectors are expected to be laid out in memory according to the layout specified by yD.
The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
Must contain the exact same data previously produced by ForwardTrainingEx.

dyD -Input. A previously initialized RNN data descriptor.

The dataType, layout, maxSeqLength, batchSize, vectorSize and seqLengthArray need to match the yD previously passed to ForwardTrainingEx.

dy -Input. Data pointer to the GPU memory associated with the RNN data descriptor dyD.

The vectors are expected to be laid out in memory according to the layout specified by dyD.
The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.

dhyD -Input. A fully packed tensor descriptor describing the gradients at the final hidden state of the RNN.

The first dimension of the tensor depends on the direction argument passed to the (*RNND)Set(params) call used to initialize rnnDesc.
Moreover:
If direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to (*RNND)Set(params).
If direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to (*RNND)Set(params).

The second dimension must match the batchSize parameter in xD.

The third dimension depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled. Moreover:

If RNN mode is CUDNN_LSTM and LSTM projection is enabled, the third dimension must match the recProjSize argument passed to the (*RNND)SetProjectionLayers(params) call used to set rnnDesc. Otherwise, the third dimension must match the hiddenSize argument passed to the (*RNND)Set(params) call used to initialize rnnDesc.

dhy - Input. Data pointer to GPU memory associated with the tensor descriptor dhyD. If a NULL pointer is passed, the gradients at the final hidden state of the network will be initialized to zero.

dcyD - Input. A fully packed tensor descriptor describing the gradients at the final cell state of the RNN. The first dimension of the tensor depends on the direction argument passed to the (*RNND)Set(params) call used to initialize rnnDesc. Moreover:

If direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to (*RNND)Set(params).
If direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to (*RNND)Set(params).
The second dimension must match the first dimension of the tensors described in xD.

The third dimension must match the hiddenSize argument passed to the (*RNND)Set(params) call used to initialize rnnDesc. The tensor must be fully packed.

dcy - Input. Data pointer to GPU memory associated with the tensor descriptor dcyD. If a NULL pointer is passed, the gradients at the final cell state of the network will be initialized to zero.

wD -Input. Handle to a previously initialized filter descriptor describing the weights for the RNN.

w -Input. Data pointer to GPU memory associated with the filter descriptor wD.

hxD -Input. A fully packed tensor descriptor describing the initial hidden state of the RNN. Must match or be the exact same descriptor previously passed into ForwardTrainingEx.

hx -Input. Data pointer to GPU memory associated with the tensor descriptor hxD. If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero. Must contain the exact same data previously passed into ForwardTrainingEx, or be NULL if NULL was previously passed to ForwardTrainingEx.

cxD - Input. A fully packed tensor descriptor describing the initial cell state for LSTM networks. Must match or be the exact same descriptor previously passed into ForwardTrainingEx.

cx -Input. Data pointer to GPU memory associated with the tensor descriptor cxD. If a NULL pointer is passed, the initial cell state of the network will be initialized to zero. Must contain the exact same data previously passed into ForwardTrainingEx, or be NULL if NULL was previously passed to ForwardTrainingEx.

dxD - Input. A previously initialized RNN data descriptor. The dataType, layout, maxSeqLength, batchSize, vectorSize and seqLengthArray need to match that of xD previously passed to ForwardTrainingEx.

dx -Output. Data pointer to the GPU memory associated with the RNN data descriptor dxD. The vectors are expected to be laid out in memory according to the layout specified by dxD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.

dhxD -Input. A fully packed tensor descriptor describing the gradient at the initial hidden state of the RNN. The descriptor must be set exactly the same way as dhyD.

dhx- Output. Data pointer to GPU memory associated with the tensor descriptor dhxD. If a NULL pointer is passed, the gradient at the hidden input of the network will not be set.

dcxD-Input. A fully packed tensor descriptor describing the gradient at the initial cell state of the RNN. The descriptor must be set exactly the same way as dcyD.

dcx -Output. Data pointer to GPU memory associated with the tensor descriptor dcxD. If a NULL pointer is passed, the gradient at the cell input of the network will not be set.

wspace - Input. Data pointer to GPU memory to be used as a wspace for this call.

wspacesib - Input. Specifies the size in bytes of the provided wspace.

rspace - Input/Output. Data pointer to GPU memory to be used as a reserve space for this call.

rspacesib - Input. Specifies the size in bytes of the provided rspace.

func (*RNND) BackwardDataExUS

func (r *RNND) BackwardDataExUS(h *Handle,
	yD *RNNDataD, y unsafe.Pointer,
	dyD *RNNDataD, dy unsafe.Pointer,
	dhyD *TensorD, dhy unsafe.Pointer,
	dcyD *TensorD, dcy unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	dxD *RNNDataD, dx unsafe.Pointer,
	dhxD *TensorD, dhx unsafe.Pointer,
	dcxD *TensorD, dcx unsafe.Pointer,
	wspace unsafe.Pointer, wspacesib uint,
	rspace unsafe.Pointer, rspacesib uint) error

BackwardDataExUS is like BackwardDataEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) BackwardWeights

func (r *RNND) BackwardWeights(
	handle *Handle,
	xD []*TensorD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	yD []*TensorD, y cutil.Mem,
	wspace cutil.Mem, wspacesize uint,
	dwD *FilterD, dw cutil.Mem,
	rspace cutil.Mem, rspacesize uint,
) error

BackwardWeights does the backward weight function

func (*RNND) BackwardWeightsEx

func (r *RNND) BackwardWeightsEx(h *Handle,
	xD *RNNDataD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	yD *RNNDataD, y cutil.Mem,
	wspace cutil.Mem, wspacesib uint,
	dwD *FilterD, dw cutil.Mem,
	rspace cutil.Mem, rspacesib uint,
) error

BackwardWeightsEx - from cudnn documentation: This routine is the extended version of the function cudnnRNNBackwardWeights. This function cudnnRNNBackwardWeightsEx allows the user to use unpacked (padded) layout for input x and output dw. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example, the nth sequence in the mini-batch, is composed of a valid segment specified by the seqLengthArray[n] in its corresponding RNNDataDescriptor; and a padding segment to make the combined sequence length equal to maxSeqLength. With the unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNBackwardWeights, the sequences in the mini-batch need to be sorted in descending order according to length.

Parameters:

handle - Input. Handle to a previously created cuDNN context.

xD - Input. A previously initialized RNN data descriptor. Must match or be the exact same descriptor previously passed into ForwardTrainingEx.

x - Input. Data pointer to GPU memory associated with the tensor descriptors in the array xD. Must contain the exact same data previously passed into ForwardTrainingEx.

hxD - Input. A fully packed tensor descriptor describing the initial hidden state of the RNN.

Must match or be the exact same descriptor previously passed into ForwardTrainingEx.

hx - Input. Data pointer to GPU memory associated with the tensor descriptor hxD.

If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero.
Must contain the exact same data previously passed into ForwardTrainingEx, or be NULL if NULL was previously passed to ForwardTrainingEx.

yD - Input. A previously initialized RNN data descriptor.

Must match or be the exact same descriptor previously passed into ForwardTrainingEx.

y -Input. Data pointer to GPU memory associated with the output tensor descriptor yD.

Must contain the exact same data previously produced by ForwardTrainingEx.

wspace - Input. Data pointer to GPU memory to be used as a wspace for this call.

wspacesib - Input. Specifies the size in bytes of the provided wspace.

dwD- Input. Handle to a previously initialized filter descriptor describing the gradients of the weights for the RNN.

dw - Input/Output. Data pointer to GPU memory associated with the filter descriptor dwD.

rspace - Input. Data pointer to GPU memory to be used as a reserve space for this call.

rspacesib - Input. Specifies the size in bytes of the provided rspace

func (*RNND) BackwardWeightsExUS

func (r *RNND) BackwardWeightsExUS(h *Handle,
	xD *RNNDataD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	yD *RNNDataD, y unsafe.Pointer,
	wspace unsafe.Pointer, wspacesib uint,
	dwD *FilterD, dw unsafe.Pointer,
	rspace unsafe.Pointer, rspacesib uint,
) error

BackwardWeightsExUS is like BackwardWeightsEx but with unsafe.Pointer instead of cutil.Mem

func (*RNND) BackwardWeightsUS

func (r *RNND) BackwardWeightsUS(
	handle *Handle,
	xD []*TensorD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	yD []*TensorD, y unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint,
	dwD *FilterD, dw unsafe.Pointer,
	rspace unsafe.Pointer, rspacesize uint,
) error

BackwardWeightsUS is like BackwardWeights but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) Destroy

func (r *RNND) Destroy() error

Destroy destroys the descriptor. Right now this doesn't do anything because gocudnn uses Go's GC.

func (*RNND) FindRNNBackwardDataAlgorithmEx

func (r *RNND) FindRNNBackwardDataAlgorithmEx(
	handle *Handle,
	yD []*TensorD, y cutil.Mem,
	dyD []*TensorD, dy cutil.Mem,
	dhyD *TensorD, dhy cutil.Mem,
	dcyD *TensorD, dcy cutil.Mem,
	wD *FilterD, w cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	dxD []*TensorD, dx cutil.Mem,
	dhxD *TensorD, dhx cutil.Mem,
	dcxD *TensorD, dcx cutil.Mem,
	findIntensity float32,
	wspace cutil.Mem, wspacesize uint,
	rspace cutil.Mem, rspacesize uint,

) ([]AlgorithmPerformance, error)

FindRNNBackwardDataAlgorithmEx finds a list of Algorithms for backward data. Be careful: this call takes roughly 26 parameters, many of them pointers.

func (*RNND) FindRNNBackwardDataAlgorithmExUS

func (r *RNND) FindRNNBackwardDataAlgorithmExUS(
	handle *Handle,
	yD []*TensorD, y unsafe.Pointer,
	dyD []*TensorD, dy unsafe.Pointer,
	dhyD *TensorD, dhy unsafe.Pointer,
	dcyD *TensorD, dcy unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	dxD []*TensorD, dx unsafe.Pointer,
	dhxD *TensorD, dhx unsafe.Pointer,
	dcxD *TensorD, dcx unsafe.Pointer,
	findIntensity float32,
	wspace unsafe.Pointer, wspacesize uint,
	rspace unsafe.Pointer, rspacesize uint,

) ([]AlgorithmPerformance, error)

FindRNNBackwardDataAlgorithmExUS is like FindRNNBackwardDataAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) FindRNNBackwardWeightsAlgorithmEx

func (r *RNND) FindRNNBackwardWeightsAlgorithmEx(
	handle *Handle,
	xD []*TensorD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	yD []*TensorD, y cutil.Mem,
	findIntensity float32,
	wspace cutil.Mem, wspacesize uint,
	dwD *FilterD, dw cutil.Mem,
	rspace cutil.Mem, rspacesize uint,

) ([]AlgorithmPerformance, error)

FindRNNBackwardWeightsAlgorithmEx returns a list of Algorithms along with their measured performance

func (*RNND) FindRNNBackwardWeightsAlgorithmExUS

func (r *RNND) FindRNNBackwardWeightsAlgorithmExUS(
	handle *Handle,
	xD []*TensorD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	yD []*TensorD, y unsafe.Pointer,
	findIntensity float32,
	wspace unsafe.Pointer, wspacesize uint,
	dwD *FilterD, dw unsafe.Pointer,
	rspace unsafe.Pointer, rspacesize uint,

) ([]AlgorithmPerformance, error)

FindRNNBackwardWeightsAlgorithmExUS is like FindRNNBackwardWeightsAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) FindRNNForwardInferenceAlgorithmEx

func (r *RNND) FindRNNForwardInferenceAlgorithmEx(
	handle *Handle,
	xD []*TensorD,
	x cutil.Mem,
	hxD *TensorD,
	hx cutil.Mem,
	cxD *TensorD,
	cx cutil.Mem,
	wD *FilterD,
	w cutil.Mem,
	yD []*TensorD,
	y cutil.Mem,
	hyD *TensorD,
	hy cutil.Mem,
	cyD *TensorD,
	cy cutil.Mem,
	findIntensity float32,
	wspace cutil.Mem, wspacesize uint,
) ([]AlgorithmPerformance, error)

FindRNNForwardInferenceAlgorithmEx finds a list of inference Algorithms

func (*RNND) FindRNNForwardInferenceAlgorithmExUS

func (r *RNND) FindRNNForwardInferenceAlgorithmExUS(
	handle *Handle,
	xD []*TensorD,
	x unsafe.Pointer,
	hxD *TensorD,
	hx unsafe.Pointer,
	cxD *TensorD,
	cx unsafe.Pointer,
	wD *FilterD,
	w unsafe.Pointer,
	yD []*TensorD,
	y unsafe.Pointer,
	hyD *TensorD,
	hy unsafe.Pointer,
	cyD *TensorD,
	cy unsafe.Pointer,
	findIntensity float32,
	wspace unsafe.Pointer, wspacesize uint,
) ([]AlgorithmPerformance, error)

FindRNNForwardInferenceAlgorithmExUS is like FindRNNForwardInferenceAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) FindRNNForwardTrainingAlgorithmEx

func (r *RNND) FindRNNForwardTrainingAlgorithmEx(
	handle *Handle,
	xD []*TensorD,
	x cutil.Mem,
	hxD *TensorD,
	hx cutil.Mem,
	cxD *TensorD,
	cx cutil.Mem,
	wD *FilterD,
	w cutil.Mem,
	yD []*TensorD,
	y cutil.Mem,
	hyD *TensorD,
	hy cutil.Mem,
	cyD *TensorD,
	cy cutil.Mem,
	findIntensity float32,
	reqAlgocount int32,
	wspace cutil.Mem,
	wspacesize uint,
	rspace cutil.Mem,
	rspacesize uint,

) ([]AlgorithmPerformance, error)

FindRNNForwardTrainingAlgorithmEx finds and orders by performance the RNN Algorithms for training, returning that list along with an error

func (*RNND) FindRNNForwardTrainingAlgorithmExUS

func (r *RNND) FindRNNForwardTrainingAlgorithmExUS(
	handle *Handle,
	xD []*TensorD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	yD []*TensorD, y unsafe.Pointer,
	hyD *TensorD, hy unsafe.Pointer,
	cyD *TensorD, cy unsafe.Pointer,
	findIntensity float32,
	wspace unsafe.Pointer, wspacesize uint,
	rspace unsafe.Pointer, rspacesize uint,

) ([]AlgorithmPerformance, error)

FindRNNForwardTrainingAlgorithmExUS is like FindRNNForwardTrainingAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) ForwardInferenceEx

func (r *RNND) ForwardInferenceEx(
	h *Handle,
	xD *RNNDataD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	wD *FilterD, w cutil.Mem,
	yD *RNNDataD, y cutil.Mem,
	hyD *TensorD, hy cutil.Mem,
	cyD *TensorD, cy cutil.Mem,
	wspace cutil.Mem, wspacesib uint,
) error

ForwardInferenceEx - from cudnn documentation: This routine is the extended version of the cudnnRNNForwardInference function. ForwardInferenceEx allows the user to use unpacked (padded) layout for input x and output y. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example, the nth sequence in the mini-batch, is composed of a valid segment, specified by the seqLengthArray[n] in its corresponding RNNDataDescriptor, and a padding segment to make the combined sequence length equal to maxSeqLength.

With unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNForwardInference, the sequences in the mini-batch need to be sorted in descending order according to length.

Parameters

handle - Input. Handle to a previously created cuDNN context.

xD - Input. A previously initialized RNN Data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray need to match that of yD.

x - Input. Data pointer to the GPU memory associated with the RNN data descriptor xD. The vectors are expected to be laid out in memory according to the layout specified by xD.

The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.

hxD - Input. A fully packed tensor descriptor describing the initial hidden state of the RNN. The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc:

If direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
If direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter described in xD.
The third dimension depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled. In specific:
If RNN mode is CUDNN_LSTM and LSTM projection is enabled, the third dimension must match the recProjSize argument passed to cudnnSetRNNProjectionLayers call used to set rnnDesc.
Otherwise, the third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc.

hx - Input. Data pointer to GPU memory associated with the tensor descriptor hxD. If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero.

cxD -Input. A fully packed tensor descriptor describing the initial cell state for LSTM networks.

The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc:
If direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
If direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter in xD. The third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc.

cx - Input. Data pointer to GPU memory associated with the tensor descriptor cxD.

If a NULL pointer is passed, the initial cell state of the network will be initialized to zero.

wD - Input. Handle to a previously initialized filter descriptor describing the weights for the RNN.

w - Input. Data pointer to GPU memory associated with the filter descriptor wD.

yD - Input. A previously initialized RNN data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray must match that of dyD and dxD.

The parameter vectorSize depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled and whether the network is bidirectional.
In specific: For uni-directional network, if RNN mode is CUDNN_LSTM and LSTM projection is enabled,
the parameter vectorSize must match the recProjSize argument passed to cudnnSetRNNProjectionLayers call used to set rnnDesc.
If the network is bidirectional, then multiply the value by 2.
Otherwise, for uni-directional network, the parameter vectorSize must match the hiddenSize argument passed
to the cudnnSetRNNDescriptor call used to initialize rnnDesc. If the network is bidirectional, then multiply the value by 2.

y - Output. Data pointer to the GPU memory associated with the RNN data descriptor yD.

The vectors are expected to be laid out in memory according to the layout specified by yD.
The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.

hyD - Input. A fully packed tensor descriptor describing the final hidden state of the RNN. The descriptor must be set exactly the same way as hxD.

hy - Output. Data pointer to GPU memory associated with the tensor descriptor hyD. If a NULL pointer is passed, the final hidden state of the network will not be saved.

cyD - Input. A fully packed tensor descriptor describing the final cell state for LSTM networks. The descriptor must be set exactly the same way as cxD.

cy -Output. Data pointer to GPU memory associated with the tensor descriptor cyD. If a NULL pointer is passed, the final cell state of the network will be not be saved.

wspace - Input. Data pointer to GPU memory to be used as a wspace for this call.

wspacesib - Input. Specifies the size in bytes of the provided wspace.

func (*RNND) ForwardInferenceExUS

func (r *RNND) ForwardInferenceExUS(
	h *Handle,
	xD *RNNDataD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	yD *RNNDataD, y unsafe.Pointer,
	hyD *TensorD, hy unsafe.Pointer,
	cyD *TensorD, cy unsafe.Pointer,
	wspace unsafe.Pointer, wspacesib uint,
) error

ForwardInferenceExUS is like ForwardInferenceEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) ForwardTrainingEx

func (r *RNND) ForwardTrainingEx(h *Handle,
	xD *RNNDataD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	wD *FilterD, w cutil.Mem,
	yD *RNNDataD, y cutil.Mem,
	hyD *TensorD, hy cutil.Mem,
	cyD *TensorD, cy cutil.Mem,
	wspace cutil.Mem, wspacesib uint,
	rspace cutil.Mem, rspacesib uint) error

ForwardTrainingEx - from cudnn documentation: This routine is the extended version of the cudnnRNNForwardTraining function. ForwardTrainingEx allows the user to use unpacked (padded) layout for input x and output y. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example, the nth sequence in the mini-batch, is composed of a valid segment specified by the seqLengthArray[n] in its corresponding RNNDataDescriptor; and a padding segment to make the combined sequence length equal to maxSeqLength. With the unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNForwardTraining, the sequences in the mini-batch need to be sorted in descending order according to length.

Parameters:

handle - Input. Handle to a previously created cuDNN context.

xD - Input. A previously initialized RNN Data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray need to match that of yD.

x - Input. Data pointer to the GPU memory associated with the RNN data descriptor xD.

The input vectors are expected to be laid out in memory according to the layout specified by xD.
The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.

hxD - Input. A fully packed tensor descriptor describing the initial hidden state of the RNN.

The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc. Moreover:
If direction is CUDNN_UNIDIRECTIONAL then the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
If direction is CUDNN_BIDIRECTIONAL then the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter in xD.
The third dimension depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled. Moreover:
If RNN mode is CUDNN_LSTM and LSTM projection is enabled, the third dimension must match the
recProjSize argument passed to cudnnSetRNNProjectionLayers call used to set rnnDesc.
Otherwise, the third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc .

hx - Input. Data pointer to GPU memory associated with the tensor descriptor hxD.

If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero.

cxD - Input. A fully packed tensor descriptor describing the initial cell state for LSTM networks.

The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc. Moreover:
If direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
If direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the first dimension of the tensors described in xD.
The third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc. The tensor must be fully packed.

cx - Input. Data pointer to GPU memory associated with the tensor descriptor cxD. If a NULL pointer

is passed, the initial cell state of the network will be initialized to zero.

wD - Input. Handle to a previously initialized filter descriptor describing the weights for the RNN.

w- Input. Data pointer to GPU memory associated with the filter descriptor wD.

yD - Input. A previously initialized RNN data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray need to match that of dyD and dxD. The parameter vectorSize depends on whether RNN mode is CUDNN_LSTM, whether LSTM projection is enabled, and whether the network is bidirectional. In specific: for a uni-directional network, if RNN mode is CUDNN_LSTM and LSTM projection is enabled, the parameter vectorSize must match the recProjSize argument passed to the cudnnSetRNNProjectionLayers call used to set rnnDesc; if the network is bidirectional, then multiply the value by 2. Otherwise, for a uni-directional network, the parameter vectorSize must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc; if the network is bidirectional, then multiply the value by 2.

y - Output. Data pointer to GPU memory associated with the RNN data descriptor yD.

The input vectors are expected to be laid out in memory according to the layout
specified by yD. The elements in the tensor (including elements in the padding vector)
must be densely packed, and no strides are supported.

hyD - Input. A fully packed tensor descriptor describing the final hidden state of the RNN. The descriptor must be set exactly the same as hxD.

hy - Output. Data pointer to GPU memory associated with the tensor descriptor hyD. If a NULL pointer is passed, the final hidden state of the network will not be saved.

cyD - Input. A fully packed tensor descriptor describing the final cell state for LSTM networks. The descriptor must be set exactly the same as cxD.

cy- Output. Data pointer to GPU memory associated with the tensor descriptor cyD. If a NULL pointer is passed, the final cell state of the network will be not be saved.

wspace - Input. Data pointer to GPU memory to be used as a wspace for this call.

wspacesib - Input. Specifies the size in bytes of the provided wspace.

rspace -Input/Output. Data pointer to GPU memory to be used as a reserve space for this call.

rspacesib - Input. Specifies the size in bytes of the provided rspace

func (*RNND) ForwardTrainingExUS

func (r *RNND) ForwardTrainingExUS(h *Handle,
	xD *RNNDataD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	yD *RNNDataD, y unsafe.Pointer,
	hyD *TensorD, hy unsafe.Pointer,
	cyD *TensorD, cy unsafe.Pointer,
	wspace unsafe.Pointer, wspacesib uint,
	rspace unsafe.Pointer, rspacesib uint) error

ForwardTrainingExUS is like ForwardTrainingEx but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) Get

Get gets the RNND values that were set with Set

func (*RNND) GetBiasMode

func (r *RNND) GetBiasMode() (bmode RNNBiasMode, err error)

GetBiasMode gets bias mode for descriptor

func (*RNND) GetClip

func (r *RNND) GetClip(h *Handle) (mode RNNClipMode, nanprop NANProp, lclip, rclip float64, err error)

GetClip returns the clip settings for the descriptor

func (*RNND) GetLinLayerMatrixParams

func (r *RNND) GetLinLayerMatrixParams(
	handle *Handle,
	pseudoLayer int32,

	xD *TensorD,
	wD *FilterD, w cutil.Mem,
	linlayerID int32,

) (FilterD, unsafe.Pointer, error)

GetLinLayerMatrixParams gets the parameters of the layer matrix

func (*RNND) GetLinLayerMatrixParamsUS

func (r *RNND) GetLinLayerMatrixParamsUS(
	handle *Handle,
	pseudoLayer int32,

	xD *TensorD,
	wD *FilterD, w unsafe.Pointer,
	linlayerID int32,

) (FilterD, cutil.Mem, error)

GetLinLayerMatrixParamsUS is like GetLinLayerMatrixParams but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) GetPaddingMode

func (r *RNND) GetPaddingMode() (mode RNNPaddingMode, err error)

GetPaddingMode gets padding mode for the descriptor

func (*RNND) GetParamsSIB

func (r *RNND) GetParamsSIB(
	handle *Handle,
	xD *TensorD,
	data DataType,
) (uint, error)

GetParamsSIB gets the RNN parameter size in bytes

func (*RNND) GetProjectionLayers

func (r *RNND) GetProjectionLayers(
	handle *Handle,
) (int32, int32, error)

GetProjectionLayers gets the RNN projection layers

func (*RNND) GetRNNLinLayerBiasParams

func (r *RNND) GetRNNLinLayerBiasParams(
	handle *Handle,
	pseudoLayer int32,

	xD *TensorD,
	wD *FilterD,
	w cutil.Mem,
	linlayerID int32,

) (BiasD *FilterD, Bias cutil.Mem, err error)

GetRNNLinLayerBiasParams gets the parameters of the layer bias

func (*RNND) GetRNNLinLayerBiasParamsUS

func (r *RNND) GetRNNLinLayerBiasParamsUS(
	handle *Handle,
	pseudoLayer int32,

	xD *TensorD,
	wD *FilterD,
	w unsafe.Pointer,
	linlayerID int32,

) (BiasD *FilterD, Bias unsafe.Pointer, err error)

GetRNNLinLayerBiasParamsUS is like GetRNNLinLayerBiasParams but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) GetRNNMatrixMathType

func (r *RNND) GetRNNMatrixMathType() (MathType, error)

GetRNNMatrixMathType gets the math type for the descriptor

func (*RNND) GetReserveSIB

func (r *RNND) GetReserveSIB(
	handle *Handle,
	seqLength int32,
	xD []*TensorD,
) (uint, error)

GetReserveSIB gets the training reserve size

func (*RNND) GetWorkspaceSIB

func (r *RNND) GetWorkspaceSIB(
	handle *Handle,
	seqLength int32,
	xD []*TensorD,
) (uint, error)

GetWorkspaceSIB gets the RNN workspace size in bytes

func (*RNND) NewPersistentRNNPlan

func (r *RNND) NewPersistentRNNPlan(minibatch int32, data DataType) (plan *PersistentRNNPlan, err error)

NewPersistentRNNPlan creates and sets a PersistentRNNPlan

func (*RNND) RNNBackwardData

func (r *RNND) RNNBackwardData(
	handle *Handle,
	yD []*TensorD, y cutil.Mem,
	dyD []*TensorD, dy cutil.Mem,
	dhyD *TensorD, dhy cutil.Mem,
	dcyD *TensorD, dcy cutil.Mem,
	wD *FilterD, w cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	dxD []*TensorD, dx cutil.Mem,
	dhxD *TensorD, dhx cutil.Mem,
	dcxD *TensorD, dcx cutil.Mem,
	wspace cutil.Mem, wspacesize uint,
	rspace cutil.Mem, rspacesize uint,
) error

RNNBackwardData is the backward algo for an RNN

func (*RNND) RNNBackwardDataUS

func (r *RNND) RNNBackwardDataUS(
	handle *Handle,
	yD []*TensorD, y unsafe.Pointer,
	dyD []*TensorD, dy unsafe.Pointer,
	dhyD *TensorD, dhy unsafe.Pointer,
	dcyD *TensorD, dcy unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	dxD []*TensorD, dx unsafe.Pointer,
	dhxD *TensorD, dhx unsafe.Pointer,
	dcxD *TensorD, dcx unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint,
	rspace unsafe.Pointer, rspacesize uint,
) error

RNNBackwardDataUS is like RNNBackwardData but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) RNNForwardInference

func (r *RNND) RNNForwardInference(
	handle *Handle,
	xD []*TensorD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	wD *FilterD, w cutil.Mem,
	yD []*TensorD, y cutil.Mem,
	hyD TensorD, hy cutil.Mem,
	cyD TensorD, cy cutil.Mem,
	wspace cutil.Mem, wspacesize uint,

) error

RNNForwardInference is the forward inference

func (*RNND) RNNForwardInferenceUS

func (r *RNND) RNNForwardInferenceUS(
	handle *Handle,
	xD []*TensorD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	yD []*TensorD, y unsafe.Pointer,
	hyD TensorD, hy unsafe.Pointer,
	cyD TensorD, cy unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint,

) error

RNNForwardInferenceUS is like RNNForwardInference but uses unsafe.Pointer instead of cutil.Mem

func (*RNND) RNNForwardTraining

func (r *RNND) RNNForwardTraining(
	handle *Handle,
	xD []*TensorD, x cutil.Mem,
	hxD *TensorD, hx cutil.Mem,
	cxD *TensorD, cx cutil.Mem,
	wD *FilterD, w cutil.Mem,
	yD []*TensorD, y cutil.Mem,
	hyD *TensorD, hy cutil.Mem,
	cyD *TensorD, cy cutil.Mem,
	wspace cutil.Mem, wspacesize uint,
	rspace cutil.Mem, rspacesize uint,
) error

RNNForwardTraining is the forward algo for an RNN

func (*RNND) RNNForwardTrainingUS

func (r *RNND) RNNForwardTrainingUS(
	handle *Handle,
	xD []*TensorD, x unsafe.Pointer,
	hxD *TensorD, hx unsafe.Pointer,
	cxD *TensorD, cx unsafe.Pointer,
	wD *FilterD, w unsafe.Pointer,
	yD []*TensorD, y unsafe.Pointer,
	hyD *TensorD, hy unsafe.Pointer,
	cyD *TensorD, cy unsafe.Pointer,
	wspace unsafe.Pointer, wspacesize uint,
	rspace unsafe.Pointer, rspacesize uint,
) error

RNNForwardTrainingUS is like RNNForwardTraining but using unsafe.Pointer instead of cutil.Mem

func (*RNND) Set

func (r *RNND) Set(
	handle *Handle,
	hiddenSize int32,
	numLayers int32,
	doD *DropOutD,
	inputmode RNNInputMode,
	direction DirectionMode,
	rnnmode RNNmode,
	rnnalg RNNAlgo,
	data DataType,

) error

Set sets the RNN descriptor
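A sketch of setting up a 2-layer LSTM with hidden size 256, assuming h is a *Handle, r is a created *RNND, doD is a previously set *DropOutD, and a Float() DataType flag method exists; the DirectionMode flag is left at its zero value here and should be set through its methods in real code. The helper name is illustrative:

    func setupRNN(h *gocudnn.Handle, r *gocudnn.RNND, doD *gocudnn.DropOutD) error {
        var inmode gocudnn.RNNInputMode
        var rnnmode gocudnn.RNNmode
        var algo gocudnn.RNNAlgo
        var dtype gocudnn.DataType
        var dir gocudnn.DirectionMode // set through its flag methods in real code
        return r.Set(h, 256, 2, doD, inmode.Linear(), dir,
            rnnmode.Lstm(), algo.Standard(), dtype.Float())
    }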

func (*RNND) SetAlgorithmDescriptor

func (r *RNND) SetAlgorithmDescriptor(
	handle *Handle,
	algo *AlgorithmD,
) error

SetAlgorithmDescriptor sets the RNN algorithm descriptor

func (*RNND) SetBiasMode

func (r *RNND) SetBiasMode(bmode RNNBiasMode) error

SetBiasMode sets the bias mode for descriptor

func (*RNND) SetClip

func (r *RNND) SetClip(h *Handle, mode RNNClipMode, nanprop NANProp, lclip, rclip float64) error

SetClip sets the clip mode into descriptor

func (*RNND) SetPaddingMode

func (r *RNND) SetPaddingMode(mode RNNPaddingMode) error

SetPaddingMode sets the padding mode with flag passed

func (*RNND) SetProjectionLayers

func (r *RNND) SetProjectionLayers(
	handle *Handle,
	recProjsize int32,
	outProjSize int32,
) error

SetProjectionLayers sets the RNN projection layers

func (*RNND) SetRNNMatrixMathType

func (r *RNND) SetRNNMatrixMathType(math MathType) error

SetRNNMatrixMathType sets the math type for the descriptor

type RNNDataD

type RNNDataD struct {
	// contains filtered or unexported fields
}

RNNDataD is an RNNDataDescriptor

func CreateRNNDataD

func CreateRNNDataD() (*RNNDataD, error)

CreateRNNDataD creates an RNNDataD through cudnn's cudnnCreateRNNDataDescriptor. It is registered with the runtime for GC.

func (*RNNDataD) Destroy

func (r *RNNDataD) Destroy() error

Destroy destroys the descriptor unless the Go GC is being used, in which case it will just return nil

func (*RNNDataD) Get

func (r *RNNDataD) Get() (dtype DataType, layout RNNDataLayout, maxSeqLength, vectorsize int32, seqLengthArray []int32, paddingsymbol float64, err error)

Get gets the parameters used in Set for RNNDataD

func (*RNNDataD) Set

func (r *RNNDataD) Set(dtype DataType, layout RNNDataLayout,
	maxSeqLength, vectorsize int32, seqLengthArray []int32, paddingsymbol float64) error

Set sets the RNNDataD.

dtype - The datatype of the RNN data tensor. See cudnnDataType_t.

layout - The memory layout of the RNN data tensor.

maxSeqLength - The maximum sequence length within this RNN data tensor. In the unpacked (padded) layout, this should include the padding vectors in each sequence. In the packed (unpadded) layout, this should be equal to the greatest element in seqLengthArray.

vectorSize - The vector length (i.e. embedding size) of the input or output tensor at each timestep.

seqLengthArray - An integer array with batchSize number of elements. Describes the length (i.e. number of timesteps) of each sequence. Each element in seqLengthArray must be greater than 0 but less than or equal to maxSeqLength. In the packed layout, the elements should be sorted in descending order, similar to the layout required by the non-extended RNN compute functions.

paddingFill - gocudnn will auto-typecast the value into the correct datatype. Just pass the value you want used as a float64.

From Documentation:
A user-defined symbol for filling the padding position in RNN output.
This is only effective when the descriptor is describing the RNN output, and the unpacked layout is specified.
The symbol should be in the host memory, and is interpreted as the same data type as that of the RNN data tensor.
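
A hedged sketch of filling in an RNNDataD; the batch size (32), max sequence length (20), and vector size (64) are arbitrary numbers chosen for illustration.

rd, err := gocudnn.CreateRNNDataD()
if err != nil {
	panic(err)
}
var dtype gocudnn.DataType
var layout gocudnn.RNNDataLayout
seqLens := make([]int32, 32) // one length per sequence in the mini-batch
for i := range seqLens {
	seqLens[i] = 20
}
err = rd.Set(dtype.Float(), layout.SeqMajorUnPacked(), 20, 64, seqLens, 0.0)
if err != nil {
	panic(err)
}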

type RNNDataLayout

type RNNDataLayout C.cudnnRNNDataLayout_t

RNNDataLayout is a flag type for the RNN data layout

func (*RNNDataLayout) BatchMajorUnPacked

func (r *RNNDataLayout) BatchMajorUnPacked() RNNDataLayout

BatchMajorUnPacked sets r to CUDNN_RNN_DATA_LAYOUT_BATCH_MAJOR_UNPACKED flag

func (*RNNDataLayout) SeqMajorPacked

func (r *RNNDataLayout) SeqMajorPacked() RNNDataLayout

SeqMajorPacked sets r to CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_PACKED flag

func (*RNNDataLayout) SeqMajorUnPacked

func (r *RNNDataLayout) SeqMajorUnPacked() RNNDataLayout

SeqMajorUnPacked sets r to and returns CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_UNPACKED flag

func (RNNDataLayout) String

func (r RNNDataLayout) String() string

type RNNFlags

type RNNFlags struct {
	Mode      RNNmode
	Algo      RNNAlgo
	Direction DirectionMode
	Input     RNNInputMode
}

RNNFlags holds all the RNN flags

type RNNInputMode

type RNNInputMode C.cudnnRNNInputMode_t

RNNInputMode is used for flags and exposes the different flags through its methods

func (*RNNInputMode) Linear

func (r *RNNInputMode) Linear() RNNInputMode

Linear sets r to and returns RNNInputMode(C.CUDNN_LINEAR_INPUT)

func (*RNNInputMode) Skip

func (r *RNNInputMode) Skip() RNNInputMode

Skip sets r to and returns RNNInputMode(C.CUDNN_SKIP_INPUT)

func (RNNInputMode) String

func (r RNNInputMode) String() string

type RNNPaddingMode

type RNNPaddingMode C.cudnnRNNPaddingMode_t

RNNPaddingMode is the padding mode flag

func (*RNNPaddingMode) Disabled

func (r *RNNPaddingMode) Disabled() RNNPaddingMode

Disabled sets r to and returns RNNPaddingMode(C.CUDNN_RNN_PADDED_IO_DISABLED)

func (*RNNPaddingMode) Enabled

func (r *RNNPaddingMode) Enabled() RNNPaddingMode

Enabled sets r to and returns RNNPaddingMode(C.CUDNN_RNN_PADDED_IO_ENABLED)

func (RNNPaddingMode) String

func (r RNNPaddingMode) String() string

type RNNmode

type RNNmode C.cudnnRNNMode_t

RNNmode is a flag type; its methods set and return the available flags

func (*RNNmode) Gru

func (r *RNNmode) Gru() RNNmode

Gru sets r to and returns RNNmode(C.CUDNN_GRU)

func (*RNNmode) Lstm

func (r *RNNmode) Lstm() RNNmode

Lstm sets r to and returns RNNmode(C.CUDNN_LSTM)

func (*RNNmode) Relu

func (r *RNNmode) Relu() RNNmode

Relu sets r to and returns RNNmode(C.CUDNN_RNN_RELU)

func (RNNmode) String

func (r RNNmode) String() string

func (*RNNmode) Tanh

func (r *RNNmode) Tanh() RNNmode

Tanh sets r to and returns RNNmode(C.CUDNN_RNN_TANH)

type ReduceTensorD

type ReduceTensorD struct {
	// contains filtered or unexported fields
}

ReduceTensorD is the struct that is used for reduce tensor ops

func CreateReduceTensorDescriptor

func CreateReduceTensorDescriptor() (*ReduceTensorD, error)

CreateReduceTensorDescriptor creates an empty ReduceTensorD

func (*ReduceTensorD) Destroy

func (r *ReduceTensorD) Destroy() error

Destroy destroys the reducetensordescriptor

func (*ReduceTensorD) Get

func (r *ReduceTensorD) Get() (reduceop ReduceTensorOp,
	datatype DataType,
	nanprop NANProp,
	reducetensorinds ReduceTensorIndices,
	indicietype IndiciesType, err error)

Get returns the values that were set for r in Set

func (*ReduceTensorD) GetIndiciesSize

func (r *ReduceTensorD) GetIndiciesSize(
	handle *Handle,
	aDesc, cDesc *TensorD) (uint, error)

GetIndiciesSize is a helper function that returns the minimum size in bytes of the index space to be passed to the reduction, given the input and output tensors

func (*ReduceTensorD) GetWorkSpaceSize

func (r *ReduceTensorD) GetWorkSpaceSize(
	handle *Handle,
	aDesc, cDesc *TensorD) (uint, error)

GetWorkSpaceSize is a helper function that returns the minimum size in bytes of the workspace to be passed to the reduction, given the input and output tensors

func (*ReduceTensorD) ReduceTensorOp

func (r *ReduceTensorD) ReduceTensorOp(
	handle *Handle,
	indices cutil.Mem,
	indiciessize uint,
	wspace cutil.Mem,
	wspacesize uint,
	alpha float64,
	aDesc *TensorD,
	A cutil.Mem,
	beta float64,
	cDesc *TensorD,
	Ce cutil.Mem) error

ReduceTensorOp performs the tensor operation: C = reduceop(alpha * A) + beta * C

The NaN propagation enum applies to only the min and max reduce ops; the other reduce ops propagate NaN as usual.
The indices space is ignored for reduce ops other than min or max.
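
A hedged sketch of a reduce-sum of A into C. The handle, tensor descriptors, and device memory (A, C) are assumed to already exist, and the workspace comes from whatever allocator you use (a hypothetical CudaMemManager here).

rt, err := gocudnn.CreateReduceTensorDescriptor()
if err != nil {
	panic(err)
}
var op gocudnn.ReduceTensorOp
var dtype gocudnn.DataType
var nan gocudnn.NANProp        // zero value; only the min and max ops use NaN propagation
var inds gocudnn.ReduceTensorIndices
var itype gocudnn.IndiciesType // zero value; ignored with NoIndices
err = rt.Set(op.Add(), dtype.Float(), nan, inds.NoIndices(), itype)
if err != nil {
	panic(err)
}
wsize, err := rt.GetWorkSpaceSize(handle, aDesc, cDesc)
if err != nil {
	panic(err)
}
wspace, err := CudaMemManager.Malloc(wsize) // hypothetical allocator
if err != nil {
	panic(err)
}
err = rt.ReduceTensorOp(handle, nil, 0, wspace, wsize, 1.0, aDesc, A, 0.0, cDesc, C)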

func (*ReduceTensorD) ReduceTensorOpUS

func (r *ReduceTensorD) ReduceTensorOpUS(
	handle *Handle,
	indices unsafe.Pointer, indiciessize uint,
	wspace unsafe.Pointer, wspacesize uint,
	alpha float64,
	aDesc *TensorD, A unsafe.Pointer,
	beta float64,
	cDesc *TensorD, Ce unsafe.Pointer) error

ReduceTensorOpUS is like ReduceTensorOp but uses unsafe.Pointer instead of cutil.Mem

func (*ReduceTensorD) Set

func (r *ReduceTensorD) Set(reduceop ReduceTensorOp,
	datatype DataType,
	nanprop NANProp,
	reducetensorinds ReduceTensorIndices,
	indicietype IndiciesType) error

Set sets r with the values passed

func (*ReduceTensorD) String

func (r *ReduceTensorD) String() string

String satisfies the Stringer interface

type ReduceTensorIndices

type ReduceTensorIndices C.cudnnReduceTensorIndices_t

ReduceTensorIndices is a flag type whose flags are exposed through its methods

func (*ReduceTensorIndices) FlattenedIndicies

func (r *ReduceTensorIndices) FlattenedIndicies() ReduceTensorIndices

FlattenedIndicies sets r to and returns ReduceTensorIndices(C.CUDNN_REDUCE_TENSOR_FLATTENED_INDICES)

func (*ReduceTensorIndices) NoIndices

func (r *ReduceTensorIndices) NoIndices() ReduceTensorIndices

NoIndices sets r to and returns ReduceTensorIndices(C.CUDNN_REDUCE_TENSOR_NO_INDICES)

func (ReduceTensorIndices) String

func (r ReduceTensorIndices) String() string

String satisfies the Stringer interface

type ReduceTensorOp

type ReduceTensorOp C.cudnnReduceTensorOp_t

ReduceTensorOp is a flag type used by the reduce tensor functions

func (*ReduceTensorOp) Add

func (r *ReduceTensorOp) Add() ReduceTensorOp

Add sets r to and returns reduceTensorAdd flag

func (*ReduceTensorOp) Amax

func (r *ReduceTensorOp) Amax() ReduceTensorOp

Amax sets r to and returns reduceTensorAmax flag

func (*ReduceTensorOp) Avg

func (r *ReduceTensorOp) Avg() ReduceTensorOp

Avg sets r to and returns reduceTensorAvg flag

func (*ReduceTensorOp) Max

func (r *ReduceTensorOp) Max() ReduceTensorOp

Max sets r to and returns reduceTensorMax flag

func (*ReduceTensorOp) Min

func (r *ReduceTensorOp) Min() ReduceTensorOp

Min sets r to and returns reduceTensorMin flag

func (*ReduceTensorOp) Mul

func (r *ReduceTensorOp) Mul() ReduceTensorOp

Mul sets r to and returns reduceTensorMul flag

func (*ReduceTensorOp) MulNoZeros

func (r *ReduceTensorOp) MulNoZeros() ReduceTensorOp

MulNoZeros sets r to and returns reduceTensorMulNoZeros flag

func (*ReduceTensorOp) Norm1

func (r *ReduceTensorOp) Norm1() ReduceTensorOp

Norm1 sets r to and returns reduceTensorNorm1 flag

func (*ReduceTensorOp) Norm2

func (r *ReduceTensorOp) Norm2() ReduceTensorOp

Norm2 sets r to and returns reduceTensorNorm2 flag

func (ReduceTensorOp) String

func (r ReduceTensorOp) String() string

String satisfies the Stringer interface

type Reorder

type Reorder C.cudnnReorderType_t

Reorder is a flag that is changed through its methods

func (*Reorder) Default

func (r *Reorder) Default() Reorder

Default sets r to the default reorder flag, used for inference

func (*Reorder) NoReorder

func (r *Reorder) NoReorder() Reorder

NoReorder sets r to the no-reorder flag

func (Reorder) String

func (r Reorder) String() string

type RuntimeTag

type RuntimeTag C.cudnnRuntimeTag_t

RuntimeTag is a type that cudnn uses to check whether kernels are working correctly. It should be used with batch normalization.

type SamplerType

type SamplerType C.cudnnSamplerType_t

SamplerType is used for flags

func (*SamplerType) Bilinear

func (s *SamplerType) Bilinear() SamplerType

Bilinear sets s to SamplerType(C.CUDNN_SAMPLER_BILINEAR) and returns new value of s

func (SamplerType) String

func (s SamplerType) String() string

type SeqDataAxis

type SeqDataAxis C.cudnnSeqDataAxis_t

SeqDataAxis is a flag type whose methods set and return SeqDataAxis flags. Caution: the methods also change the value of the variable that calls them.

If you need a value for a case switch, assign the flag to another variable first and use that.
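
A small sketch of that caution (someAxis is a hypothetical SeqDataAxis from elsewhere):

var flg gocudnn.SeqDataAxis
timeAxis := flg.Time() // flg itself now holds the Time value too
switch someAxis {
case timeAxis:
	// handle the time axis
}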

func (*SeqDataAxis) Batch

func (s *SeqDataAxis) Batch() SeqDataAxis

Batch is the index in the batch. The method sets s to Batch and returns the Batch value.

func (*SeqDataAxis) Beam

func (s *SeqDataAxis) Beam() SeqDataAxis

Beam is the index in the beam. The method sets s to Beam and returns the Beam value.

func (*SeqDataAxis) Time

func (s *SeqDataAxis) Time() SeqDataAxis

Time is the index in time. The method sets s to Time and returns the Time value.

func (*SeqDataAxis) Vect

func (s *SeqDataAxis) Vect() SeqDataAxis

Vect is the index in the vector. The method sets s to Vect and returns the Vect value.

type SeqDataD

type SeqDataD struct {
	// contains filtered or unexported fields
}

SeqDataD holds C.cudnnSeqDataDescriptor_t

func CreateSeqDataDescriptor

func CreateSeqDataDescriptor() (*SeqDataD, error)

CreateSeqDataDescriptor creates a new SeqDataD

func (*SeqDataD) Destroy

func (s *SeqDataD) Destroy() error

Destroy will destroy the descriptor. For now, since everything is on the runtime, it will do nothing.

func (*SeqDataD) Get

func (s *SeqDataD) Get() (dtype DataType, dimsA []int32, axes []SeqDataAxis, seqLengthArray []int32, paddingfill float64, err error)

Get gets values used in setting up s

func (*SeqDataD) Set

func (s *SeqDataD) Set(dtype DataType, dimsA []int32, axes []SeqDataAxis, seqLengthArray []int32, paddingfill float64) error

Set - from reading the documentation, this seems to be how you set it up, along with the possible workaround in gocudnn.

len(dimsA) and len(axes) need to equal 4. len(seqLengthArray) needs to be < dimsA[(*seqDataAxis).Time()]

dimsA - contains the dims of the buffer that holds a batch of sequence samples. all vals need to be positive.

dimsA[(*seqDataAxis).Time()]=is the maximum allowed sequence length

dimsA[(*seqDataAxis).Batch()]= is the maximum allowed batch size

dimsA[(*seqDataAxis).Beam()]= is the number of beam in each sample

dimsA[(*seqDataAxis).Vect()]= is the vector length.

axes - the order in which the axes are laid out, from outermost to innermost. Kind of like an NCHW tensor, where N is the outermost and W is the innermost.

Example:

var s SeqDataAxis

axes:=[]SeqDataAxis{s.Batch(), s.Time(),s.Beam(),s.Vect()}

seqLengthArray - Array that holds the sequence length of each sequence.

paddingfill - Points to a value, of dataType, that is used to fill up the buffer beyond the sequence length of each sequence. The only supported value for paddingFill is 0. paddingfill is auto-converted to the datatype it needs inside the function.
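
A hedged sketch of setting a SeqDataD, reusing the axes ordering from the example above. The dims (batch 8, beam 1, time 20, vector 64) are arbitrary; per the doc above, dimsA is indexed by the SeqDataAxis flag values.

sd, err := gocudnn.CreateSeqDataDescriptor()
if err != nil {
	panic(err)
}
var s gocudnn.SeqDataAxis
axes := []gocudnn.SeqDataAxis{s.Batch(), s.Time(), s.Beam(), s.Vect()}
dimsA := make([]int32, 4)
dimsA[s.Time()] = 20 // maximum allowed sequence length
dimsA[s.Batch()] = 8 // maximum allowed batch size
dimsA[s.Beam()] = 1  // number of beams in each sample
dimsA[s.Vect()] = 64 // vector length
seqLens := make([]int32, 8) // one length per sequence
for i := range seqLens {
	seqLens[i] = 20
}
var dtype gocudnn.DataType
err = sd.Set(dtype.Float(), dimsA, axes, seqLens, 0.0)
if err != nil {
	panic(err)
}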

type SoftMaxAlgorithm

type SoftMaxAlgorithm C.cudnnSoftmaxAlgorithm_t

SoftMaxAlgorithm is used for flags, which are exposed through its methods

func (*SoftMaxAlgorithm) Accurate

func (s *SoftMaxAlgorithm) Accurate() SoftMaxAlgorithm

Accurate changes s to and returns SoftMaxAlgorithm(C.CUDNN_SOFTMAX_ACCURATE)

func (*SoftMaxAlgorithm) Fast

func (s *SoftMaxAlgorithm) Fast() SoftMaxAlgorithm

Fast changes s to and returns SoftMaxAlgorithm(C.CUDNN_SOFTMAX_FAST)

func (*SoftMaxAlgorithm) Log

func (s *SoftMaxAlgorithm) Log() SoftMaxAlgorithm

Log changes s to and returns SoftMaxAlgorithm(C.CUDNN_SOFTMAX_LOG)

func (SoftMaxAlgorithm) String

func (s SoftMaxAlgorithm) String() string

type SoftMaxD

type SoftMaxD struct {
	// contains filtered or unexported fields
}

SoftMaxD holds the soft max flags and soft max funcs

func CreateSoftMaxDescriptor

func CreateSoftMaxDescriptor() *SoftMaxD

CreateSoftMaxDescriptor creates a gocudnn softmax descriptor. It is not part of cudnn, but I wanted to make the library a little more streamlined after using it for a while.

func (*SoftMaxD) Backward

func (s *SoftMaxD) Backward(
	handle *Handle,
	alpha float64,
	yD *TensorD, y cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	beta float64,
	dxD *TensorD, dx cutil.Mem,
) error

Backward performs the backward softmax

Input/Output: dx

func (*SoftMaxD) BackwardUS

func (s *SoftMaxD) BackwardUS(
	handle *Handle,
	alpha float64,
	yD *TensorD, y unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer,
) error

BackwardUS is like Backward but uses unsafe.Pointer instead of cutil.Mem

func (*SoftMaxD) Forward

func (s *SoftMaxD) Forward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem) error

Forward performs forward softmax

Input/Output: y

func (*SoftMaxD) ForwardUS

func (s *SoftMaxD) ForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	yD *TensorD, y unsafe.Pointer) error

ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem

func (*SoftMaxD) Get

func (s *SoftMaxD) Get() (algo SoftMaxAlgorithm, mode SoftMaxMode, err error)

Get gets the softmax descriptor values

func (*SoftMaxD) Set

func (s *SoftMaxD) Set(algo SoftMaxAlgorithm, mode SoftMaxMode) error

Set sets the softmax algorithm and mode.
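
A hedged sketch of a forward softmax over the channel dimension. The handle, tensor descriptors, and device memory (x, y) are assumed to already exist.

sm := gocudnn.CreateSoftMaxDescriptor()
var algo gocudnn.SoftMaxAlgorithm
var mode gocudnn.SoftMaxMode
err := sm.Set(algo.Accurate(), mode.Channel())
if err != nil {
	panic(err)
}
// y = softmax(x); alpha and beta blend as in the other cudnn ops
err = sm.Forward(handle, 1.0, xD, x, 0.0, yD, y)
if err != nil {
	panic(err)
}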

func (*SoftMaxD) String

func (s *SoftMaxD) String() string

type SoftMaxMode

type SoftMaxMode C.cudnnSoftmaxMode_t

SoftMaxMode is used for softmax mode flags, which are exposed through its methods

func (*SoftMaxMode) Channel

func (s *SoftMaxMode) Channel() SoftMaxMode

Channel changes s to SoftMaxMode(C.CUDNN_SOFTMAX_MODE_CHANNEL) and returns changed value

func (*SoftMaxMode) Instance

func (s *SoftMaxMode) Instance() SoftMaxMode

Instance changes s to SoftMaxMode(C.CUDNN_SOFTMAX_MODE_INSTANCE) and returns changed value

func (SoftMaxMode) String

func (s SoftMaxMode) String() string

type SpatialTransformerD

type SpatialTransformerD struct {
	// contains filtered or unexported fields
}

SpatialTransformerD holds the spatial transformer descriptor

func CreateSpatialTransformerDescriptor

func CreateSpatialTransformerDescriptor() (*SpatialTransformerD, error)

CreateSpatialTransformerDescriptor creates the spatial transformer descriptor

func (*SpatialTransformerD) Destroy

func (s *SpatialTransformerD) Destroy() error

Destroy destroys the spatial transformer descriptor. If GC is enabled this function won't delete the transformer; it will only return nil. Since GC is automatically enabled, this function is not functional.

func (*SpatialTransformerD) GridGeneratorBackward

func (s *SpatialTransformerD) GridGeneratorBackward(
	handle *Handle,
	grid cutil.Mem,
	theta cutil.Mem,
) error

GridGeneratorBackward computes the gradient of a grid generation operation.

func (*SpatialTransformerD) GridGeneratorBackwardUS

func (s *SpatialTransformerD) GridGeneratorBackwardUS(
	handle *Handle,
	grid unsafe.Pointer,
	theta unsafe.Pointer,
) error

GridGeneratorBackwardUS is like GridGeneratorBackward but uses unsafe.Pointer instead of cutil.Mem

func (*SpatialTransformerD) GridGeneratorForward

func (s *SpatialTransformerD) GridGeneratorForward(
	handle *Handle,
	theta cutil.Mem,
	grid cutil.Mem,

) error

GridGeneratorForward generates a grid of coordinates in the input tensor corresponding to each pixel from the output tensor.

func (*SpatialTransformerD) GridGeneratorForwardUS

func (s *SpatialTransformerD) GridGeneratorForwardUS(
	handle *Handle,
	theta unsafe.Pointer,
	grid unsafe.Pointer,

) error

GridGeneratorForwardUS is like GridGeneratorForward but uses unsafe.Pointer instead of cutil.Mem

func (*SpatialTransformerD) SamplerBackward

func (s *SpatialTransformerD) SamplerBackward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	dxD *TensorD, dx cutil.Mem,
	alphaDgrid float64,
	dyD *TensorD, dy cutil.Mem,
	grid cutil.Mem,
	betaDgrid float64,
	dGrid cutil.Mem,
) error

SamplerBackward performs the spatial transformer sampler backward operation

func (*SpatialTransformerD) SamplerBackwardUS

func (s *SpatialTransformerD) SamplerBackwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer,
	alphaDgrid float64,
	dyD *TensorD, dy unsafe.Pointer,
	grid unsafe.Pointer,
	betaDgrid float64,
	dGrid unsafe.Pointer,
) error

SamplerBackwardUS is like SamplerBackward but uses unsafe.Pointer instead of cutil.Mem

func (*SpatialTransformerD) SamplerForward

func (s *SpatialTransformerD) SamplerForward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	grid cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem,
) error

SamplerForward performs the spatial transformer sampler forward operation

func (*SpatialTransformerD) SamplerForwardUS

func (s *SpatialTransformerD) SamplerForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	grid unsafe.Pointer,
	beta float64,
	yD *TensorD, y unsafe.Pointer,
) error

SamplerForwardUS is like SamplerForward but uses unsafe.Pointer instead of cutil.Mem

func (*SpatialTransformerD) Set

func (s *SpatialTransformerD) Set(sampler SamplerType, data DataType, dimA []int32) error

Set sets the spatial transformer to an nd descriptor.
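
A hedged sketch of configuring the descriptor with a bilinear sampler; the dims ({N, C, H, W} = {16, 3, 64, 64}) are arbitrary.

st, err := gocudnn.CreateSpatialTransformerDescriptor()
if err != nil {
	panic(err)
}
var sampler gocudnn.SamplerType
var dtype gocudnn.DataType
err = st.Set(sampler.Bilinear(), dtype.Float(), []int32{16, 3, 64, 64})
if err != nil {
	panic(err)
}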

type Status

type Status C.cudnnStatus_t

Status is the status returned by cudnn

const StatusSuccess Status = 0

StatusSuccess is the zero (success) value of Status. None of the other status flags are visible for now; they surface through the (Status).Error() method.

func WrapErrorWithStatus

func WrapErrorWithStatus(e error) (Status, error)

WrapErrorWithStatus - if the error string contains a cudnnStatus_t string, it will return the matching Status and a nil error. If it doesn't, the Status will be the flag for CUDNN_STATUS_RUNTIME_FP_OVERFLOW and the returned error will not be nil.
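
A small hedged sketch of recovering a Status from an error returned by some cudnn call (someCudnnCall is hypothetical):

if err := someCudnnCall(); err != nil {
	status, werr := gocudnn.WrapErrorWithStatus(err)
	if werr == nil {
		fmt.Println("cudnn status:", status.String())
	}
}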

func (Status) Error

func (status Status) Error(comment string) error

func (Status) String

func (status Status) String() string

String returns a human readable message

type TensorD

type TensorD struct {
	// contains filtered or unexported fields
}

TensorD holds the cudnnTensorDescriptor, which is basically the tensor itself

Example

ExampleTensorD shows how to make a tensor

package main

import (
	"fmt"
	"runtime"

	"github.com/dereklstinson/gocudnn/gocu"

	"github.com/dereklstinson/gocudnn/cudart"

	gocudnn "github.com/dereklstinson/gocudnn"
)

func main() {
	//Need to lock os thread.
	runtime.LockOSThread()
	check := func(e error) {
		if e != nil {
			panic(e)
		}
	}
	//Creating a blocking stream
	cs, err := cudart.CreateBlockingStream()
	check(err)
	//Create Device
	dev := cudart.CreateDevice(1)

	//Make an Allocator
	worker := gocu.NewWorker(dev)
	CudaMemManager, err := cudart.CreateMemManager(worker) // Check out the cudart package for more about streams and memory managers
	check(err)

	//Tensor
	var tflg gocudnn.TensorFormat //Flag for tensor
	var dtflg gocudnn.DataType    //Flag for tensor

	xD, err := gocudnn.CreateTensorDescriptor()
	check(err)

	// Setting Tensor
	err = xD.Set(tflg.NCHW(), dtflg.Float(), []int32{20, 1, 1, 1}, nil)
	check(err)

	//Gets SIB for tensor memory on device
	xSIB, err := xD.GetSizeInBytes()
	check(err)

	//Allocating memory to device and returning pointer to device memory
	x, err := CudaMemManager.Malloc(xSIB)
	check(err)

	//Create some host mem to copy to cuda memory
	hostmem := make([]float32, xSIB/4)
	//You can fill it
	for i := range hostmem {
		hostmem[i] = float32(i)
	}
	//Convert the slice to GoMem
	hostptr, err := gocu.MakeGoMem(hostmem)
	check(err)

	//Copy hostmem to x
	err = CudaMemManager.Copy(x, hostptr, xSIB) // This allocator syncs the cuda stream after every copy.
	check(err)
	// You can make your own custom allocator. This was a default one
	// to help others get going. Some "extra" functions beyond the api
	// require an allocator.

	//if not using an allocator sync the stream before changing the host mem right after a mem copy.  It could cause problems.
	err = cs.Sync()
	check(err)

	//Zero out the golang host mem.
	for i := range hostmem {
		hostmem[i] = float32(0)
	}

	//do some tensor stuff can return vals to host mem by doing another copy
	err = CudaMemManager.Copy(hostptr, x, xSIB)

	check(err)
	fmt.Println(hostmem)
}
Output:

[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]

func CreateTensorDescriptor

func CreateTensorDescriptor() (*TensorD, error)

CreateTensorDescriptor creates an empty tensor descriptor

func (*TensorD) DataType

func (t *TensorD) DataType() DataType

DataType returns the datatype of the tensor

func (*TensorD) Destroy

func (t *TensorD) Destroy() error

Destroy destroys the tensor. In the future I am going to add a GC setting that will enable or disable the GC. When the GC is disabled, the user will have more control over memory. Right now it does nothing and returns nil.

func (*TensorD) Dims

func (t *TensorD) Dims() []int32

Dims returns the shape of the tensor

func (*TensorD) Format

func (t *TensorD) Format() TensorFormat

Format returns the tensor format

func (*TensorD) Get

func (t *TensorD) Get() (frmt TensorFormat, dtype DataType, shape []int32, stride []int32, err error)

Get returns the format, data type, shape (dims), and stride, plus an error. For descriptors set without a stride it will still return (junk) stride info, so be mindful when you code.

func (*TensorD) GetSizeInBytes

func (t *TensorD) GetSizeInBytes() (uint, error)

GetSizeInBytes returns the size in bytes and an error

func (*TensorD) Set

func (t *TensorD) Set(frmt TensorFormat, data DataType, shape, stride []int32) error

Set sets the tensor according to the values passed. This is all different than how cudnn does it. In cudnn, stride dictates the format of the tensor. Here it is different: if the format is Unknown then the strides will dictate the format. If NHWC is chosen then gocudnn will swap things around to make TensorD behave more like FilterD.

	Basic 4D formats:

	NCHW:

		  shape[0] = # of batches
		  shape[1] = # of channels
		  shape[2] = height
		  shape[3] = width

	NHWC:

		  shape[0] = # of batches
		  shape[1] = height
		  shape[2] = width
		  shape[3] = # of channels

	Strided:

	Strided is kind of hard to explain, so here is an example of how values would be placed.
	n, c, h, w := 3, 3, 256, 256   // Here is a batch of 3 rgb images of size 256x256
	dims := []int{n, c, h, w}      // Here we have the dims set.
	chw := c * h * w
	hw := h * w
	stride := []int{chw, hw, w, 1} // This is how stride is usually set.
	// If you wanted to get or place a value at a certain location:
	// func GetValue(tensor []float32, location, stride [4]int) float32 {
	//     l, s := location, stride
	//     return tensor[l[0]*s[0]+l[1]*s[1]+l[2]*s[2]+l[3]*s[3]] // The stride changes where you look in the tensor.
	// }

	Notes:

	1) The total size of a tensor including the potential padding between dimensions is limited to 2 Giga-elements of type datatype.
	   Tensors are restricted to having at least 4 dimensions, and at most DimMax (a const with val of 8 at the time of writing this) dimensions.
	   When working with lower dimensional data, it is recommended that the user create a 4D tensor, and set the size along unused dimensions to 1.
	2) Stride is ignored unless frmt is set to frmt.Strided(), so for the other formats it can be set to nil (as in the Example above).
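
A hedged sketch of the strided path, assuming (per the notes above) that the Strided() flag makes gocudnn take the layout from the stride slice; xD is an already-created *TensorD.

var frmt gocudnn.TensorFormat
var dtype gocudnn.DataType
n, c, h, w := int32(3), int32(3), int32(256), int32(256)
shape := []int32{n, c, h, w}
stride := []int32{c * h * w, h * w, w, 1}
err := xD.Set(frmt.Strided(), dtype.Float(), shape, stride)
if err != nil {
	panic(err)
}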

func (*TensorD) String

func (t *TensorD) String() string

type TensorFormat

type TensorFormat C.cudnnTensorFormat_t

TensorFormat is the type used for flags to set the tensor format. The type contains methods that change the value of the type. Caution: the methods will also change the value of the variable that calls the method.

If you need a value for a case switch, assign the flag to another variable first and use that. Look at String.

Semi-custom gocudnn flags. NCHW, NHWC, and NCHWvectC come from cudnn; gocudnn adds Strided and Unknown.

Reasoning:

Strided - When the tensor is set with strides, there is no TensorFormat flag passed. Also, cudnnGetTensor4dDescriptor and cudnnGetTensorNdDescriptor don't return the tensor format, which is really annoying. gocudnn hides this flag in TensorD so that the format can be returned with the tensor.

Unknown - Was made because, at least with the new AttentionD in cudnn V7.5, cudnn will make a descriptor for you, and I don't know what the tensor format will be. So instead of assuming, it is marked with this flag.

func (*TensorFormat) NCHW

func (t *TensorFormat) NCHW() TensorFormat

NCHW returns TensorFormat(C.CUDNN_TENSOR_NCHW). The method sets the type and returns the new value.

func (*TensorFormat) NCHWvectC

func (t *TensorFormat) NCHWvectC() TensorFormat

NCHWvectC returns TensorFormat(C.CUDNN_TENSOR_NCHW_VECT_C). The method sets the type and returns the new value.

func (*TensorFormat) NHWC

func (t *TensorFormat) NHWC() TensorFormat

NHWC returns TensorFormat(C.CUDNN_TENSOR_NHWC). The method sets the type and returns the new value.

func (TensorFormat) String

func (t TensorFormat) String() string

String will return a human readable string that can be printed for debugging.

func (*TensorFormat) Unknown

func (t *TensorFormat) Unknown() TensorFormat

Unknown returns TensorFormat(128). This is a custom gocudnn flag; read the TensorFormat notes for an explanation. The method sets the type and returns the new value.

type TransformD

type TransformD struct {
	// contains filtered or unexported fields
}

TransformD holds the transform tensor descriptor

func CreateTransformDescriptor

func CreateTransformDescriptor() (*TransformD, error)

CreateTransformDescriptor creates a transform descriptor

Needs to be Set with Set method.

func (*TransformD) Destroy

func (t *TransformD) Destroy() error

Destroy will destroy the transform descriptor if the GC is not being used; if the GC is used then it will do nothing

func (*TransformD) Get

func (t *TransformD) Get() (destFormat TensorFormat, padBefore, padAfter []int32, foldA []uint32, direction FoldingDirection, err error)

Get gets the values of the transform descriptor

func (*TransformD) InitDest

func (t *TransformD) InitDest(src *TensorD) (dest *TensorD, destsib uint, err error)

InitDest initializes and returns a destination tensor descriptor dest for tensor transform operations, along with its size in bytes. The initialization is done with the desired parameters described in the transform descriptor t. Note: the returned tensor descriptor will be packed.

func (*TransformD) Set

func (t *TransformD) Set(nbDims uint32, destFormat TensorFormat, padBefore, padAfter []int32, foldA []uint32, direction FoldingDirection) error

Set sets the TransformD

padBefore, padAfter, and foldA can be nil if not using any of them. The custom gocudnn-added flags for TensorFormat will cause an error.
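
A hedged sketch of a layout transform to NHWC. The handle, source descriptor srcD, and device pointer src are assumed to exist; destination memory comes from a hypothetical allocator.

td, err := gocudnn.CreateTransformDescriptor()
if err != nil {
	panic(err)
}
var frmt gocudnn.TensorFormat
var dir gocudnn.FoldingDirection // zero value; set through its methods if folding
err = td.Set(4, frmt.NHWC(), nil, nil, nil, dir)
if err != nil {
	panic(err)
}
destD, destSIB, err := td.InitDest(srcD)
if err != nil {
	panic(err)
}
dest, err := CudaMemManager.Malloc(destSIB) // hypothetical allocator
if err != nil {
	panic(err)
}
err = td.TransformTensor(handle, 1.0, srcD, src, 0.0, destD, dest)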

func (*TransformD) String

func (t *TransformD) String() string

func (*TransformD) TransformFilter

func (t *TransformD) TransformFilter(h *Handle, alpha float64, srcD *FilterD, src cutil.Mem, beta float64, destD *FilterD, dest cutil.Mem) error

TransformFilter performs the transform on a filter

func (*TransformD) TransformTensor

func (t *TransformD) TransformTensor(h *Handle, alpha float64, srcD *TensorD, src cutil.Mem, beta float64, destD *TensorD, dest cutil.Mem) error

TransformTensor transforms a tensor according to how TransformD was set

func (*TransformD) TransformTensorUS

func (t *TransformD) TransformTensorUS(h *Handle, alpha float64, srcD *TensorD, src unsafe.Pointer, beta float64, destD *TensorD, dest unsafe.Pointer) error

TransformTensorUS is like TransformTensor but uses unsafe.Pointer instead of cutil.Mem

type WgradMode

type WgradMode C.cudnnWgradMode_t

WgradMode is used for flags and can be changed through methods

func (*WgradMode) Add

func (w *WgradMode) Add() WgradMode

Add sets w to Add and returns Add flag

func (*WgradMode) Set

func (w *WgradMode) Set() WgradMode

Set sets w to Set and returns Set flag

Directories

Path Synopsis
Package cublas - Blas functions for cuda gpus.
crtutil
Package crtutil allows cudart to work with Go's io Reader and Writer interfaces.
Package gocu contains common interfaces to allow the different cuda packages/libraries to intermix with each other and with go.
Package xtra is just some functions that use cuda and kernels to make functions that I use that are useful in deep learning.
