Documentation ¶
Index ¶
- Constants
- func AddTensor(h *Handle, alpha float64, aD *TensorD, A cutil.Mem, beta float64, cD *TensorD, ...) error
- func AddTensorUS(h *Handle, alpha float64, aD *TensorD, A unsafe.Pointer, beta float64, ...) error
- func DebugMode()
- func FindLength(s uint, dtype DataType) uint32
- func FindSizeTfromVol(volume []int32, dtype DataType) uint
- func GetBindingVersion() (major, minor, patch int32)
- func GetCudaartVersion() uint
- func GetFoldedConvBackwardDataDescriptors(h *Handle, filter *FilterD, diff *TensorD, conv *ConvolutionD, grad *TensorD, ...) (foldedfilter *FilterD, paddeddiff *TensorD, foldedConv *ConvolutionD, ...)
- func GetLibraryVersion() (major, minor, patch int32, err error)
- func GetStringer(tD *TensorD, t cutil.Pointer) (fmt.Stringer, error)
- func GetVersion() uint
- func ScaleTensor(h *Handle, yD *TensorD, y cutil.Mem, alpha float64) error
- func ScaleTensorUS(h *Handle, yD *TensorD, y unsafe.Pointer, alpha float64) error
- func SetCallBack(udata fmt.Stringer, w io.Writer) error
- func SetTensor(h *Handle, yD *TensorD, y cutil.Mem, v float64) error
- func SetTensorUS(h *Handle, yD *TensorD, y unsafe.Pointer, v float64) error
- func TransformTensor(h *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, yD *TensorD, ...) error
- func TransformTensorUS(h *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, ...) error
- type ActivationD
- func (a *ActivationD) Backward(handle *Handle, alpha float64, yD *TensorD, y cutil.Mem, dyD *TensorD, ...) error
- func (a *ActivationD) BackwardUS(handle *Handle, alpha float64, yD *TensorD, y unsafe.Pointer, dyD *TensorD, ...) error
- func (a *ActivationD) Destroy() error
- func (a *ActivationD) Forward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, ...) error
- func (a *ActivationD) ForwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, ...) error
- func (a *ActivationD) Get() (mode ActivationMode, nan NANProp, coef float64, err error)
- func (a *ActivationD) Set(mode ActivationMode, nan NANProp, coef float64) error
- func (a *ActivationD) String() string
- type ActivationMode
- func (a *ActivationMode) ClippedRelu() ActivationMode
- func (a *ActivationMode) Elu() ActivationMode
- func (a *ActivationMode) Identity() ActivationMode
- func (a *ActivationMode) Relu() ActivationMode
- func (a *ActivationMode) Sigmoid() ActivationMode
- func (a ActivationMode) String() string
- func (a *ActivationMode) Tanh() ActivationMode
- type Algorithm
- type AlgorithmD
- func (a *AlgorithmD) Copy() (*AlgorithmD, error)
- func (a *AlgorithmD) Destroy() error
- func (a *AlgorithmD) Get() (Algorithm, error)
- func (a *AlgorithmD) GetAlgorithmSpaceSize(handle *Handle) (uint, error)
- func (a *AlgorithmD) RestoreAlgorithm(handle *Handle, algoSpace cutil.Mem, sizeinbytes uint) error
- func (a *AlgorithmD) SaveAlgorithm(handle *Handle, algoSpace cutil.Mem, sizeinbytes uint) error
- func (a *AlgorithmD) Set(algo Algorithm) error
- type AlgorithmPerformance
- type AttentionD
- func (a *AttentionD) BackwardData(h *Handle, loWinIdx []int32, hiWinIdx []int32, seqLengthArrayDQDO []int32, ...) error
- func (a *AttentionD) BackwardDataUS(h *Handle, loWinIdx []int32, hiWinIdx []int32, seqLengthArrayDQDO []int32, ...) error
- func (a *AttentionD) BackwardWeights(h *Handle, wgmode WgradMode, qDesc *SeqDataD, queries cutil.Mem, ...) error
- func (a *AttentionD) BackwardWeightsUS(h *Handle, wgmode WgradMode, qDesc *SeqDataD, queries unsafe.Pointer, ...) error
- func (a *AttentionD) Destroy() error
- func (a *AttentionD) Forward(h *Handle, currIdx int32, loWinIdx []int32, hiWinIdx []int32, ...) error
- func (a *AttentionD) ForwardUS(h *Handle, currIdx int32, loWinIdx []int32, hiWinIdx []int32, ...) error
- func (a *AttentionD) Get() (qMap AttnQueryMap, nHead int32, smScaler float64, dtype DataType, ...)
- func (a *AttentionD) GetMultiHeadAttnWeights(h *Handle, wkind MultiHeadAttnWeightKind, wbuffSIB uint, wbuff cutil.Mem) (wD *TensorD, w cutil.Mem, err error)
- func (a *AttentionD) GetMultiHeadBuffers(h *Handle) (weightbuffSIB, wspaceSIB, rspaceSIB uint, err error)
- func (a *AttentionD) Set(qMap AttnQueryMap, nHead int32, smScaler float64, dtype DataType, ...) error
- type AttnQueryMap
- type BatchNormD
- func (b *BatchNormD) Backward(handle *Handle, alphadata, betadata, alphaparam, betaparam float64, ...) error
- func (b *BatchNormD) BackwardUS(handle *Handle, alphadata, betadata, alphaparam, betaparam float64, ...) error
- func (b *BatchNormD) DeriveBNTensorDescriptor(xDesc *TensorD) (bndesc *TensorD, err error)
- func (b *BatchNormD) ForwardInference(handle *Handle, alpha, beta float64, xD *TensorD, x cutil.Mem, yD *TensorD, ...) error
- func (b *BatchNormD) ForwardInferenceUS(handle *Handle, alpha, beta float64, xD *TensorD, x unsafe.Pointer, ...) error
- func (b *BatchNormD) ForwardTraining(handle *Handle, alpha float64, beta float64, xD *TensorD, x cutil.Mem, ...) error
- func (b *BatchNormD) ForwardTrainingUS(handle *Handle, alpha float64, beta float64, xD *TensorD, x unsafe.Pointer, ...) error
- func (b *BatchNormD) Get() (mode BatchNormMode, err error)
- func (b *BatchNormD) MinEpsilon() float64
- func (b *BatchNormD) Set(mode BatchNormMode) error
- func (b *BatchNormD) String() string
- type BatchNormDEx
- func (b *BatchNormDEx) Backward(h *Handle, alphadata, betadata, alphaparam, betaparam float64, xD *TensorD, ...) error
- func (b *BatchNormDEx) BackwardUS(h *Handle, alphadata, betadata, alphaparam, betaparam float64, xD *TensorD, ...) error
- func (b *BatchNormDEx) DeriveBNTensorDescriptor(xDesc *TensorD) (bndesc *TensorD, err error)
- func (b *BatchNormDEx) ForwardInference(handle *Handle, alpha, beta float64, xD *TensorD, x cutil.Mem, yD *TensorD, ...) error
- func (b *BatchNormDEx) ForwardInferenceUS(handle *Handle, alpha, beta float64, xD *TensorD, x unsafe.Pointer, ...) error
- func (b *BatchNormDEx) ForwardTraining(h *Handle, alpha, beta float64, xD *TensorD, x cutil.Mem, zD *TensorD, ...) error
- func (b *BatchNormDEx) ForwardTrainingUS(h *Handle, alpha, beta float64, xD *TensorD, x unsafe.Pointer, zD *TensorD, ...) error
- func (b *BatchNormDEx) GeBackwardWorkspaceSize(h *Handle, xD, yD, dyD, dzD, dxD, dbnScaleBiasMeanVarDesc *TensorD, ...) (wspaceSIB uint, err error)
- func (b *BatchNormDEx) Get() (mode BatchNormMode, op BatchNormOps, err error)
- func (b *BatchNormDEx) GetForwardTrainingWorkspaceSize(h *Handle, mode BatchNormMode, op BatchNormOps, ...) (wspaceSIB uint, err error)
- func (b *BatchNormDEx) GetTrainingReserveSpaceSize(h *Handle, actD *ActivationD, xD *TensorD) (rspaceSIB uint, err error)
- func (b *BatchNormDEx) MinEpsilon() float64
- func (b *BatchNormDEx) Set(mode BatchNormMode, op BatchNormOps) error
- func (b *BatchNormDEx) String() string
- type BatchNormMode
- type BatchNormOps
- type CTCLossAlgo
- type CTCLossD
- func (c *CTCLossD) CTCLoss(handle *Handle, probsD *TensorD, probs cutil.Mem, labels []int32, ...) error
- func (c *CTCLossD) CTCLossUS(handle *Handle, probsD *TensorD, probs unsafe.Pointer, labels []int32, ...) error
- func (c *CTCLossD) Destroy() error
- func (c *CTCLossD) Get() (DataType, error)
- func (c *CTCLossD) GetWorkspaceSize(handle *Handle, probsD *TensorD, gradientsD *TensorD, labels []int32, ...) (uint, error)
- func (c *CTCLossD) Set(data DataType) error
- type ConvBwdDataAlgo
- func (c ConvBwdDataAlgo) Algo() Algorithm
- func (c *ConvBwdDataAlgo) Algo0() ConvBwdDataAlgo
- func (c *ConvBwdDataAlgo) Algo1() ConvBwdDataAlgo
- func (c *ConvBwdDataAlgo) Count() ConvBwdDataAlgo
- func (c *ConvBwdDataAlgo) FFT() ConvBwdDataAlgo
- func (c *ConvBwdDataAlgo) FFTTiling() ConvBwdDataAlgo
- func (c ConvBwdDataAlgo) String() string
- func (c *ConvBwdDataAlgo) Winograd() ConvBwdDataAlgo
- func (c *ConvBwdDataAlgo) WinogradNonFused() ConvBwdDataAlgo
- type ConvBwdDataAlgoPerformance
- type ConvBwdDataPref
- type ConvBwdFiltAlgo
- func (c ConvBwdFiltAlgo) Algo() Algorithm
- func (c *ConvBwdFiltAlgo) Algo0() ConvBwdFiltAlgo
- func (c *ConvBwdFiltAlgo) Algo1() ConvBwdFiltAlgo
- func (c *ConvBwdFiltAlgo) Algo3() ConvBwdFiltAlgo
- func (c *ConvBwdFiltAlgo) Count() ConvBwdFiltAlgo
- func (c *ConvBwdFiltAlgo) FFT() ConvBwdFiltAlgo
- func (c *ConvBwdFiltAlgo) FFTTiling() ConvBwdFiltAlgo
- func (c ConvBwdFiltAlgo) String() string
- func (c *ConvBwdFiltAlgo) Winograd() ConvBwdFiltAlgo
- func (c *ConvBwdFiltAlgo) WinogradNonFused() ConvBwdFiltAlgo
- type ConvBwdFiltAlgoPerformance
- type ConvBwdFilterPref
- type ConvFwdAlgo
- func (c ConvFwdAlgo) Algo() Algorithm
- func (c *ConvFwdAlgo) Count() ConvFwdAlgo
- func (c *ConvFwdAlgo) Direct() ConvFwdAlgo
- func (c *ConvFwdAlgo) FFT() ConvFwdAlgo
- func (c *ConvFwdAlgo) FFTTiling() ConvFwdAlgo
- func (c *ConvFwdAlgo) Gemm() ConvFwdAlgo
- func (c *ConvFwdAlgo) ImplicitGemm() ConvFwdAlgo
- func (c *ConvFwdAlgo) ImplicitPrecompGemm() ConvFwdAlgo
- func (c ConvFwdAlgo) String() string
- func (c *ConvFwdAlgo) WinoGrad() ConvFwdAlgo
- func (c *ConvFwdAlgo) WinoGradNonFused() ConvFwdAlgo
- type ConvFwdAlgoPerformance
- type ConvolutionD
- func (c *ConvolutionD) BackwardBias(handle *Handle, alpha float64, dyD *TensorD, dy cutil.Mem, beta float64, ...) error
- func (c *ConvolutionD) BackwardBiasUS(handle *Handle, alpha float64, dyD *TensorD, dy unsafe.Pointer, beta float64, ...) error
- func (c *ConvolutionD) BackwardData(handle *Handle, alpha float64, wD *FilterD, w cutil.Mem, dyD *TensorD, ...) error
- func (c *ConvolutionD) BackwardDataUS(handle *Handle, alpha float64, wD *FilterD, w unsafe.Pointer, dyD *TensorD, ...) error
- func (c *ConvolutionD) BackwardFilter(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, dyD *TensorD, ...) error
- func (c *ConvolutionD) BackwardFilterUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, dyD *TensorD, ...) error
- func (c *ConvolutionD) BiasActivationForward(handle *Handle, alpha1 float64, xD *TensorD, x cutil.Mem, wD *FilterD, ...) error
- func (c *ConvolutionD) BiasActivationForwardUS(handle *Handle, alpha1 float64, xD *TensorD, x unsafe.Pointer, wD *FilterD, ...) error
- func (c *ConvolutionD) Destroy() error
- func (c *ConvolutionD) FindBackwardDataAlgorithm(handle *Handle, w *FilterD, dy *TensorD, dx *TensorD) ([]ConvBwdDataAlgoPerformance, error)
- func (c *ConvolutionD) FindBackwardDataAlgorithmEx(handle *Handle, wD *FilterD, w cutil.Mem, dyD *TensorD, dy cutil.Mem, ...) ([]ConvBwdDataAlgoPerformance, error)
- func (c *ConvolutionD) FindBackwardDataAlgorithmExUS(handle *Handle, wD *FilterD, w unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, ...) ([]ConvBwdDataAlgoPerformance, error)
- func (c *ConvolutionD) FindBackwardFilterAlgorithm(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD) ([]ConvBwdFiltAlgoPerformance, error)
- func (c *ConvolutionD) FindBackwardFilterAlgorithmEx(handle *Handle, xD *TensorD, x cutil.Mem, dyD *TensorD, dy cutil.Mem, ...) ([]ConvBwdFiltAlgoPerformance, error)
- func (c *ConvolutionD) FindBackwardFilterAlgorithmExUS(handle *Handle, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, ...) ([]ConvBwdFiltAlgoPerformance, error)
- func (c *ConvolutionD) FindForwardAlgorithm(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD) ([]ConvFwdAlgoPerformance, error)
- func (c *ConvolutionD) FindForwardAlgorithmEx(handle *Handle, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, ...) ([]ConvFwdAlgoPerformance, error)
- func (c *ConvolutionD) FindForwardAlgorithmExUS(handle *Handle, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, ...) ([]ConvFwdAlgoPerformance, error)
- func (c *ConvolutionD) Forward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, wD *FilterD, ...) error
- func (c *ConvolutionD) ForwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, wD *FilterD, ...) error
- func (c *ConvolutionD) Get() (mode ConvolutionMode, data DataType, pad []int32, stride []int32, ...)
- func (c *ConvolutionD) GetBackwardDataAlgorithm(handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, pref ConvBwdDataPref, ...) (ConvBwdDataAlgo, error)
- func (c *ConvolutionD) GetBackwardDataAlgorithmV7(handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD) ([]ConvBwdDataAlgoPerformance, error)
- func (c *ConvolutionD) GetBackwardDataWorkspaceSize(handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, algo ConvBwdDataAlgo) (uint, error)
- func (c *ConvolutionD) GetBackwardFilterAlgorithm(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ...) (ConvBwdFiltAlgo, error)
- func (c *ConvolutionD) GetBackwardFilterAlgorithmV7(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD) ([]ConvBwdFiltAlgoPerformance, error)
- func (c *ConvolutionD) GetBackwardFilterWorkspaceSize(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, algo ConvBwdFiltAlgo) (uint, error)
- func (c *ConvolutionD) GetForwardAlgorithm(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, ...) (ConvFwdAlgo, error)
- func (c *ConvolutionD) GetForwardAlgorithmV7(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD) ([]ConvFwdAlgoPerformance, error)
- func (c *ConvolutionD) GetForwardWorkspaceSize(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, algo ConvFwdAlgo) (uint, error)
- func (c *ConvolutionD) GetOutputDims(input *TensorD, filter *FilterD) ([]int32, error)
- func (c *ConvolutionD) GetReorderType() (r Reorder, err error)
- func (c *ConvolutionD) Im2Col(handle *Handle, xD *TensorD, x cutil.Mem, wD *FilterD, buffer cutil.Mem) error
- func (c *ConvolutionD) Im2ColUS(handle *Handle, xD *TensorD, x unsafe.Pointer, wD *FilterD, ...) error
- func (c *ConvolutionD) Set(mode ConvolutionMode, data DataType, pad, stride, dilation []int32) error
- func (c *ConvolutionD) SetGroupCount(groupCount int32) error
- func (c *ConvolutionD) SetMathType(mathtype MathType) error
- func (c *ConvolutionD) SetReorderType(r Reorder) error
- func (c *ConvolutionD) String() string
- type ConvolutionForwardPref
- type ConvolutionMode
- type DataType
- func (d *DataType) Double() DataType
- func (d *DataType) Float() DataType
- func (d *DataType) Half() DataType
- func (d *DataType) Int32() DataType
- func (d *DataType) Int8() DataType
- func (d *DataType) Int8x32() DataType
- func (d *DataType) Int8x4() DataType
- func (d DataType) String() string
- func (d *DataType) UInt8() DataType
- func (d *DataType) UInt8x4() DataType
- type DeConvBwdDataAlgo
- func (c DeConvBwdDataAlgo) Algo() Algorithm
- func (c *DeConvBwdDataAlgo) Count() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) Direct() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) FFT() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) FFTTiling() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) Gemm() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) ImplicitGemm() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) ImplicitPrecompGemm() DeConvBwdDataAlgo
- func (c DeConvBwdDataAlgo) String() string
- func (c *DeConvBwdDataAlgo) WinoGrad() DeConvBwdDataAlgo
- func (c *DeConvBwdDataAlgo) WinoGradNonFused() DeConvBwdDataAlgo
- type DeConvBwdDataAlgoPerformance
- type DeConvBwdDataPref
- type DeConvBwdFiltAlgo
- func (c DeConvBwdFiltAlgo) Algo() Algorithm
- func (c *DeConvBwdFiltAlgo) Algo0() DeConvBwdFiltAlgo
- func (c *DeConvBwdFiltAlgo) Algo1() DeConvBwdFiltAlgo
- func (c *DeConvBwdFiltAlgo) Algo3() DeConvBwdFiltAlgo
- func (c *DeConvBwdFiltAlgo) Count() DeConvBwdFiltAlgo
- func (c *DeConvBwdFiltAlgo) FFT() DeConvBwdFiltAlgo
- func (c *DeConvBwdFiltAlgo) FFTTiling() DeConvBwdFiltAlgo
- func (c DeConvBwdFiltAlgo) String() string
- func (c *DeConvBwdFiltAlgo) Winograd() DeConvBwdFiltAlgo
- func (c *DeConvBwdFiltAlgo) WinogradNonFused() DeConvBwdFiltAlgo
- type DeConvBwdFiltAlgoPerformance
- type DeConvBwdFilterPref
- type DeConvFwdAlgo
- func (c DeConvFwdAlgo) Algo() Algorithm
- func (c *DeConvFwdAlgo) Algo0() DeConvFwdAlgo
- func (c *DeConvFwdAlgo) Algo1() DeConvFwdAlgo
- func (c *DeConvFwdAlgo) Count() DeConvFwdAlgo
- func (c *DeConvFwdAlgo) FFT() DeConvFwdAlgo
- func (c *DeConvFwdAlgo) FFTTiling() DeConvFwdAlgo
- func (c DeConvFwdAlgo) String() string
- func (c *DeConvFwdAlgo) Winograd() DeConvFwdAlgo
- func (c *DeConvFwdAlgo) WinogradNonFused() DeConvFwdAlgo
- type DeConvFwdAlgoPerformance
- type DeConvolutionD
- func (c *DeConvolutionD) BackwardBias(handle *Handle, alpha float64, dyD *TensorD, dy cutil.Mem, beta float64, ...) error
- func (c *DeConvolutionD) BackwardBiasUS(handle *Handle, alpha float64, dyD *TensorD, dy unsafe.Pointer, beta float64, ...) error
- func (c *DeConvolutionD) BackwardData(handle *Handle, alpha float64, wD *FilterD, w cutil.Mem, dyD *TensorD, ...) error
- func (c *DeConvolutionD) BackwardDataUS(handle *Handle, alpha float64, wD *FilterD, w unsafe.Pointer, dyD *TensorD, ...) error
- func (c *DeConvolutionD) BackwardFilter(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, dyD *TensorD, ...) error
- func (c *DeConvolutionD) BackwardFilterUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, dyD *TensorD, ...) error
- func (c *DeConvolutionD) Destroy() error
- func (c *DeConvolutionD) FindBackwardDataAlgorithm(handle *Handle, w *FilterD, dy *TensorD, dx *TensorD) ([]DeConvBwdDataAlgoPerformance, error)
- func (c *DeConvolutionD) FindBackwardDataAlgorithmEx(handle *Handle, wD *FilterD, w cutil.Mem, dyD *TensorD, dy cutil.Mem, ...) ([]DeConvBwdDataAlgoPerformance, error)
- func (c *DeConvolutionD) FindBackwardDataAlgorithmExUS(handle *Handle, wD *FilterD, w unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, ...) ([]DeConvBwdDataAlgoPerformance, error)
- func (c *DeConvolutionD) FindBackwardFilterAlgorithm(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD) ([]DeConvBwdFiltAlgoPerformance, error)
- func (c *DeConvolutionD) FindBackwardFilterAlgorithmEx(handle *Handle, xD *TensorD, x cutil.Mem, dyD *TensorD, dy cutil.Mem, ...) ([]DeConvBwdFiltAlgoPerformance, error)
- func (c *DeConvolutionD) FindBackwardFilterAlgorithmExUS(handle *Handle, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, ...) ([]DeConvBwdFiltAlgoPerformance, error)
- func (c *DeConvolutionD) FindForwardAlgorithm(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD) ([]DeConvFwdAlgoPerformance, error)
- func (c *DeConvolutionD) FindForwardAlgorithmEx(handle *Handle, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, ...) ([]DeConvFwdAlgoPerformance, error)
- func (c *DeConvolutionD) FindForwardAlgorithmExUS(handle *Handle, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, ...) ([]DeConvFwdAlgoPerformance, error)
- func (c *DeConvolutionD) Forward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, wD *FilterD, ...) error
- func (c *DeConvolutionD) ForwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, wD *FilterD, ...) error
- func (c *DeConvolutionD) Get() (mode ConvolutionMode, data DataType, pad []int32, stride []int32, ...)
- func (c *DeConvolutionD) GetBackwardDataAlgorithm(handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, ...) (DeConvBwdDataAlgo, error)
- func (c *DeConvolutionD) GetBackwardDataAlgorithmV7(handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD) ([]DeConvBwdDataAlgoPerformance, error)
- func (c *DeConvolutionD) GetBackwardDataWorkspaceSize(handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, ...) (uint, error)
- func (c *DeConvolutionD) GetBackwardFilterAlgorithm(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ...) (DeConvBwdFiltAlgo, error)
- func (c *DeConvolutionD) GetBackwardFilterAlgorithmV7(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD) ([]DeConvBwdFiltAlgoPerformance, error)
- func (c *DeConvolutionD) GetBackwardFilterWorkspaceSize(handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ...) (uint, error)
- func (c *DeConvolutionD) GetBiasDims(w *FilterD) ([]int32, error)
- func (c *DeConvolutionD) GetForwardAlgorithm(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, ...) (DeConvFwdAlgo, error)
- func (c *DeConvolutionD) GetForwardAlgorithmV7(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD) ([]DeConvFwdAlgoPerformance, error)
- func (c *DeConvolutionD) GetForwardWorkspaceSize(handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, algo DeConvFwdAlgo) (uint, error)
- func (c *DeConvolutionD) GetOutputDims(input *TensorD, filter *FilterD) ([]int32, error)
- func (c *DeConvolutionD) GetReorderType() (r Reorder, err error)
- func (c *DeConvolutionD) Set(mode ConvolutionMode, data DataType, pad, stride, dilation []int32) error
- func (c *DeConvolutionD) SetGroupCount(groupCount int32) error
- func (c *DeConvolutionD) SetMathType(mathtype MathType) error
- func (c *DeConvolutionD) SetReorderType(r Reorder) error
- func (c *DeConvolutionD) String() string
- type DeConvolutionForwardPref
- type Debug
- type Determinism
- type DirectionMode
- type DivNormMode
- type DropOutD
- func (d *DropOutD) Backward(handle *Handle, dyD *TensorD, dy cutil.Mem, dxD *TensorD, dx cutil.Mem, ...) error
- func (d *DropOutD) BackwardUS(handle *Handle, dyD *TensorD, dy unsafe.Pointer, dxD *TensorD, ...) error
- func (d *DropOutD) Destroy() error
- func (d *DropOutD) Forward(handle *Handle, xD *TensorD, x cutil.Mem, yD *TensorD, y cutil.Mem, ...) error
- func (d *DropOutD) ForwardUS(handle *Handle, xD *TensorD, x unsafe.Pointer, yD *TensorD, y unsafe.Pointer, ...) error
- func (d *DropOutD) Get(handle *Handle) (float32, cutil.Mem, uint64, error)
- func (d *DropOutD) GetReserveSpaceSize(t *TensorD) (uint, error)
- func (d *DropOutD) GetStateSize(handle *Handle) (uint, error)
- func (d *DropOutD) GetUS(handle *Handle) (float32, unsafe.Pointer, uint64, error)
- func (d *DropOutD) Restore(handle *Handle, dropout float32, states cutil.Mem, bytes uint, seed uint64) error
- func (d *DropOutD) RestoreUS(handle *Handle, dropout float32, states unsafe.Pointer, bytes uint, ...) error
- func (d *DropOutD) Set(handle *Handle, dropout float32, states cutil.Mem, bytes uint, seed uint64) error
- func (d *DropOutD) SetUS(handle *Handle, dropout float32, states unsafe.Pointer, bytes uint, ...) error
- type ErrQueryMode
- type FilterD
- func (f *FilterD) Destroy() error
- func (f *FilterD) Get() (dtype DataType, frmt TensorFormat, shape []int32, err error)
- func (f *FilterD) GetSizeInBytes() (uint, error)
- func (f *FilterD) ReorderFilterBias(h *Handle, r Reorder, filtersrc, reorderfilterdest cutil.Mem, reorderbias bool, ...) error
- func (f *FilterD) Set(dtype DataType, format TensorFormat, shape []int32) error
- func (f *FilterD) String() string
- type FoldingDirection
- type Handle
- type IndiciesType
- type LRND
- func (l *LRND) Destroy() error
- func (l *LRND) DivisiveNormalizationBackward(handle *Handle, mode DivNormMode, alpha float64, xD *TensorD, ...) error
- func (l *LRND) DivisiveNormalizationBackwardUS(handle *Handle, mode DivNormMode, alpha float64, xD *TensorD, ...) error
- func (l *LRND) DivisiveNormalizationForward(handle *Handle, mode DivNormMode, alpha float64, xD TensorD, ...) error
- func (l *LRND) DivisiveNormalizationForwardUS(handle *Handle, mode DivNormMode, alpha float64, xD TensorD, ...) error
- func (l *LRND) Get() (lrnN uint32, lrnAlpha float64, lrnBeta float64, lrnK float64, err error)
- func (l *LRND) LRNCrossChannelBackward(handle *Handle, mode LRNmode, alpha float64, yD *TensorD, y cutil.Mem, ...) error
- func (l *LRND) LRNCrossChannelBackwardUS(handle *Handle, mode LRNmode, alpha float64, yD *TensorD, y unsafe.Pointer, ...) error
- func (l *LRND) LRNCrossChannelForward(handle *Handle, mode LRNmode, alpha float64, xD *TensorD, x cutil.Mem, ...) error
- func (l *LRND) LRNCrossChannelForwardUS(handle *Handle, mode LRNmode, alpha float64, xD *TensorD, x unsafe.Pointer, ...) error
- func (l LRND) MaxN() uint32
- func (l LRND) MinBeta() float64
- func (l LRND) MinK() float64
- func (l LRND) MinN() uint32
- func (l *LRND) Set(lrnN uint32, lrnAlpha, lrnBeta, lrnK float64) error
- func (l *LRND) String() string
- type LRNmode
- type MathType
- type MultiHeadAttnWeightKind
- func (m *MultiHeadAttnWeightKind) Keys() MultiHeadAttnWeightKind
- func (m *MultiHeadAttnWeightKind) Output() MultiHeadAttnWeightKind
- func (m *MultiHeadAttnWeightKind) Queries() MultiHeadAttnWeightKind
- func (m MultiHeadAttnWeightKind) String() string
- func (m *MultiHeadAttnWeightKind) Values() MultiHeadAttnWeightKind
- type NANProp
- type OPTensorD
- func (t *OPTensorD) Destroy() error
- func (t *OPTensorD) Get() (op OpTensorOp, dtype DataType, nan NANProp, err error)
- func (t *OPTensorD) OpTensor(handle *Handle, alpha1 float64, aD *TensorD, A cutil.Mem, alpha2 float64, ...) error
- func (t *OPTensorD) OpTensorUS(handle *Handle, alpha1 float64, aD *TensorD, A unsafe.Pointer, alpha2 float64, ...) error
- func (t *OPTensorD) Set(op OpTensorOp, dtype DataType, nan NANProp) error
- func (t *OPTensorD) String() string
- type OpTensorOp
- type PersistentRNNPlan
- type PoolingD
- func (p *PoolingD) Backward(handle *Handle, alpha float64, yD *TensorD, y cutil.Mem, dyD *TensorD, ...) error
- func (p *PoolingD) BackwardUS(handle *Handle, alpha float64, yD *TensorD, y unsafe.Pointer, dyD *TensorD, ...) error
- func (p *PoolingD) Destroy() error
- func (p *PoolingD) Forward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, ...) error
- func (p *PoolingD) ForwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, ...) error
- func (p *PoolingD) Get() (mode PoolingMode, nan NANProp, window, padding, stride []int32, err error)
- func (p *PoolingD) GetOutputDims(input *TensorD) ([]int32, error)
- func (p *PoolingD) Set(mode PoolingMode, nan NANProp, window, padding, stride []int32) error
- func (p *PoolingD) String() string
- type PoolingMode
- type RNNAlgo
- type RNNBiasMode
- type RNNClipMode
- type RNND
- func (r *RNND) BackwardDataEx(h *Handle, yD *RNNDataD, y cutil.Mem, dyD *RNNDataD, dy cutil.Mem, ...) error
- func (r *RNND) BackwardDataExUS(h *Handle, yD *RNNDataD, y unsafe.Pointer, dyD *RNNDataD, dy unsafe.Pointer, ...) error
- func (r *RNND) BackwardWeights(handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, ...) error
- func (r *RNND) BackwardWeightsEx(h *Handle, xD *RNNDataD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, yD *RNNDataD, ...) error
- func (r *RNND) BackwardWeightsExUS(h *Handle, xD *RNNDataD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, ...) error
- func (r *RNND) BackwardWeightsUS(handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, ...) error
- func (r *RNND) Destroy() error
- func (r *RNND) FindRNNBackwardDataAlgorithmEx(handle *Handle, yD []*TensorD, y cutil.Mem, dyD []*TensorD, dy cutil.Mem, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNBackwardDataAlgorithmExUS(handle *Handle, yD []*TensorD, y unsafe.Pointer, dyD []*TensorD, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNBackwardWeightsAlgorithmEx(handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNBackwardWeightsAlgorithmExUS(handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNForwardInferenceAlgorithmEx(handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNForwardInferenceAlgorithmExUS(handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNForwardTrainingAlgorithmEx(handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) FindRNNForwardTrainingAlgorithmExUS(handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, ...) ([]AlgorithmPerformance, error)
- func (r *RNND) ForwardInferenceEx(h *Handle, xD *RNNDataD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, ...) error
- func (r *RNND) ForwardInferenceExUS(h *Handle, xD *RNNDataD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, ...) error
- func (r *RNND) ForwardTrainingEx(h *Handle, xD *RNNDataD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, ...) error
- func (r *RNND) ForwardTrainingExUS(h *Handle, xD *RNNDataD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, ...) error
- func (r *RNND) Get(handle *Handle) (int32, int32, *DropOutD, RNNInputMode, DirectionMode, RNNmode, RNNAlgo, ...)
- func (r *RNND) GetBiasMode() (bmode RNNBiasMode, err error)
- func (r *RNND) GetClip(h *Handle) (mode RNNClipMode, nanprop NANProp, lclip, rclip float64, err error)
- func (r *RNND) GetLinLayerMatrixParams(handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w cutil.Mem, ...) (FilterD, unsafe.Pointer, error)
- func (r *RNND) GetLinLayerMatrixParamsUS(handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w unsafe.Pointer, ...) (FilterD, cutil.Mem, error)
- func (r *RNND) GetPaddingMode() (mode RNNPaddingMode, err error)
- func (r *RNND) GetParamsSIB(handle *Handle, xD *TensorD, data DataType) (uint, error)
- func (r *RNND) GetProjectionLayers(handle *Handle) (int32, int32, error)
- func (r *RNND) GetRNNLinLayerBiasParams(handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w cutil.Mem, ...) (BiasD *FilterD, Bias cutil.Mem, err error)
- func (r *RNND) GetRNNLinLayerBiasParamsUS(handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w unsafe.Pointer, ...) (BiasD *FilterD, Bias unsafe.Pointer, err error)
- func (r *RNND) GetRNNMatrixMathType() (MathType, error)
- func (r *RNND) GetReserveSIB(handle *Handle, seqLength int32, xD []*TensorD) (uint, error)
- func (r *RNND) GetWorkspaceSIB(handle *Handle, seqLength int32, xD []*TensorD) (uint, error)
- func (r *RNND) NewPersistentRNNPlan(minibatch int32, data DataType) (plan *PersistentRNNPlan, err error)
- func (r *RNND) RNNBackwardData(handle *Handle, yD []*TensorD, y cutil.Mem, dyD []*TensorD, dy cutil.Mem, ...) error
- func (r *RNND) RNNBackwardDataUS(handle *Handle, yD []*TensorD, y unsafe.Pointer, dyD []*TensorD, ...) error
- func (r *RNND) RNNForwardInference(handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, ...) error
- func (r *RNND) RNNForwardInferenceUS(handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, ...) error
- func (r *RNND) RNNForwardTraining(handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, ...) error
- func (r *RNND) RNNForwardTrainingUS(handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, ...) error
- func (r *RNND) Set(handle *Handle, hiddenSize int32, numLayers int32, doD *DropOutD, ...) error
- func (r *RNND) SetAlgorithmDescriptor(handle *Handle, algo *AlgorithmD) error
- func (r *RNND) SetBiasMode(bmode RNNBiasMode) error
- func (r *RNND) SetClip(h *Handle, mode RNNClipMode, nanprop NANProp, lclip, rclip float64) error
- func (r *RNND) SetPaddingMode(mode RNNPaddingMode) error
- func (r *RNND) SetProjectionLayers(handle *Handle, recProjsize int32, outProjSize int32) error
- func (r *RNND) SetRNNMatrixMathType(math MathType) error
- type RNNDataD
- type RNNDataLayout
- type RNNFlags
- type RNNInputMode
- type RNNPaddingMode
- type RNNmode
- type ReduceTensorD
- func (r *ReduceTensorD) Destroy() error
- func (r *ReduceTensorD) Get() (reduceop ReduceTensorOp, datatype DataType, nanprop NANProp, ...)
- func (r *ReduceTensorD) GetIndiciesSize(handle *Handle, aDesc, cDesc *TensorD) (uint, error)
- func (r *ReduceTensorD) GetWorkSpaceSize(handle *Handle, aDesc, cDesc *TensorD) (uint, error)
- func (r *ReduceTensorD) ReduceTensorOp(handle *Handle, indices cutil.Mem, indiciessize uint, wspace cutil.Mem, ...) error
- func (r *ReduceTensorD) ReduceTensorOpUS(handle *Handle, indices unsafe.Pointer, indiciessize uint, ...) error
- func (r *ReduceTensorD) Set(reduceop ReduceTensorOp, datatype DataType, nanprop NANProp, ...) error
- func (r *ReduceTensorD) String() string
- type ReduceTensorIndices
- type ReduceTensorOp
- func (r *ReduceTensorOp) Add() ReduceTensorOp
- func (r *ReduceTensorOp) Amax() ReduceTensorOp
- func (r *ReduceTensorOp) Avg() ReduceTensorOp
- func (r *ReduceTensorOp) Max() ReduceTensorOp
- func (r *ReduceTensorOp) Min() ReduceTensorOp
- func (r *ReduceTensorOp) Mul() ReduceTensorOp
- func (r *ReduceTensorOp) MulNoZeros() ReduceTensorOp
- func (r *ReduceTensorOp) Norm1() ReduceTensorOp
- func (r *ReduceTensorOp) Norm2() ReduceTensorOp
- func (r ReduceTensorOp) String() string
- type Reorder
- type RuntimeTag
- type SamplerType
- type SeqDataAxis
- type SeqDataD
- type SoftMaxAlgorithm
- type SoftMaxD
- func (s *SoftMaxD) Backward(handle *Handle, alpha float64, yD *TensorD, y cutil.Mem, dyD *TensorD, ...) error
- func (s *SoftMaxD) BackwardUS(handle *Handle, alpha float64, yD *TensorD, y unsafe.Pointer, dyD *TensorD, ...) error
- func (s *SoftMaxD) Forward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, ...) error
- func (s *SoftMaxD) ForwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, ...) error
- func (s *SoftMaxD) Get() (algo SoftMaxAlgorithm, mode SoftMaxMode, err error)
- func (s *SoftMaxD) Set(algo SoftMaxAlgorithm, mode SoftMaxMode) error
- func (s *SoftMaxD) String() string
- type SoftMaxMode
- type SpatialTransformerD
- func (s *SpatialTransformerD) Destroy() error
- func (s *SpatialTransformerD) GridGeneratorBackward(handle *Handle, grid cutil.Mem, theta cutil.Mem) error
- func (s *SpatialTransformerD) GridGeneratorBackwardUS(handle *Handle, grid unsafe.Pointer, theta unsafe.Pointer) error
- func (s *SpatialTransformerD) GridGeneratorForward(handle *Handle, theta cutil.Mem, grid cutil.Mem) error
- func (s *SpatialTransformerD) GridGeneratorForwardUS(handle *Handle, theta unsafe.Pointer, grid unsafe.Pointer) error
- func (s *SpatialTransformerD) SamplerBackward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, ...) error
- func (s *SpatialTransformerD) SamplerBackwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, ...) error
- func (s *SpatialTransformerD) SamplerForward(handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, grid cutil.Mem, ...) error
- func (s *SpatialTransformerD) SamplerForwardUS(handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, ...) error
- func (s *SpatialTransformerD) Set(sampler SamplerType, data DataType, dimA []int32) error
- type Status
- type TensorD
- func (t *TensorD) DataType() DataType
- func (t *TensorD) Destroy() error
- func (t *TensorD) Dims() []int32
- func (t *TensorD) Format() TensorFormat
- func (t *TensorD) Get() (frmt TensorFormat, dtype DataType, shape []int32, stride []int32, err error)
- func (t *TensorD) GetSizeInBytes() (uint, error)
- func (t *TensorD) Set(frmt TensorFormat, data DataType, shape, stride []int32) error
- func (t *TensorD) String() string
- type TensorFormat
- type TransformD
- func (t *TransformD) Destroy() error
- func (t *TransformD) Get() (destFormat TensorFormat, padBefore, padAfter []int32, foldA []uint32, ...)
- func (t *TransformD) InitDest(src *TensorD) (dest *TensorD, destsib uint, err error)
- func (t *TransformD) Set(nbDims uint32, destFormat TensorFormat, padBefore, padAfter []int32, ...) error
- func (t *TransformD) String() string
- func (t *TransformD) TransformFilter(h *Handle, alpha float64, srcD *FilterD, src cutil.Mem, beta float64, ...) error
- func (t *TransformD) TransformTensor(h *Handle, alpha float64, srcD *TensorD, src cutil.Mem, beta float64, ...) error
- func (t *TransformD) TransformTensorUS(h *Handle, alpha float64, srcD *TensorD, src unsafe.Pointer, beta float64, ...) error
- type WgradMode
Examples ¶
Constants ¶
const BnMinEpsilon = (float64)(C.CUDNN_BN_MIN_EPSILON)
BnMinEpsilon is the minimum epsilon allowed for batch normalization. It used to be 1e-5, but it is now 0.
const CudnnSeqDataDimCount = C.CUDNN_SEQDATA_DIM_COUNT
CudnnSeqDataDimCount is a flag for the number of dims.
const DimMax = int32(C.CUDNN_DIM_MAX)
DimMax is the max dims for tensors
Variables ¶
This section is empty.
Functions ¶
func AddTensor ¶
func AddTensor(h *Handle, alpha float64, aD *TensorD, A cutil.Mem, beta float64, cD *TensorD, c cutil.Mem) error
AddTensor performs tensor bias addition: C = alpha*A + beta*C (c is both an input and the output).

From the cuDNN documentation: This function adds the scaled values of a bias tensor to another tensor. Each dimension of the bias tensor A must match the corresponding dimension of the destination tensor C or must be equal to 1. In the latter case, the same value from the bias tensor for those dimensions will be used to blend into the C tensor.
Note: Up to dimension 5, all tensor formats are supported. Beyond those dimensions, this routine is not supported.
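For concreteness, a minimal sketch of a bias addition, assuming h, biasD (a 1xCx1x1 descriptor), bias, outD (an NxCxHxW descriptor), and out were created and allocated elsewhere (all of those names are hypothetical):

	// out = 1*bias + 1*out; the size-1 dims of bias broadcast over N, H, and W.
	if err := gocudnn.AddTensor(h, 1.0, biasD, bias, 1.0, outD, out); err != nil {
		panic(err)
	}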
func AddTensorUS ¶
func AddTensorUS(h *Handle, alpha float64, aD *TensorD, A unsafe.Pointer, beta float64, cD *TensorD, c unsafe.Pointer) error
AddTensorUS is like AddTensor but uses unsafe.Pointer instead of cutil.Mem
func FindLength ¶
func FindLength(s uint, dtype DataType) uint32
FindLength returns the number of elements in an array, given its size in bytes and its DataType.
func FindSizeTfromVol ¶
func FindSizeTfromVol(volume []int32, dtype DataType) uint
FindSizeTfromVol takes a volume of dims and returns the size in bytes needed to hold that volume with the given DataType.
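A small sketch relating the two helpers; the dims here are illustrative:

	var dt gocudnn.DataType
	vol := []int32{32, 3, 28, 28}                    // N, C, H, W
	sib := gocudnn.FindSizeTfromVol(vol, dt.Float()) // bytes needed for 32*3*28*28 float32 values
	n := gocudnn.FindLength(sib, dt.Float())         // element count recovered from the byte size
	_ = n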
func GetBindingVersion ¶
func GetBindingVersion() (major, minor, patch int32)
GetBindingVersion will return the library version this binding was made for.
func GetCudaartVersion ¶
func GetCudaartVersion() uint
GetCudaartVersion returns the CUDA runtime version.
func GetFoldedConvBackwardDataDescriptors ¶
func GetFoldedConvBackwardDataDescriptors(h *Handle, filter *FilterD, diff *TensorD, conv *ConvolutionD, grad *TensorD, transform TensorFormat) (
	foldedfilter *FilterD,
	paddeddiff *TensorD,
	foldedConv *ConvolutionD,
	foldedgrad *TensorD,
	filterfold *TransformD,
	diffpad *TransformD,
	gradfold *TransformD,
	gradunfold *TransformD,
	err error)
GetFoldedConvBackwardDataDescriptors is a helper function that calculates the folding descriptors needed for dgrad (backward data).
func GetLibraryVersion ¶
func GetLibraryVersion() (major, minor, patch int32, err error)
GetLibraryVersion will return the version of the cuDNN library you have installed.
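A hedged sketch of a startup sanity check comparing the binding's target version against the installed library (the fmt logging is illustrative and assumes fmt is imported):

	bmajor, bminor, _ := gocudnn.GetBindingVersion()
	lmajor, lminor, _, err := gocudnn.GetLibraryVersion()
	if err != nil {
		panic(err)
	}
	if bmajor != lmajor || bminor != lminor {
		// A mismatch may still work, but it is worth logging.
		fmt.Printf("binding targets cuDNN %d.%d, installed library is %d.%d\n",
			bmajor, bminor, lmajor, lminor)
	}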
func GetStringer ¶
GetStringer returns a fmt.Stringer that will print CUDA-allocated memory formatted as NHWC or NCHW. It only works for 4D tensors with a float or half datatype, and it only prints the data.
func ScaleTensor ¶
ScaleTensor scales all values of a tensor by a given factor: y[i] = alpha * y[i].
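A one-line sketch, assuming h, yD, and y describe an already allocated device tensor (hypothetical names):

	// y[i] = 0.5 * y[i] for every element of y.
	if err := gocudnn.ScaleTensor(h, yD, y, 0.5); err != nil {
		panic(err)
	}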
func ScaleTensorUS ¶
ScaleTensorUS is like ScaleTensor but it uses unsafe.Pointer instead of cutil.Mem
func SetCallBack ¶
SetCallBack sets the debug callback function. Callback data will be written to the writer. udata is custom user data passed to the callback; udata can be nil. Note: the callback is currently not functional.
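func SetTensor ¶
func SetTensor(h *Handle, yD *TensorD, y cutil.Mem, v float64) error
SetTensor sets all of the values in the tensor y to v: y[i] = v.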
func SetTensorUS ¶
SetTensorUS is like SetTensor but it uses unsafe.Pointer instead of cutil.Mem
func TransformTensor ¶
func TransformTensor(h *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, yD *TensorD, y cutil.Mem) error
TransformTensor copies a tensor's data into a different layout; see below.
From the SDK Documentation: This function copies the scaled data from one tensor to another tensor with a different layout. Those descriptors need to have the same dimensions but not necessarily the same strides. The input and output tensors must not overlap in any way (i.e., tensors cannot be transformed in place). This function can be used to convert a tensor with an unsupported format to a supported one.
cudnnStatus_t cudnnTransformTensor(
	cudnnHandle_t                 handle,
	const void                    *alpha,
	const cudnnTensorDescriptor_t xDesc,
	const void                    *x,
	const void                    *beta,
	const cudnnTensorDescriptor_t yDesc,
	void                          *y)
y = Transform((alpha * x), (beta * y)). This changes the layout of a tensor stride-wise.
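A minimal layout-conversion sketch, assuming xD describes an NCHW tensor, yD an NHWC tensor with the same dimensions, and x and y are separate, non-overlapping device buffers (all names hypothetical):

	// Rewrite x's data into y using y's layout. alpha=1, beta=0 means
	// y is fully overwritten by the transformed x.
	if err := gocudnn.TransformTensor(h, 1.0, xD, x, 0.0, yD, y); err != nil {
		panic(err)
	}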
Types ¶
type ActivationD ¶
type ActivationD struct {
// contains filtered or unexported fields
}
ActivationD is an opaque struct that holds the description of an activation operation.
Example ¶
ExampleActivationD shows how to set up and run the activation function.

package main

import (
	"runtime"

	gocudnn "github.com/dereklstinson/gocudnn"
	"github.com/dereklstinson/gocudnn/gocu"
)

func main() {
	runtime.LockOSThread()
	check := func(e error) {
		if e != nil {
			panic(e)
		}
	}
	h := gocudnn.CreateHandle(true) // Using go garbage collector

	ActOp, err := gocudnn.CreateActivationDescriptor()
	check(err)

	var AMode gocudnn.ActivationMode // Activation Mode Flag
	var NanMode gocudnn.NANProp      // Nan Propagation Flag

	err = ActOp.Set(AMode.Relu(), NanMode.Propigate(), 20)
	check(err)

	am, nm, coef, err := ActOp.Get() // Gets the values that were set
	if am != AMode.Relu() || nm != NanMode.Propigate() || coef != 20 {
		panic("am!=Amode.Relu()||nm !=NanMode.Propigate()||coef!=20")
	}

	// Dummy variables.
	// Check TensorD to find out how to make xD, yD and x and y.
	var x, y *gocu.CudaPtr
	var xD, yD *gocudnn.TensorD

	err = ActOp.Forward(h, 1, xD, x, 0, yD, y)
	check(err)
}
Output:
func CreateActivationDescriptor ¶
func CreateActivationDescriptor() (*ActivationD, error)
CreateActivationDescriptor creates an activation descriptor
func (*ActivationD) Backward ¶
func (a *ActivationD) Backward(
	handle *Handle,
	alpha float64,
	yD *TensorD, y cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	xD *TensorD, x cutil.Mem,
	beta float64,
	dxD *TensorD, dx cutil.Mem) error
Backward performs the backward activation operation.
From deep learning sdk documentation (slightly modified for gocudnn):
This routine computes the gradient of a neuron activation function.
Note: In-place operation is allowed for this routine; i.e., dx and dy cutil.Mem may be equal. However, this requires dxD and dyD descriptors to be identical (particularly, the strides of the input and output must match for in-place operation to be allowed).
Note: All tensor formats are supported for 4 and 5 dimensions, however best performance is obtained when the strides of dxD and dyD are equal and HW-packed. For more than 5 dimensions the tensors must have their spatial dimensions packed.
Parameters:
- handle(input): previously created Handle.
- alpha, beta(input): Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
- xD(input): Handle to the previously initialized input tensor descriptor.
- x(input): Data pointer to GPU memory associated with the tensor descriptor xD.
- dxD(input): Handle to the previously initialized input tensor descriptor.
- dx(output): Data pointer to GPU memory associated with the tensor descriptor dxD.
- yD(input): Handle to the previously initialized output tensor descriptor.
- y(input): Data pointer to GPU memory associated with the output tensor descriptor yD.
- dyD(input): Handle to the previously initialized output tensor descriptor.
- dy(input): Data pointer to GPU memory associated with the output tensor descriptor dyD.
Possible Error Returns
nil: The function launched successfully.
CUDNN_STATUS_NOT_SUPPORTED:
1) The dimensions n,c,h,w of the input tensor and output tensors differ.
2) The datatype of the input tensor and output tensors differs.
3) The strides nStride, cStride, hStride, wStride of the input tensor and the input differential tensor differ.
4) The strides nStride, cStride, hStride, wStride of the output tensor and the output differential tensor differ.
CUDNN_STATUS_BAD_PARAM: The strides nStride, cStride, hStride, wStride of the input differential tensor and output differential tensors differ and in-place operation is used.
CUDNN_STATUS_EXECUTION_FAILED: The function failed to launch on the GPU.
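A hedged sketch of the backward call, reusing h and ActOp from the example above; the forward pass is assumed to have produced y from x, dy holds the incoming gradient, and dxD/dx are hypothetical gradient descriptor and buffer names:

	// dx = 1*dActivation + 0*dx (dx is fully overwritten).
	err := ActOp.Backward(h, 1.0, yD, y, dyD, dy, xD, x, 0.0, dxD, dx)
	if err != nil {
		panic(err)
	}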
func (*ActivationD) BackwardUS ¶
func (a *ActivationD) BackwardUS(
	handle *Handle,
	alpha float64,
	yD *TensorD, y unsafe.Pointer,
	dyD *TensorD, dy unsafe.Pointer,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	dxD *TensorD, dx unsafe.Pointer) error
BackwardUS is just like Backward but it takes unsafe.Pointers instead of cutil.Mem
func (*ActivationD) Destroy ¶
func (a *ActivationD) Destroy() error
Destroy destroys the activation descriptor if GC is not set; if GC is set, the method will only return nil. Currently GC is always set, with no way of turning it off.
func (*ActivationD) Forward ¶
func (a *ActivationD) Forward(
	handle *Handle,
	alpha float64,
	xD *TensorD, x cutil.Mem,
	beta float64,
	yD *TensorD, y cutil.Mem) error
Forward does the forward activation function
From deep learning sdk documentation (slightly modified for gocudnn):
This routine applies a specified neuron activation function element-wise over each input value.
Note: In-place operation is allowed for this routine; i.e., x and y cutil.Mem may be equal. However, this requires xD and yD descriptors to be identical (particularly, the strides of the input and output must match for in-place operation to be allowed).
Note: All tensor formats are supported for 4 and 5 dimensions, however best performance is obtained when the strides of xD and yD are equal and HW-packed. For more than 5 dimensions the tensors must have their spatial dimensions packed.
Parameters:
- handle(input): previously created Handle.
- alpha, beta(input): Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
- xD(input): Handle to the previously initialized input tensor descriptor.
- x(input): Data pointer to GPU memory associated with the tensor descriptor xD.
- yD(input): Handle to the previously initialized output tensor descriptor.
- y(output): Data pointer to GPU memory associated with the output tensor descriptor yDesc.
Possible Error Returns
nil: The function launched successfully.
CUDNN_STATUS_NOT_SUPPORTED: The function does not support the provided configuration.
CUDNN_STATUS_BAD_PARAM: At least one of the following conditions is met:
1) The parameter mode has an invalid enumerant value.
2) The dimensions n,c,h,w of the input tensor and output tensors differ.
3) The datatype of the input tensor and output tensors differs.
4) The strides nStride, cStride, hStride, wStride of the input tensor and output tensors differ and in-place operation is used (i.e., x and y pointers are equal).
CUDNN_STATUS_EXECUTION_FAILED: The function failed to launch on the GPU.
func (*ActivationD) ForwardUS ¶
func (a *ActivationD) ForwardUS(
	handle *Handle,
	alpha float64,
	xD *TensorD, x unsafe.Pointer,
	beta float64,
	yD *TensorD, y unsafe.Pointer) error
ForwardUS is just like Forward but it takes unsafe.Pointers instead of cutil.Mem
func (*ActivationD) Get ¶
func (a *ActivationD) Get() (mode ActivationMode, nan NANProp, coef float64, err error)
Get gets the descriptor's values.
func (*ActivationD) Set ¶
func (a *ActivationD) Set(mode ActivationMode, nan NANProp, coef float64) error
Set sets the activation operation according to the settings passed
func (*ActivationD) String ¶
func (a *ActivationD) String() string
type ActivationMode ¶
type ActivationMode C.cudnnActivationMode_t
ActivationMode is used for activation descriptor flags. Flags are obtained through the type's methods.
func (*ActivationMode) ClippedRelu ¶
func (a *ActivationMode) ClippedRelu() ActivationMode
ClippedRelu sets a to ActivationMode(C.CUDNN_ACTIVATION_CLIPPED_RELU) and returns that value.
Selects the clipped rectified linear function.
func (*ActivationMode) Elu ¶
func (a *ActivationMode) Elu() ActivationMode
Elu sets a to ActivationMode(C.CUDNN_ACTIVATION_ELU) and returns that value.
Selects the exponential linear function.
func (*ActivationMode) Identity ¶
func (a *ActivationMode) Identity() ActivationMode
Identity returns ActivationMode(C.CUDNN_ACTIVATION_IDENTITY) (new for 7.1)
Selects the identity function, intended for bypassing the activation step in (*Convolution)BiasActivationForward(). (The Identity flag must use CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM, and only for (*Convolution)BiasActivationForward()) Does not work with cudnnActivationForward() or cudnnActivationBackward().
func (*ActivationMode) Relu ¶
func (a *ActivationMode) Relu() ActivationMode
Relu sets a to ActivationMode(C.CUDNN_ACTIVATION_RELU) and returns that value.
Selects the rectified linear function.
func (*ActivationMode) Sigmoid ¶
func (a *ActivationMode) Sigmoid() ActivationMode
Sigmoid sets a to ActivationMode(C.CUDNN_ACTIVATION_SIGMOID) and returns that value.
Selects the sigmoid function.
func (ActivationMode) String ¶
func (a ActivationMode) String() string
func (*ActivationMode) Tanh ¶
func (a *ActivationMode) Tanh() ActivationMode
Tanh sets a to ActivationMode(C.CUDNN_ACTIVATION_TANH) and returns that value.
Selects the hyperbolic tangent function.
type AlgorithmD ¶
type AlgorithmD struct {
// contains filtered or unexported fields
}
AlgorithmD holds the C.cudnnAlgorithmDescriptor_t
func CreateAlgorithmDescriptor ¶
func CreateAlgorithmDescriptor() (*AlgorithmD, error)
CreateAlgorithmDescriptor creates an AlgorithmD that needs to be set
func (*AlgorithmD) Copy ¶
func (a *AlgorithmD) Copy() (*AlgorithmD, error)
Copy returns a copy of AlgorithmD
func (*AlgorithmD) Destroy ¶
func (a *AlgorithmD) Destroy() error
Destroy destroys the descriptor. Right now, since gocudnn relies on Go's garbage collector, this does nothing.
func (*AlgorithmD) Get ¶
func (a *AlgorithmD) Get() (Algorithm, error)
Get returns the AlgorithmD's value as an Algorithm.
func (*AlgorithmD) GetAlgorithmSpaceSize ¶
func (a *AlgorithmD) GetAlgorithmSpaceSize(handle *Handle) (uint, error)
GetAlgorithmSpaceSize gets the size in bytes of the algorithm
func (*AlgorithmD) RestoreAlgorithm ¶
RestoreAlgorithm restores the algorithm from host memory.
func (*AlgorithmD) SaveAlgorithm ¶
SaveAlgorithm saves the algorithm to host memory.
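A sketch of the save/restore round trip; algoSpace is assumed to be a cutil.Mem of at least the reported size, allocated by whatever allocator your program uses (aD and handle are hypothetical names):

	sib, err := aD.GetAlgorithmSpaceSize(handle)
	if err != nil {
		panic(err)
	}
	// algoSpace: host-accessible cutil.Mem of at least sib bytes.
	if err = aD.SaveAlgorithm(handle, algoSpace, sib); err != nil {
		panic(err)
	}
	// ...later, possibly after recreating the descriptor:
	if err = aD.RestoreAlgorithm(handle, algoSpace, sib); err != nil {
		panic(err)
	}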
func (*AlgorithmD) Set ¶
func (a *AlgorithmD) Set(algo Algorithm) error
Set sets the Algorithm in the AlgorithmD.
type AlgorithmPerformance ¶
type AlgorithmPerformance struct {
// contains filtered or unexported fields
}
AlgorithmPerformance go typed C.cudnnAlgorithmPerformance_t
func CreateAlgorithmPerformance ¶
func CreateAlgorithmPerformance(numberToCreate int32) ([]AlgorithmPerformance, error)
CreateAlgorithmPerformance creates and returns a slice of AlgorithmPerformance values.
Returns:
nil: Success.
CUDNN_STATUS_ALLOC_FAILED: The resources could not be allocated.
func (*AlgorithmPerformance) Destroy ¶
func (a *AlgorithmPerformance) Destroy() error
Destroy destroys the performance.
func (*AlgorithmPerformance) Get ¶
func (a *AlgorithmPerformance) Get() (AlgorithmD, Status, float32, uint, error)
Get gets the algorithm performance. It returns AlgorithmD, Status, float32 (time), and uint (memory size in bytes).
func (*AlgorithmPerformance) Set ¶
func (a *AlgorithmPerformance) Set(aD *AlgorithmD, s Status, time float32, memory uint) error
Set sets the algo performance
type AttentionD ¶
type AttentionD struct {
// contains filtered or unexported fields
}
AttentionD holds opaque values used for attention operations
func CreateAttnDescriptor ¶
func CreateAttnDescriptor() (*AttentionD, error)
CreateAttnDescriptor creates an Attention Descriptor
func (*AttentionD) BackwardData ¶
func (a *AttentionD) BackwardData(
	h *Handle,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayDQDO []int32,
	seqLengthArrayDKDV []int32,
	doDesc *SeqDataD, dout cutil.Mem,
	dqDesc *SeqDataD, dqueries, queries cutil.Mem,
	dkDesc *SeqDataD, dkeys, keys cutil.Mem,
	dvDesc *SeqDataD, dvalues, values cutil.Mem,
	wbuffSIB uint, wbuff cutil.Mem,
	wspaceSIB uint, wspace cutil.Mem,
	rspaceSIB uint, rspace cutil.Mem) error
BackwardData does the backward propagation for data.
func (*AttentionD) BackwardDataUS ¶
func (a *AttentionD) BackwardDataUS(
	h *Handle,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayDQDO []int32,
	seqLengthArrayDKDV []int32,
	doDesc *SeqDataD, dout unsafe.Pointer,
	dqDesc *SeqDataD, dqueries, queries unsafe.Pointer,
	dkDesc *SeqDataD, dkeys, keys unsafe.Pointer,
	dvDesc *SeqDataD, dvalues, values unsafe.Pointer,
	wbuffSIB uint, wbuff unsafe.Pointer,
	wspaceSIB uint, wspace unsafe.Pointer,
	rspaceSIB uint, rspace unsafe.Pointer) error
BackwardDataUS is like BackwardData but uses unsafe.Pointer instead of cutil.Mem
func (*AttentionD) BackwardWeights ¶
func (a *AttentionD) BackwardWeights(
	h *Handle,
	wgmode WgradMode,
	qDesc *SeqDataD, queries cutil.Mem,
	keyDesc *SeqDataD, keys cutil.Mem,
	vDesc *SeqDataD, values cutil.Mem,
	doDesc *SeqDataD, dout cutil.Mem,
	wbuffSIB uint, wbuff, dwbuff cutil.Mem,
	wspaceSIB uint, wspace cutil.Mem,
	rspaceSIB uint, rspace cutil.Mem) error
BackwardWeights does the backward propagation for weights.
func (*AttentionD) BackwardWeightsUS ¶
func (a *AttentionD) BackwardWeightsUS(
	h *Handle,
	wgmode WgradMode,
	qDesc *SeqDataD, queries unsafe.Pointer,
	keyDesc *SeqDataD, keys unsafe.Pointer,
	vDesc *SeqDataD, values unsafe.Pointer,
	doDesc *SeqDataD, dout unsafe.Pointer,
	wbuffSIB uint, wbuff, dwbuff unsafe.Pointer,
	wspaceSIB uint, wspace unsafe.Pointer,
	rspaceSIB uint, rspace unsafe.Pointer) error
BackwardWeightsUS is like BackwardWeights but uses unsafe.Pointer instead of cutil.Mem.
func (*AttentionD) Destroy ¶
func (a *AttentionD) Destroy() error
Destroy will destroy the descriptor if GC is not in use; if GC is in use, it will do nothing but return nil. Currently, gocudnn always uses Go's GC.
func (*AttentionD) Forward ¶
func (a *AttentionD) Forward(
	h *Handle,
	currIdx int32,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayQRO []int32,
	seqLengthArrayKV []int32,
	qrDesc *SeqDataD, queries, residuals cutil.Mem,
	keyDesc *SeqDataD, keys cutil.Mem,
	vDesc *SeqDataD, values cutil.Mem,
	oDesc *SeqDataD, out cutil.Mem,
	wbuffSIB uint, wbuff cutil.Mem,
	wspaceSIB uint, wspace cutil.Mem,
	rspaceSIB uint, rspace cutil.Mem) error
Forward performs the attention forward operation; see the cuDNN documentation for details (it is more involved than usual). If currIdx < 0 the call runs in training mode; if currIdx >= 0 it runs in inference mode.
func (*AttentionD) ForwardUS ¶
func (a *AttentionD) ForwardUS(
	h *Handle,
	currIdx int32,
	loWinIdx []int32,
	hiWinIdx []int32,
	seqLengthArrayQRO []int32,
	seqLengthArrayKV []int32,
	qrDesc *SeqDataD, queries, residuals unsafe.Pointer,
	keyDesc *SeqDataD, keys unsafe.Pointer,
	vDesc *SeqDataD, values unsafe.Pointer,
	oDesc *SeqDataD, out unsafe.Pointer,
	wbuffSIB uint, wbuff unsafe.Pointer,
	wspaceSIB uint, wspace unsafe.Pointer,
	rspaceSIB uint, rspace unsafe.Pointer) error
ForwardUS is like Forward but takes unsafe.Pointer's instead of cutil.Mem
func (*AttentionD) Get ¶
func (a *AttentionD) Get() (
	qMap AttnQueryMap,
	nHead int32,
	smScaler float64,
	dtype DataType,
	computePrecision DataType,
	mtype MathType,
	attn *DropOutD,
	post *DropOutD,
	qSize, keySize, vSize int32,
	qProjSize, keyProjSize, vProjSize, oProjSize int32,
	qoMaxSeqLen, kvMaxSeqLen int32,
	maxBatchSize, maxBeamSize int32,
	err error)
Get gets all the values for the AttentionD. There are a lot of them.
func (*AttentionD) GetMultiHeadAttnWeights ¶
func (a *AttentionD) GetMultiHeadAttnWeights(h *Handle, wkind MultiHeadAttnWeightKind, wbuffSIB uint, wbuff cutil.Mem) (wD *TensorD, w cutil.Mem, err error)
GetMultiHeadAttnWeights returns a descriptor for w and its cutil.Mem.
func (*AttentionD) GetMultiHeadBuffers ¶
func (a *AttentionD) GetMultiHeadBuffers(h *Handle) (weightbuffSIB, wspaceSIB, rspaceSIB uint, err error)
GetMultiHeadBuffers returns the Size In Bytes (SIB) needed for allocation for operation.
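A sketch of the usual allocation flow before calling Forward; alloc is a hypothetical helper returning a cutil.Mem of the requested size, and a/h are assumed to exist:

	weightbuffSIB, wspaceSIB, rspaceSIB, err := a.GetMultiHeadBuffers(h)
	if err != nil {
		panic(err)
	}
	wbuff := alloc(weightbuffSIB) // weight buffer
	wspace := alloc(wspaceSIB)    // workspace scratch
	rspace := alloc(rspaceSIB)    // reserve space (needed for training)
	_, _, _ = wbuff, wspace, rspace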
func (*AttentionD) Set ¶
func (a *AttentionD) Set(
	qMap AttnQueryMap,
	nHead int32,
	smScaler float64,
	dtype DataType,
	computePrecision DataType,
	mtype MathType,
	attn *DropOutD,
	post *DropOutD,
	qSize, keySize, vSize int32,
	qProjSize, keyProjSize, vProjSize, oProjSize int32,
	qoMaxSeqLen, kvMaxSeqLen int32,
	maxBatchSize, maxBeamSize int32,
) error
Set sets an AttentionD already created with CreateAttnDescriptor.
type AttnQueryMap ¶
type AttnQueryMap C.cudnnAttnQueryMap_t
AttnQueryMap type is a flag for multihead attention. Flags are exposed through type methods.
func (*AttnQueryMap) AllToOne ¶
func (a *AttnQueryMap) AllToOne() AttnQueryMap
AllToOne: multiple Q-s when beam width > 1 map to a single (K,V) set. The method sets the receiver to AllToOne and returns that value.
func (*AttnQueryMap) OneToOne ¶
func (a *AttnQueryMap) OneToOne() AttnQueryMap
OneToOne: multiple Q-s when beam width > 1 map to corresponding (K,V) sets. The method sets the receiver to OneToOne and returns that value.
func (AttnQueryMap) String ¶
func (a AttnQueryMap) String() string
type BatchNormD ¶
type BatchNormD struct {
// contains filtered or unexported fields
}
BatchNormD is a gocudnn original. It exists to make the batchnorm operation work like the majority of cudnn descriptors.
func CreateBatchNormDescriptor ¶
func CreateBatchNormDescriptor() *BatchNormD
CreateBatchNormDescriptor creates a new BatchNormD
func (*BatchNormD) Backward ¶
func (b *BatchNormD) Backward(
	handle *Handle,
	alphadata, betadata, alphaparam, betaparam float64,
	xD *TensorD, x cutil.Mem,
	dyD *TensorD, dy cutil.Mem,
	dxD *TensorD, dx cutil.Mem,
	dBnScaleBiasDesc *TensorD,
	scale, dscale, dbias cutil.Mem,
	epsilon float64,
	savedMean, savedInvVariance cutil.Mem,
) error
Backward performs the backward pass of the Batch Normalization layer.

Outputs: dx (backprop data), dscale (training scale), dbias (training bias).
Scalars: alphadata, betadata, alphaparam, betaparam are blending factors: y = alpha * operation + beta * y.
Note: savedMean and savedInvVariance are cached results from the forward pass, if the layer saved them. They may be nil, but only if both are nil.
func (*BatchNormD) BackwardUS ¶
func (b *BatchNormD) BackwardUS( handle *Handle, alphadata, betadata, alphaparam, betaparam float64, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, dxD *TensorD, dx unsafe.Pointer, dBnScaleBiasDesc *TensorD, scale, dscale, dbias unsafe.Pointer, epsilon float64, savedMean, savedInvVariance unsafe.Pointer, ) error
BackwardUS is like Backward but uses unsafe.Pointers instead of cutil.Mem
func (*BatchNormD) DeriveBNTensorDescriptor ¶
func (b *BatchNormD) DeriveBNTensorDescriptor(xDesc *TensorD) (bndesc *TensorD, err error)
DeriveBNTensorDescriptor Derives a BN Tensor Descriptor from the one passed.
Derives a tensor descriptor from the layer data descriptor for the BatchNormalization scale, invVariance, bnBias, and bnScale tensors. Use this tensor descriptor for bnScaleBiasMeanVarDesc and bnScaleBiasDiffDesc in the batch normalization forward and backward functions.
func (*BatchNormD) ForwardInference ¶
func (b *BatchNormD) ForwardInference( handle *Handle, alpha, beta float64, xD *TensorD, x cutil.Mem, yD *TensorD, y cutil.Mem, ScaleBiasMeanVarDesc *TensorD, scale, bias, estimatedMean, estimatedVariance cutil.Mem, epsilon float64, ) error
ForwardInference info was pulled from the cuDNN documentation. This function performs the forward BatchNormalization layer computation for the inference phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.
Notes:
1)Only 4D and 5D tensors are supported.
2)The input transformation performed by this function is defined as: y := alpha*(bnScale * (x-estimatedMean)/sqrt(epsilon + estimatedVariance) + bnBias) + beta*y
3)The epsilon value has to be the same during training, backpropagation and inference.
4)For training phase use cudnnBatchNormalizationForwardTraining.
5)Much higher performance when HW-packed tensors are used for all of x, dy, dx.
Parameters:
handle (input): Handle to a previously created cuDNN library descriptor.
mode (input): Mode of operation (spatial or per-activation); see BatchNormMode.
alpha, beta (input): Scaling factors in host memory: y = alpha*result + beta*y.
xDesc (input), yDesc (input), x (input), y (output): Descriptors and pointers to the input and output memory.
bnScaleBiasMeanVarDesc, bnScaleData, bnBiasData (inputs): Tensor descriptor and pointers in device memory for the batch normalization scale and bias parameters.
estimatedMean, estimatedVariance (inputs): Mean and variance tensors (these have the same descriptor as the bias and scale). It is suggested that resultRunningMean and resultRunningVariance accumulated by the cudnnBatchNormalizationForwardTraining call during the training phase are passed as inputs here.
epsilon (input): Epsilon value used in the batch normalization formula. The minimum allowed value is given by the MinEpsilon() method. (It is now zero.)
Returns:
nil - The computation was performed successfully.
CUDNN_STATUS_NOT_SUPPORTED - The function does not support the provided configuration.
CUDNN_STATUS_BAD_PARAM - At least one of the following conditions is met:
1) One of the pointers alpha, beta, x, y, bnScaleData, bnBiasData, estimatedMean, estimatedInvVariance is NULL.
2) The number of xDesc or yDesc tensor descriptor dimensions is not within the [4,5] range.
3) bnScaleBiasMeanVarDesc dimensions are not 1xC(x1)x1x1 for spatial or 1xC(xD)xHxW for per-activation mode (parenthesized terms for 5D).
4) The epsilon value is less than CUDNN_BN_MIN_EPSILON.
5) Dimensions or data types mismatch for xDesc, yDesc.
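The input transformation in note 2 is easy to sanity-check on host floats. A worked example in pure Go (no cuDNN involved), taking alpha=1 and beta=0:

    package main

    import (
        "fmt"
        "math"
    )

    // bnInference applies the per-element inference transform from note 2
    // above with alpha=1, beta=0.
    func bnInference(x, scale, bias, mean, variance, epsilon float64) float64 {
        return scale*(x-mean)/math.Sqrt(epsilon+variance) + bias
    }

    func main() {
        // x=3 with running mean 1 and variance 4 normalizes to (3-1)/2 = 1;
        // scale=2 and bias=0.5 then give 2.5 (epsilon kept tiny for readability).
        fmt.Println(bnInference(3, 2, 0.5, 1, 4, 1e-9)) // ≈ 2.5
    }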
func (*BatchNormD) ForwardInferenceUS ¶
func (b *BatchNormD) ForwardInferenceUS( handle *Handle, alpha, beta float64, xD *TensorD, x unsafe.Pointer, yD *TensorD, y unsafe.Pointer, ScaleBiasMeanVarDesc *TensorD, scale, bias, estimatedMean, estimatedVariance unsafe.Pointer, epsilon float64, ) error
ForwardInferenceUS is like ForwardInference but uses unsafe.Pointers instead of cutil.Mems
func (*BatchNormD) ForwardTraining ¶
func (b *BatchNormD) ForwardTraining( handle *Handle, alpha float64, beta float64, xD *TensorD, x cutil.Mem, yD *TensorD, y cutil.Mem, bnScaleBiasMeanVar *TensorD, scale cutil.Mem, bias cutil.Mem, expAveFactor float64, resultrunningmean cutil.Mem, resultRunningVariance cutil.Mem, epsilon float64, resultSaveMean cutil.Mem, resultSaveInvVariance cutil.Mem, ) error
ForwardTraining info was pulled from the cuDNN documentation. This function performs the forward BatchNormalization layer computation for the training phase.
Notes:
1)Only 4D and 5D tensors are supported.
2)The epsilon value has to be the same during training, backpropagation and inference.
3)For inference phase use cudnnBatchNormalizationForwardInference.
4)Much higher performance for HW-packed tensors for both x and y.
Parameters:
handle: Handle to a previously created cuDNN library descriptor.
alpha, beta (inputs): Scaling factors: y = alpha*opresult + beta*y.
xD, yD, x, y: Tensor descriptors and pointers in device memory for the layer's x and y data.
bnScaleBiasMeanVar: Shared tensor descriptor for the six tensors below in the argument list. The dimensions of this tensor descriptor depend on the normalization mode.
scale, bias (inputs): Pointers in device memory for the batch normalization scale and bias parameters. Note: since bias isn't used during the backward pass, it can be shared with other batchnorm layers.
expAveFactor (input): Factor used in the moving average computation runningMean = newMean*factor + runningMean*(1-factor). Use factor = 1/(1+n) at the n-th call to the function to get Cumulative Moving Average (CMA) behavior CMA[n] = (x[1]+...+x[n])/n, since CMA[n+1] = (n*CMA[n]+x[n+1])/(n+1) = CMA[n]*(1-1/(n+1)) + x[n+1]*(1/(n+1)). (See the sketch after this entry.)
resultRunningMean, resultRunningVariance (input/output): Running mean and variance tensors (these have the same descriptor as the bias and scale). Both pointers can be NULL, but only at the same time. The value stored in resultRunningVariance (or passed as an input in inference mode) is the moving average of variance[x], where the variance is computed either over batch or spatial+batch dimensions depending on the mode. If these pointers are not NULL, the tensors should be initialized to some reasonable values or to 0.
epsilon: Epsilon value used in the batch normalization formula. The minimum allowed value is CUDNN_BN_MIN_EPSILON defined in cudnn.h. The same epsilon value should be used in the forward and backward functions.
resultSaveMean, resultSaveInvVariance (outputs): Optional cache of intermediate results computed during the forward pass; these can be reused to speed up the backward pass. For this to work correctly, the bottom layer data has to remain unchanged until the backward function is called. Both pointers can be NULL, but only at the same time. Using this cache is recommended since its memory overhead is relatively small: these tensors have a much lower product of dimensions than the data tensors.
Returns:
nil - The computation was performed successfully.
CUDNN_STATUS_NOT_SUPPORTED - The function does not support the provided configuration.
CUDNN_STATUS_BAD_PARAM - At least one of the following conditions is met:
1) One of the pointers alpha, beta, x, y, bnScaleData, bnBiasData is NULL.
2) The number of xDesc or yDesc tensor descriptor dimensions is not within the [4,5] range.
3) bnScaleBiasMeanVarDesc dimensions are not 1xC(x1)x1x1 for spatial or 1xC(xD)xHxW for per-activation mode (parenthesized terms for 5D).
4) Exactly one of the resultSaveMean, resultSaveInvVariance pointers is NULL.
5) Exactly one of the resultRunningMean, resultRunningInvVariance pointers is NULL.
6) The epsilon value is less than MinEpsilon().
7) Dimensions or data types mismatch for xDesc, yDesc.
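This is the sketch referenced in the expAveFactor bullet above: a pure-Go check that factor = 1/(1+n), counting calls from n=0, reproduces the cumulative moving average.

    package main

    import "fmt"

    func main() {
        xs := []float64{2, 4, 6, 8}
        running := 0.0
        for n, x := range xs {
            factor := 1.0 / float64(1+n) // factor = 1/(1+n) on the n-th call
            running = x*factor + running*(1-factor)
        }
        fmt.Println(running) // 5, the plain mean of 2, 4, 6 and 8
    }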
func (*BatchNormD) ForwardTrainingUS ¶
func (b *BatchNormD) ForwardTrainingUS( handle *Handle, alpha float64, beta float64, xD *TensorD, x unsafe.Pointer, yD *TensorD, y unsafe.Pointer, bnScaleBiasMeanVar *TensorD, scale unsafe.Pointer, bias unsafe.Pointer, expAveFactor float64, resultrunningmean unsafe.Pointer, resultRunningVariance unsafe.Pointer, epsilon float64, resultSaveMean unsafe.Pointer, resultSaveInvVariance unsafe.Pointer, ) error
ForwardTrainingUS is just like ForwardTraining but uses unsafe.Pointers.
func (*BatchNormD) Get ¶
func (b *BatchNormD) Get() (mode BatchNormMode, err error)
Get gets the BatchNormMode stored in the descriptor
func (*BatchNormD) MinEpsilon ¶
func (b *BatchNormD) MinEpsilon() float64
MinEpsilon is the minimum epsilon allowed. It is now zero, but it used to be 1e-5.
func (*BatchNormD) Set ¶
func (b *BatchNormD) Set(mode BatchNormMode) error
Set sets the values used in the batchnorm descriptor
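A minimal sketch tying the BatchNormD pieces together; it assumes the gocudnn package is imported and that xDesc describes the layer's 4D input:

    // Sketch: create the descriptor, choose a mode, and derive the shared
    // scale/bias/mean/variance descriptor from the layer's input descriptor.
    func setupBatchNorm(xDesc *gocudnn.TensorD) (*gocudnn.BatchNormD, *gocudnn.TensorD, error) {
        b := gocudnn.CreateBatchNormDescriptor()
        var mode gocudnn.BatchNormMode
        if err := b.Set(mode.Spatial()); err != nil { // spatial mode for conv layers
            return nil, nil, err
        }
        bnDesc, err := b.DeriveBNTensorDescriptor(xDesc)
        if err != nil {
            return nil, nil, err
        }
        return b, bnDesc, nil
    }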
func (*BatchNormD) String ¶
func (b *BatchNormD) String() string
type BatchNormDEx ¶
type BatchNormDEx struct {
// contains filtered or unexported fields
}
BatchNormDEx is a gocudnn original. It wraps the extended (Ex) batch normalization functions in a descriptor so that the operation is used like the majority of cuDNN operations.
func CreateBatchNormDescriptorEx ¶
func CreateBatchNormDescriptorEx() *BatchNormDEx
CreateBatchNormDescriptorEx creates a new BatchNormDEx
func (*BatchNormDEx) Backward ¶
func (b *BatchNormDEx) Backward( h *Handle, alphadata, betadata, alphaparam, betaparam float64, xD *TensorD, x cutil.Mem, yD *TensorD, y cutil.Mem, dyD *TensorD, dy cutil.Mem, dzD *TensorD, dz cutil.Mem, dxD *TensorD, dx cutil.Mem, dbnScaleBiasMeanVarDesc *TensorD, scale cutil.Mem, bias cutil.Mem, dscale cutil.Mem, dbias cutil.Mem, epsilon float64, fromresultSaveMean cutil.Mem, fromreslutSaveInVariance cutil.Mem, actD *ActivationD, wspace cutil.Mem, wspacesib uint, rspace cutil.Mem, rspacesib uint, ) error
Backward performs the backward pass of the batch normalization Ex operation.
func (*BatchNormDEx) BackwardUS ¶
func (b *BatchNormDEx) BackwardUS( h *Handle, alphadata, betadata, alphaparam, betaparam float64, xD *TensorD, x unsafe.Pointer, yD *TensorD, y unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, dzD *TensorD, dz unsafe.Pointer, dxD *TensorD, dx unsafe.Pointer, dbnScaleBiasMeanVarDesc *TensorD, scale unsafe.Pointer, bias unsafe.Pointer, dscale unsafe.Pointer, dbias unsafe.Pointer, epsilon float64, fromresultSaveMean unsafe.Pointer, fromreslutSaveInVariance unsafe.Pointer, actD *ActivationD, wspace unsafe.Pointer, wspacesib uint, rspace unsafe.Pointer, rspacesib uint, ) error
BackwardUS is just like Backward but uses unsafe.Pointers instead of cutil.Mem.
func (*BatchNormDEx) DeriveBNTensorDescriptor ¶
func (b *BatchNormDEx) DeriveBNTensorDescriptor(xDesc *TensorD) (bndesc *TensorD, err error)
DeriveBNTensorDescriptor derives a tensor descriptor used for the batch norm operation
func (*BatchNormDEx) ForwardInference ¶
func (b *BatchNormDEx) ForwardInference( handle *Handle, alpha, beta float64, xD *TensorD, x cutil.Mem, yD *TensorD, y cutil.Mem, ScaleBiasMeanVarDesc *TensorD, scale, bias, estimatedMean, estimatedVariance cutil.Mem, epsilon float64, ) error
ForwardInference info was pulled from the cuDNN documentation.
This function performs the forward BatchNormalization layer computation for the inference phase. This layer is based on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", S. Ioffe, C. Szegedy, 2015.
Notes:
1)Only 4D and 5D tensors are supported.
2)The input transformation performed by this function is defined as: y := alpha*(bnScale * (x-estimatedMean)/sqrt(epsilon + estimatedVariance) + bnBias) + beta*y
3)The epsilon value has to be the same during training, backpropagation and inference.
4)For training phase use cudnnBatchNormalizationForwardTraining.
5)Much higher performance when HW-packed tensors are used for all of x, dy, dx.
Parameters:
handle(input): Handle to a previously created cuDNN library descriptor.
mode(input): Mode of operation (spatial or per-activation). BatchNormMode
alpha, beta (input): Pointers to scaling factors (in host memory) used to blend the layer output value with prior value in the destination tensor as follows:
dstValue = alpha[0]*resultValue + beta[0]*priorDstValue.
xDesc, yDesc, x, y: Tensor descriptors and pointers in device memory for the layer's x and y data.
bnScaleBiasMeanVarDesc, bnScaleData, bnBiasData(inputs): Tensor descriptor and pointers in device memory for the batch normalization scale and bias parameters
(in the original paper bias is referred to as beta and scale as gamma).
estimatedMean, estimatedVariance (inputs): Mean and variance tensors (these have the same descriptor as the bias and scale).
It is suggested that resultRunningMean, resultRunningVariance from the cudnnBatchNormalizationForwardTraining call accumulated during the training phase are passed as inputs here.
epsilon(input): Epsilon value used in the batch normalization formula. Minimum allowed value is CUDNN_BN_MIN_EPSILON defined in cudnn.h.
Possible error values returned by this function and their meanings are listed below.
Returns ¶
CUDNN_STATUS_SUCCESS
The computation was performed successfully.
CUDNN_STATUS_NOT_SUPPORTED
The function does not support the provided configuration.
CUDNN_STATUS_BAD_PARAM
At least one of the following conditions is met:
1) One of the pointers alpha, beta, x, y, bnScaleData, bnBiasData, estimatedMean, estimatedInvVariance is NULL.
2) The number of xDesc or yDesc tensor descriptor dimensions is not within the [4,5] range.
3) bnScaleBiasMeanVarDesc dimensions are not 1xC(x1)x1x1 for spatial or 1xC(xD)xHxW for per-activation mode (parenthesized terms for 5D).
4) The epsilon value is less than CUDNN_BN_MIN_EPSILON.
5) Dimensions or data types mismatch for xDesc, yDesc.
Performs batch normalization during inference: y[i] = bnScale[k]*(x[i]-estimatedMean[k])/sqrt(epsilon+estimatedVariance[k]) + bnBias[k], with the bnScale, bnBias, runningMean, and runningInvVariance tensors indexed according to spatial or per-activation mode. Refer to cudnnBatchNormalizationForwardTraining above for notes on function arguments.
func (*BatchNormDEx) ForwardInferenceUS ¶
func (b *BatchNormDEx) ForwardInferenceUS( handle *Handle, alpha, beta float64, xD *TensorD, x unsafe.Pointer, yD *TensorD, y unsafe.Pointer, ScaleBiasMeanVarDesc *TensorD, scale, bias, estimatedMean, estimatedVariance unsafe.Pointer, epsilon float64, ) error
ForwardInferenceUS is just like ForwardInference but uses unsafe.Pointers instead of cutil.Mem
func (*BatchNormDEx) ForwardTraining ¶
func (b *BatchNormDEx) ForwardTraining( h *Handle, alpha, beta float64, xD *TensorD, x cutil.Mem, zD *TensorD, z cutil.Mem, yD *TensorD, y cutil.Mem, bnScaleBiasMeanVarDesc *TensorD, scale cutil.Mem, bias cutil.Mem, expoAverageFactor float64, resultRunningMean cutil.Mem, resultRunningVariance cutil.Mem, epsilon float64, resultSaveMean cutil.Mem, reslutSaveInVariance cutil.Mem, actD *ActivationD, wspace cutil.Mem, wspacesib uint, rspace cutil.Mem, rspacesib uint, ) error
ForwardTraining performs the forward-training pass of the batch normalization Ex operation.
func (*BatchNormDEx) ForwardTrainingUS ¶
func (b *BatchNormDEx) ForwardTrainingUS( h *Handle, alpha, beta float64, xD *TensorD, x unsafe.Pointer, zD *TensorD, z unsafe.Pointer, yD *TensorD, y unsafe.Pointer, bnScaleBiasMeanVarDesc *TensorD, scale unsafe.Pointer, bias unsafe.Pointer, expoAverageFactor float64, resultRunningMean unsafe.Pointer, resultRunningVariance unsafe.Pointer, epsilon float64, resultSaveMean unsafe.Pointer, reslutSaveInVariance unsafe.Pointer, actD *ActivationD, wspace unsafe.Pointer, wspacesib uint, rspace unsafe.Pointer, rspacesib uint, ) error
ForwardTrainingUS is like ForwardTraining but uses unsafe.Pointers instead of cutil.Mem
func (*BatchNormDEx) GeBackwardWorkspaceSize ¶
func (b *BatchNormDEx) GeBackwardWorkspaceSize( h *Handle, xD, yD, dyD, dzD, dxD, dbnScaleBiasMeanVarDesc *TensorD, actD *ActivationD, ) (wspaceSIB uint, err error)
GeBackwardWorkspaceSize gets the workspace size in bytes for the backward operation
func (*BatchNormDEx) Get ¶
func (b *BatchNormDEx) Get() (mode BatchNormMode, op BatchNormOps, err error)
Get gets the BatchNormMode and BatchNormOps held in the descriptor
func (*BatchNormDEx) GetForwardTrainingWorkspaceSize ¶
func (b *BatchNormDEx) GetForwardTrainingWorkspaceSize(h *Handle, mode BatchNormMode, op BatchNormOps, xD, zD, yD, bnScaleBiasMeanVarDesc *TensorD, actD *ActivationD) (wspaceSIB uint, err error)
GetForwardTrainingWorkspaceSize gets the forward training Ex workspace size
func (*BatchNormDEx) GetTrainingReserveSpaceSize ¶
func (b *BatchNormDEx) GetTrainingReserveSpaceSize(h *Handle, actD *ActivationD, xD *TensorD, ) (rspaceSIB uint, err error)
GetTrainingReserveSpaceSize gets the reserve space size for the Ex operation
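A sketch of the usual sizing sequence before ForwardTraining; it assumes the gocudnn and cutil packages are imported and that allocDevice is a hypothetical GPU allocator supplied by the application:

    // Sketch: size, then allocate, the workspace and reserve space. The rspace
    // handed to ForwardTraining must later be passed unchanged to Backward.
    func sizeExBuffers(h *gocudnn.Handle, b *gocudnn.BatchNormDEx,
        mode gocudnn.BatchNormMode, op gocudnn.BatchNormOps,
        xD, zD, yD, bnDesc *gocudnn.TensorD, actD *gocudnn.ActivationD,
        allocDevice func(uint) cutil.Mem) (wspace, rspace cutil.Mem, wSIB, rSIB uint, err error) {
        wSIB, err = b.GetForwardTrainingWorkspaceSize(h, mode, op, xD, zD, yD, bnDesc, actD)
        if err != nil {
            return nil, nil, 0, 0, err
        }
        rSIB, err = b.GetTrainingReserveSpaceSize(h, actD, xD)
        if err != nil {
            return nil, nil, 0, 0, err
        }
        return allocDevice(wSIB), allocDevice(rSIB), wSIB, rSIB, nil
    }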
func (*BatchNormDEx) MinEpsilon ¶
func (b *BatchNormDEx) MinEpsilon() float64
MinEpsilon is the minimum epsilon allowed. It is now zero, but it used to be 1e-5.
func (*BatchNormDEx) Set ¶
func (b *BatchNormDEx) Set(mode BatchNormMode, op BatchNormOps) error
Set sets the BatchNormMode and BatchNormOps held in the descriptor
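A minimal sketch of configuring the Ex descriptor with the flag types documented below (gocudnn import assumed):

    // Sketch: spatial batch norm fused with element-wise add and activation.
    func setupBNEx() (*gocudnn.BatchNormDEx, error) {
        var mode gocudnn.BatchNormMode
        var op gocudnn.BatchNormOps
        b := gocudnn.CreateBatchNormDescriptorEx()
        if err := b.Set(mode.Spatial(), op.AddActivation()); err != nil {
            return nil, err
        }
        return b, nil
    }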
func (*BatchNormDEx) String ¶
func (b *BatchNormDEx) String() string
type BatchNormMode ¶
type BatchNormMode C.cudnnBatchNormMode_t
BatchNormMode is used for batch normalization mode flags
func (*BatchNormMode) PerActivation ¶
func (b *BatchNormMode) PerActivation() BatchNormMode
PerActivation sets b to BatchNormMode(C.CUDNN_BATCHNORM_PER_ACTIVATION) and returns that new value. Normalization is performed per-activation. This mode is intended to be used after non-convolutional network layers. In this mode, the tensor dimensions of bnBias and bnScale and the parameters used in the cudnnBatchNormalization* functions are 1xCxHxW.
func (*BatchNormMode) Spatial ¶
func (b *BatchNormMode) Spatial() BatchNormMode
Spatial sets b to BatchNormMode(C.CUDNN_BATCHNORM_SPATIAL) and returns that new value. Normalization is performed over N+spatial dimensions. This mode is intended for use after convolutional layers (where spatial invariance is desired). In this mode the bnBias and bnScale tensor dimensions are 1xCx1x1.
func (*BatchNormMode) SpatialPersistent ¶
func (b *BatchNormMode) SpatialPersistent() BatchNormMode
SpatialPersistent sets b to BatchNormMode(C.CUDNN_BATCHNORM_SPATIAL_PERSISTENT) and returns that new value. This mode is similar to CUDNN_BATCHNORM_SPATIAL but can be faster for some tasks.
func (BatchNormMode) String ¶
func (b BatchNormMode) String() string
type BatchNormOps ¶
type BatchNormOps C.cudnnBatchNormOps_t
BatchNormOps are flags for the batch normalization operations, used where needed
func (*BatchNormOps) Activation ¶
func (b *BatchNormOps) Activation() BatchNormOps
Activation sets b to BatchNormOps(C.CUDNN_BATCHNORM_OPS_BN_ACTIVATION) and returns that new value /* do batchNorm, then activation */
func (*BatchNormOps) AddActivation ¶
func (b *BatchNormOps) AddActivation() BatchNormOps
AddActivation sets b to BatchNormOps(C.CUDNN_BATCHNORM_OPS_BN_ADD_ACTIVATION) and returns that new value /* do batchNorm, then elemWiseAdd, then activation */
func (*BatchNormOps) Normal ¶
func (b *BatchNormOps) Normal() BatchNormOps
Normal sets b to BatchNormOps(C.CUDNN_BATCHNORM_OPS_BN) and returns that new value /* do batch normalization only */
func (BatchNormOps) String ¶
func (b BatchNormOps) String() string
type CTCLossAlgo ¶
type CTCLossAlgo C.cudnnCTCLossAlgo_t
CTCLossAlgo is used to hold CTC loss algorithm flags
func (*CTCLossAlgo) Deterministic ¶
func (c *CTCLossAlgo) Deterministic() CTCLossAlgo
Deterministic sets c to CTCLossAlgo(C.CUDNN_CTC_LOSS_ALGO_DETERMINISTIC) and returns that value
func (*CTCLossAlgo) NonDeterministic ¶
func (c *CTCLossAlgo) NonDeterministic() CTCLossAlgo
NonDeterministic sets c to CTCLossAlgo(C.CUDNN_CTC_LOSS_ALGO_NON_DETERMINISTIC) and returns that value
func (CTCLossAlgo) String ¶
func (c CTCLossAlgo) String() string
type CTCLossD ¶
type CTCLossD struct {
// contains filtered or unexported fields
}
CTCLossD holds the C.cudnnCTCLossDescriptor_t
func CreateCTCLossDescriptor ¶
CreateCTCLossDescriptor creates a new CTCLossD
func (*CTCLossD) CTCLoss ¶
func (c *CTCLossD) CTCLoss( handle *Handle, probsD *TensorD, probs cutil.Mem, labels []int32, labelLengths []int32, inputLengths []int32, costs cutil.Mem, gradientsD *TensorD, gradients cutil.Mem, algo CTCLossAlgo, wspace cutil.Mem, wspacesize uint, ) error
CTCLoss calculates the connectionist temporal classification loss
func (*CTCLossD) CTCLossUS ¶
func (c *CTCLossD) CTCLossUS( handle *Handle, probsD *TensorD, probs unsafe.Pointer, labels []int32, labelLengths []int32, inputLengths []int32, costs unsafe.Pointer, gradientsD *TensorD, gradients unsafe.Pointer, algo CTCLossAlgo, wspace unsafe.Pointer, wspacesize uint, ) error
CTCLossUS is like CTCLoss but uses unsafe.Pointer instead of cutil.Mem
func (*CTCLossD) Destroy ¶
Destroy destroys the descriptor inside CTCLossD if Go's garbage collector is not in use. If the garbage collector is being used, Destroy will just return nil.
type ConvBwdDataAlgo ¶
type ConvBwdDataAlgo C.cudnnConvolutionBwdDataAlgo_t
ConvBwdDataAlgo is used for flags for the backward data algorithms, exposing them through methods
func (ConvBwdDataAlgo) Algo ¶
func (c ConvBwdDataAlgo) Algo() Algorithm
Algo returns an Algorithm struct
func (*ConvBwdDataAlgo) Algo0 ¶
func (c *ConvBwdDataAlgo) Algo0() ConvBwdDataAlgo
Algo0 sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_0) and returns value of c /* non-deterministic */
func (*ConvBwdDataAlgo) Algo1 ¶
func (c *ConvBwdDataAlgo) Algo1() ConvBwdDataAlgo
Algo1 sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_1) and returns value of c
func (*ConvBwdDataAlgo) Count ¶
func (c *ConvBwdDataAlgo) Count() ConvBwdDataAlgo
Count sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_COUNT) and returns value of c
func (*ConvBwdDataAlgo) FFT ¶
func (c *ConvBwdDataAlgo) FFT() ConvBwdDataAlgo
FFT sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT) and returns value of c
func (*ConvBwdDataAlgo) FFTTiling ¶
func (c *ConvBwdDataAlgo) FFTTiling() ConvBwdDataAlgo
FFTTiling sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING) and returns value of c
func (ConvBwdDataAlgo) String ¶
func (c ConvBwdDataAlgo) String() string
func (*ConvBwdDataAlgo) Winograd ¶
func (c *ConvBwdDataAlgo) Winograd() ConvBwdDataAlgo
Winograd sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD) and returns value of c
func (*ConvBwdDataAlgo) WinogradNonFused ¶
func (c *ConvBwdDataAlgo) WinogradNonFused() ConvBwdDataAlgo
WinogradNonFused sets c to ConvBwdDataAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED) and returns value of c
type ConvBwdDataAlgoPerformance ¶
type ConvBwdDataAlgoPerformance struct {
    Algo        ConvBwdDataAlgo `json:"algo,omitempty"`
    Status      Status          `json:"status,omitempty"`
    Time        float32         `json:"time,omitempty"`
    Memory      uint            `json:"memory,omitempty"`
    Determinism Determinism     `json:"determinism,omitempty"`
    MathType    MathType        `json:"math_type,omitempty"`
}
ConvBwdDataAlgoPerformance is the struct returned by the algorithm-finding funcs
func (ConvBwdDataAlgoPerformance) String ¶
func (cb ConvBwdDataAlgoPerformance) String() string
type ConvBwdDataPref ¶
type ConvBwdDataPref C.cudnnConvolutionBwdDataPreference_t
ConvBwdDataPref is used for backward data preference flags, exposing them through methods
func (*ConvBwdDataPref) NoWorkSpace ¶
func (c *ConvBwdDataPref) NoWorkSpace() ConvBwdDataPref
NoWorkSpace sets c to ConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns value of c
func (*ConvBwdDataPref) PreferFastest ¶
func (c *ConvBwdDataPref) PreferFastest() ConvBwdDataPref
PreferFastest sets c to ConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns value of c
func (*ConvBwdDataPref) SpecifyWorkSpaceLimit ¶
func (c *ConvBwdDataPref) SpecifyWorkSpaceLimit() ConvBwdDataPref
SpecifyWorkSpaceLimit sets c to ConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT) and returns value of c
type ConvBwdFiltAlgo ¶
type ConvBwdFiltAlgo C.cudnnConvolutionBwdFilterAlgo_t
ConvBwdFiltAlgo is used for backward filter algorithm flags, exposing them through methods
func (ConvBwdFiltAlgo) Algo ¶
func (c ConvBwdFiltAlgo) Algo() Algorithm
Algo returns an Algorithm Struct
func (*ConvBwdFiltAlgo) Algo0 ¶
func (c *ConvBwdFiltAlgo) Algo0() ConvBwdFiltAlgo
Algo0 sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0) and returns value of c /* non-deterministic */
func (*ConvBwdFiltAlgo) Algo1 ¶
func (c *ConvBwdFiltAlgo) Algo1() ConvBwdFiltAlgo
Algo1 sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1) and returns value of c
func (*ConvBwdFiltAlgo) Algo3 ¶
func (c *ConvBwdFiltAlgo) Algo3() ConvBwdFiltAlgo
Algo3 sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3) and returns value of c
func (*ConvBwdFiltAlgo) Count ¶
func (c *ConvBwdFiltAlgo) Count() ConvBwdFiltAlgo
Count sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_COUNT) and returns value of c
func (*ConvBwdFiltAlgo) FFT ¶
func (c *ConvBwdFiltAlgo) FFT() ConvBwdFiltAlgo
FFT sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT) and returns value of c
func (*ConvBwdFiltAlgo) FFTTiling ¶
func (c *ConvBwdFiltAlgo) FFTTiling() ConvBwdFiltAlgo
FFTTiling sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT_TILING) and returns value of c
func (ConvBwdFiltAlgo) String ¶
func (c ConvBwdFiltAlgo) String() string
func (*ConvBwdFiltAlgo) Winograd ¶
func (c *ConvBwdFiltAlgo) Winograd() ConvBwdFiltAlgo
Winograd sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD) and returns value of c
func (*ConvBwdFiltAlgo) WinogradNonFused ¶
func (c *ConvBwdFiltAlgo) WinogradNonFused() ConvBwdFiltAlgo
WinogradNonFused sets c to ConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD_NONFUSED) and returns value of c
type ConvBwdFiltAlgoPerformance ¶
type ConvBwdFiltAlgoPerformance struct {
    Algo        ConvBwdFiltAlgo `json:"algo,omitempty"`
    Status      Status          `json:"status,omitempty"`
    Time        float32         `json:"time,omitempty"`
    Memory      uint            `json:"memory,omitempty"`
    Determinism Determinism     `json:"determinism,omitempty"`
    MathType    MathType        `json:"math_type,omitempty"`
}
ConvBwdFiltAlgoPerformance is the struct returned by the algorithm-finding funcs
func (ConvBwdFiltAlgoPerformance) String ¶
func (cb ConvBwdFiltAlgoPerformance) String() string
type ConvBwdFilterPref ¶
type ConvBwdFilterPref C.cudnnConvolutionBwdFilterPreference_t
ConvBwdFilterPref is used for flags for the backward filters, exposing them through methods
func (*ConvBwdFilterPref) NoWorkSpace ¶
func (c *ConvBwdFilterPref) NoWorkSpace() ConvBwdFilterPref
NoWorkSpace sets c to ConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE) and returns value of c
func (*ConvBwdFilterPref) PreferFastest ¶
func (c *ConvBwdFilterPref) PreferFastest() ConvBwdFilterPref
PreferFastest sets c to ConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST) and returns value of c
func (*ConvBwdFilterPref) SpecifyWorkSpaceLimit ¶
func (c *ConvBwdFilterPref) SpecifyWorkSpaceLimit() ConvBwdFilterPref
SpecifyWorkSpaceLimit sets c to ConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT) and returns value of c
type ConvFwdAlgo ¶
type ConvFwdAlgo C.cudnnConvolutionFwdAlgo_t
ConvFwdAlgo is used for cudnnConvolutionFwdAlgo_t flags, exposing them through methods
func (*ConvFwdAlgo) Count ¶
func (c *ConvFwdAlgo) Count() ConvFwdAlgo
Count sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_COUNT) and returns value of c
func (*ConvFwdAlgo) Direct ¶
func (c *ConvFwdAlgo) Direct() ConvFwdAlgo
Direct sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_DIRECT) and returns value of c
func (*ConvFwdAlgo) FFT ¶
func (c *ConvFwdAlgo) FFT() ConvFwdAlgo
FFT sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT) and returns value of c
func (*ConvFwdAlgo) FFTTiling ¶
func (c *ConvFwdAlgo) FFTTiling() ConvFwdAlgo
FFTTiling sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING) and returns value of c
func (*ConvFwdAlgo) Gemm ¶
func (c *ConvFwdAlgo) Gemm() ConvFwdAlgo
Gemm sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_GEMM) and returns value of c
func (*ConvFwdAlgo) ImplicitGemm ¶
func (c *ConvFwdAlgo) ImplicitGemm() ConvFwdAlgo
ImplicitGemm sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM) and returns value of c
func (*ConvFwdAlgo) ImplicitPrecompGemm ¶
func (c *ConvFwdAlgo) ImplicitPrecompGemm() ConvFwdAlgo
ImplicitPrecompGemm sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM) and returns value of c
func (ConvFwdAlgo) String ¶
func (c ConvFwdAlgo) String() string
func (*ConvFwdAlgo) WinoGrad ¶
func (c *ConvFwdAlgo) WinoGrad() ConvFwdAlgo
WinoGrad sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD) and returns value of c
func (*ConvFwdAlgo) WinoGradNonFused ¶
func (c *ConvFwdAlgo) WinoGradNonFused() ConvFwdAlgo
WinoGradNonFused sets c to ConvFwdAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED) and returns value of c
type ConvFwdAlgoPerformance ¶
type ConvFwdAlgoPerformance struct {
    Algo        ConvFwdAlgo `json:"algo,omitempty"`
    Status      Status      `json:"status,omitempty"`
    Time        float32     `json:"time,omitempty"`
    Memory      uint        `json:"memory,omitempty"`
    Determinism Determinism `json:"determinism,omitempty"`
    MathType    MathType    `json:"math_type,omitempty"`
}
ConvFwdAlgoPerformance is a struct that holds the performance of the algorithm
func (ConvFwdAlgoPerformance) String ¶
func (cb ConvFwdAlgoPerformance) String() string
type ConvolutionD ¶
type ConvolutionD struct {
// contains filtered or unexported fields
}
ConvolutionD holds all the convolution info
func CreateConvolutionDescriptor ¶
func CreateConvolutionDescriptor() (*ConvolutionD, error)
CreateConvolutionDescriptor creates a convolution descriptor
func (*ConvolutionD) BackwardBias ¶
func (c *ConvolutionD) BackwardBias( handle *Handle, alpha float64, dyD *TensorD, dy cutil.Mem, beta float64, dbD *TensorD, db cutil.Mem) error
BackwardBias is used to compute the bias gradient for batch convolution. db is returned.
func (*ConvolutionD) BackwardBiasUS ¶
func (c *ConvolutionD) BackwardBiasUS( handle *Handle, alpha float64, dyD *TensorD, dy unsafe.Pointer, beta float64, dbD *TensorD, db unsafe.Pointer) error
BackwardBiasUS is like BackwardBias but using unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) BackwardData ¶
func (c *ConvolutionD) BackwardData( handle *Handle, alpha float64, wD *FilterD, w cutil.Mem, dyD *TensorD, dy cutil.Mem, algo ConvBwdDataAlgo, wspace cutil.Mem, wspaceSIB uint, beta float64, dxD *TensorD, dx cutil.Mem, ) error
BackwardData does the backwards convolution on data
This function computes the convolution data gradient of the tensor dy, where y is the output of the forward convolution in (*ConvolutionD)Forward(). It uses the specified algo, and returns the results in the output tensor dx. Scaling factors alpha and beta can be used to scale the computed result or accumulate with the current dx.
Parameters:
handle (input): Previously created Handle.
alpha, beta (input): Scaling factors (in host memory) used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
wD (input): Previously set filter descriptor.
w (input): Data pointer to GPU memory associated with the filter descriptor wD.
dyD (input): Previously set input tensor descriptor of dy.
dy (input): Data pointer to GPU memory associated with the input tensor descriptor (holds the backpropagated errors).
algo (input): Enumerant that specifies which backward data convolution algorithm should be used to compute the results.
wspace, wspaceSIB (inputs): Data pointer and size in bytes of the workspace needed for the algo passed. If no workspace is needed, nil can be passed.
dxD (input): Previously set output tensor descriptor of dx.
dx (input/output): Data pointer to GPU memory associated with the output tensor descriptor (holds the backpropagated errors for the layer that fed this layer its forward inputs).
Supported Configurations
Config "TRUE_HALF_CONFIG" (only compute capability 5.3 and later): TensorD (wD, dyD, dxD): (*DataType)Half(); ConvolutionD: (*DataType)Half()
Config "PSEUDO_HALF_CONFIG": TensorD (wD, dyD, dxD): (*DataType)Half(); ConvolutionD: (*DataType)Float()
Config "FLOAT_CONFIG": TensorD (wD, dyD, dxD): (*DataType)Float(); ConvolutionD: (*DataType)Float()
Config "DOUBLE_CONFIG": TensorD (wD, dyD, dxD): (*DataType)Double(); ConvolutionD: (*DataType)Double()
Note: Specifying a separate algorithm can cause changes in performance, support and computation determinism.
A table of algorithms and their supported configs can be found at the link below (gocudnn flag names are similar to cudnn's):
https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnConvolutionBackwardData
Possible Error Returns:
nil: The function launched successfully.
CUDNN_STATUS_NOT_SUPPORTED: At least one of the following conditions is met:
1) dyD or dxD have negative tensor striding
2) dyD, wD or dxD has a number of dimensions that is not 4 or 5
3) The chosen algo does not support the parameters provided; see above for an exhaustive list of parameter support for each algo
4) dyD or wD indicate an output channel count that isn't a multiple of the group count (if the group count has been set in ConvolutionD).
CUDNN_STATUS_BAD_PARAM: At least one of the following conditions is met:
1) At least one of the following is NULL: handle, dyD, wD, ConvolutionD, dxD, dy, w, dx, alpha, beta
2) wD and dyD have a non-matching number of dimensions
3) wD and dxD have a non-matching number of dimensions
4) wD has fewer than three dimensions
5) wD, dxD and dyD have a non-matching data type.
6) wD and dxD have a non-matching number of input feature maps per image (or group, in the case of grouped convolutions).
7) dyD's spatial sizes do not match the expected size as determined by (*ConvolutionD)GetOutputDims().
CUDNN_STATUS_MAPPING_ERROR: An error occurred during the texture binding of the filter data or the input differential tensor data.
CUDNN_STATUS_EXECUTION_FAILED: The function failed to launch on the GPU.
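Putting the pieces together, a sketch of the usual call sequence. The descriptors, device memory, and chosen algo are assumed to come from surrounding code, allocDevice is a hypothetical GPU allocator, and the gocudnn and cutil imports are assumed:

    // Sketch: size and allocate the workspace for a chosen algo, then run the
    // backward data pass. alpha=1, beta=1 accumulates the new gradient into dx.
    func backwardData(h *gocudnn.Handle, c *gocudnn.ConvolutionD,
        wD *gocudnn.FilterD, w cutil.Mem, dyD *gocudnn.TensorD, dy cutil.Mem,
        dxD *gocudnn.TensorD, dx cutil.Mem, algo gocudnn.ConvBwdDataAlgo,
        allocDevice func(uint) cutil.Mem) error {
        wspaceSIB, err := c.GetBackwardDataWorkspaceSize(h, wD, dyD, dxD, algo)
        if err != nil {
            return err
        }
        wspace := allocDevice(wspaceSIB)
        return c.BackwardData(h, 1, wD, w, dyD, dy, algo, wspace, wspaceSIB, 1, dxD, dx)
    }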
func (*ConvolutionD) BackwardDataUS ¶
func (c *ConvolutionD) BackwardDataUS( handle *Handle, alpha float64, wD *FilterD, w unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, algo ConvBwdDataAlgo, wspace unsafe.Pointer, wspacesize uint, beta float64, dxD *TensorD, dx unsafe.Pointer, ) error
BackwardDataUS is like BackwardData but uses unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) BackwardFilter ¶
func (c *ConvolutionD) BackwardFilter( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, dyD *TensorD, dy cutil.Mem, algo ConvBwdFiltAlgo, wspace cutil.Mem, wspacesize uint, beta float64, dwD *FilterD, dw cutil.Mem, ) error
BackwardFilter does the backward convolution pass with respect to the filter weights, producing dw
func (*ConvolutionD) BackwardFilterUS ¶
func (c *ConvolutionD) BackwardFilterUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, algo ConvBwdFiltAlgo, wspace unsafe.Pointer, wspacesize uint, beta float64, dwD *FilterD, dw unsafe.Pointer, ) error
BackwardFilterUS is like BackwardFilter but using unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) BiasActivationForward ¶
func (c *ConvolutionD) BiasActivationForward( handle *Handle, alpha1 float64, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, algo ConvFwdAlgo, wspace cutil.Mem, wspacesize uint, alpha2 float64, zD *TensorD, z cutil.Mem, biasD *TensorD, bias cutil.Mem, aD *ActivationD, yD *TensorD, y cutil.Mem, ) error
BiasActivationForward performs the fused conv/bias/activation operation:
y = Act( alpha1 * conv(x) + alpha2 * z + bias )
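A sketch of the fused call, as a fragment: with alpha2 = 0 the z term drops out and the operation reduces to y = Act(alpha1*conv(x) + bias). All descriptors and device pointers are assumed to come from surrounding code:

    // Sketch: fused convolution + bias + activation, with the z input disabled.
    err := c.BiasActivationForward(h,
        1, xD, x, wD, w, algo, wspace, wspaceSIB, // convolution term, alpha1 = 1
        0, zD, z, // z term, alpha2 = 0
        biasD, bias, actD, yD, y)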
func (*ConvolutionD) BiasActivationForwardUS ¶
func (c *ConvolutionD) BiasActivationForwardUS( handle *Handle, alpha1 float64, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, algo ConvFwdAlgo, wspace unsafe.Pointer, wspacesize uint, alpha2 float64, zD *TensorD, z unsafe.Pointer, biasD *TensorD, bias unsafe.Pointer, aD *ActivationD, yD *TensorD, y unsafe.Pointer, ) error
BiasActivationForwardUS is like BiasActivationForward but using unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) Destroy ¶
func (c *ConvolutionD) Destroy() error
Destroy destroys the ConvolutionDescriptor. If GC is set, then it only returns nil. Currently GC is set with no option to turn it off.
func (*ConvolutionD) FindBackwardDataAlgorithm ¶
func (c *ConvolutionD) FindBackwardDataAlgorithm( handle *Handle, w *FilterD, dy *TensorD, dx *TensorD, ) ([]ConvBwdDataAlgoPerformance, error)
FindBackwardDataAlgorithm will find the top performing algorithms and return the best of them in ascending order.
func (*ConvolutionD) FindBackwardDataAlgorithmEx ¶
func (c *ConvolutionD) FindBackwardDataAlgorithmEx( handle *Handle, wD *FilterD, w cutil.Mem, dyD *TensorD, dy cutil.Mem, dxD *TensorD, dx cutil.Mem, wspace cutil.Mem, wspacesize uint) ([]ConvBwdDataAlgoPerformance, error)
FindBackwardDataAlgorithmEx finds algorithms by actually running them with the supplied device memory and workspace
func (*ConvolutionD) FindBackwardDataAlgorithmExUS ¶
func (c *ConvolutionD) FindBackwardDataAlgorithmExUS( handle *Handle, wD *FilterD, w unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, dxD *TensorD, dx unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint) ([]ConvBwdDataAlgoPerformance, error)
FindBackwardDataAlgorithmExUS is just like FindBackwardDataAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) FindBackwardFilterAlgorithm ¶
func (c *ConvolutionD) FindBackwardFilterAlgorithm( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ) ([]ConvBwdFiltAlgoPerformance, error)
FindBackwardFilterAlgorithm will find the top performing algorithms and return them in ascending order, limited to the requested count, in ConvBwdFiltAlgoPerformance structs. So if 4 were requested, the top 4 performers would be returned. Using this could possibly give the user cheat-level performance :-)
func (*ConvolutionD) FindBackwardFilterAlgorithmEx ¶
func (c *ConvolutionD) FindBackwardFilterAlgorithmEx( handle *Handle, xD *TensorD, x cutil.Mem, dyD *TensorD, dy cutil.Mem, dwD *FilterD, dw cutil.Mem, wspace cutil.Mem, wspacesize uint) ([]ConvBwdFiltAlgoPerformance, error)
FindBackwardFilterAlgorithmEx finds algorithms by actually running them with the supplied device memory and workspace
func (*ConvolutionD) FindBackwardFilterAlgorithmExUS ¶
func (c *ConvolutionD) FindBackwardFilterAlgorithmExUS( handle *Handle, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, dwD *FilterD, dw unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint) ([]ConvBwdFiltAlgoPerformance, error)
FindBackwardFilterAlgorithmExUS is just like FindBackwardFilterAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) FindForwardAlgorithm ¶
func (c *ConvolutionD) FindForwardAlgorithm( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, ) ([]ConvFwdAlgoPerformance, error)
FindForwardAlgorithm will find the top performing algorithms and return them in ascending order, limited to the requested count, in ConvFwdAlgoPerformance structs. So if 4 were requested, the top 4 performers would be returned. Using this could possibly give the user cheat-level performance :-)
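A sketch of the exhaustive-search path: run the search, take the fastest result, size its workspace, and call Forward. The device memory and the hypothetical allocDevice helper are assumptions, as are the gocudnn and cutil imports:

    // Sketch: perfs is sorted fastest-first, so perfs[0] is the best candidate.
    func forwardWithBestAlgo(h *gocudnn.Handle, c *gocudnn.ConvolutionD,
        xD *gocudnn.TensorD, x cutil.Mem, wD *gocudnn.FilterD, w cutil.Mem,
        yD *gocudnn.TensorD, y cutil.Mem, allocDevice func(uint) cutil.Mem) error {
        perfs, err := c.FindForwardAlgorithm(h, xD, wD, yD)
        if err != nil {
            return err
        }
        algo := perfs[0].Algo // fastest candidate
        wspaceSIB, err := c.GetForwardWorkspaceSize(h, xD, wD, yD, algo)
        if err != nil {
            return err
        }
        wspace := allocDevice(wspaceSIB)
        return c.Forward(h, 1, xD, x, wD, w, algo, wspace, wspaceSIB, 0, yD, y)
    }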
func (*ConvolutionD) FindForwardAlgorithmEx ¶
func (c *ConvolutionD) FindForwardAlgorithmEx( handle *Handle, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, yD *TensorD, y cutil.Mem, wspace cutil.Mem, wspacesize uint) ([]ConvFwdAlgoPerformance, error)
FindForwardAlgorithmEx finds algorithms by actually running them with the supplied device memory and workspace
func (*ConvolutionD) FindForwardAlgorithmExUS ¶
func (c *ConvolutionD) FindForwardAlgorithmExUS( handle *Handle, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD *TensorD, y unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint) ([]ConvFwdAlgoPerformance, error)
FindForwardAlgorithmExUS is like FindForwardAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) Forward ¶
func (c *ConvolutionD) Forward( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, algo ConvFwdAlgo, wspace cutil.Mem, wspacesize uint, beta float64, yD *TensorD, y cutil.Mem) error
Forward Function to perform the forward pass for batch convolution
func (*ConvolutionD) ForwardUS ¶
func (c *ConvolutionD) ForwardUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, algo ConvFwdAlgo, wspace unsafe.Pointer, wspacesize uint, beta float64, yD *TensorD, y unsafe.Pointer) error
ForwardUS is like Forward but using unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) Get ¶
func (c *ConvolutionD) Get() (mode ConvolutionMode, data DataType, pad []int32, stride []int32, dilation []int32, err error)
Get returns the values used to make the convolution descriptor
func (*ConvolutionD) GetBackwardDataAlgorithm ¶
func (c *ConvolutionD) GetBackwardDataAlgorithm( handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, pref ConvBwdDataPref, wspaceSIBlimit uint) (ConvBwdDataAlgo, error)
GetBackwardDataAlgorithm - This function serves as a heuristic for obtaining the best suited algorithm for (*ConvolutionD)BackwardData() for the given layer specifications. Based on the input preference, this function will either return the fastest algorithm or the fastest algorithm within a given memory limit. For an exhaustive search for the fastest algorithm, please use (*ConvolutionD)FindBackwardDataAlgorithm().
Parameters:
handle (input): Handle to a previously created cuDNN context.
wD (input): Handle to a previously initialized filter descriptor.
dyD (input): Handle to the previously initialized input differential tensor descriptor.
dxD (input): Handle to the previously initialized output tensor descriptor.
pref (input): Enumerant to express the preference criteria in terms of memory requirement and speed.
wspaceSIBlimit (input): Specifies the maximum amount of GPU memory the user is willing to use as a workspace. This is currently a placeholder and is not used.
Returns: a ConvBwdDataAlgo and an error.
Possible Error Returns:
nil: The function launched successfully.
CUDNN_STATUS_BAD_PARAM: At least one of these conditions is met:
1) The numbers of feature maps of the input tensor and output tensor differ.
2) The DataType of the tensor descriptors or the filter are different.
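A sketch of this heuristic path using the preference flags, in contrast to FindBackwardDataAlgorithm above, which benchmarks candidates (gocudnn import assumed):

    // Sketch: ask the heuristic for the fastest algorithm regardless of workspace.
    // The result then feeds GetBackwardDataWorkspaceSize and BackwardData.
    func pickBwdDataAlgo(h *gocudnn.Handle, c *gocudnn.ConvolutionD,
        wD *gocudnn.FilterD, dyD, dxD *gocudnn.TensorD) (gocudnn.ConvBwdDataAlgo, error) {
        var pref gocudnn.ConvBwdDataPref
        return c.GetBackwardDataAlgorithm(h, wD, dyD, dxD, pref.PreferFastest(), 0)
    }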
func (*ConvolutionD) GetBackwardDataAlgorithmV7 ¶
func (c *ConvolutionD) GetBackwardDataAlgorithmV7( handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, ) ([]ConvBwdDataAlgoPerformance, error)
GetBackwardDataAlgorithmV7 - This function serves as a heuristic for obtaining the best suited algorithm for cudnnConvolutionBackwardData for the given layer specifications. It returns all algorithms (including MathType, where available) sorted by expected relative performance, based on an internal heuristic, with the fastest at index 0 of the results. For an exhaustive search for the fastest algorithm, please use (*ConvolutionD)FindBackwardDataAlgorithm().
func (*ConvolutionD) GetBackwardDataWorkspaceSize ¶
func (c *ConvolutionD) GetBackwardDataWorkspaceSize( handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, algo ConvBwdDataAlgo) (uint, error)
GetBackwardDataWorkspaceSize is a helper function that returns the minimum size in bytes of the workspace that needs to be passed to the convolution for the given algo.
func (*ConvolutionD) GetBackwardFilterAlgorithm ¶
func (c *ConvolutionD) GetBackwardFilterAlgorithm( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, pref ConvBwdFilterPref, wsmemlimit uint) (ConvBwdFiltAlgo, error)
GetBackwardFilterAlgorithm gives a good algo within the limits given to it
func (*ConvolutionD) GetBackwardFilterAlgorithmV7 ¶
func (c *ConvolutionD) GetBackwardFilterAlgorithmV7( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ) ([]ConvBwdFiltAlgoPerformance, error)
GetBackwardFilterAlgorithmV7 will find the top performing algorithms and return them in ascending order, limited to the requested count, in ConvBwdFiltAlgoPerformance structs. So if 4 were requested, the top 4 performers would be returned. Using this could possibly give the user cheat-level performance :-)
func (*ConvolutionD) GetBackwardFilterWorkspaceSize ¶
func (c *ConvolutionD) GetBackwardFilterWorkspaceSize( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, algo ConvBwdFiltAlgo) (uint, error)
GetBackwardFilterWorkspaceSize is a helper function that returns the minimum size in bytes of the workspace that needs to be passed to the convolution for the given algo.
func (*ConvolutionD) GetForwardAlgorithm ¶
func (c *ConvolutionD) GetForwardAlgorithm( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, pref ConvolutionForwardPref, wsmemlimit uint) (ConvFwdAlgo, error)
GetForwardAlgorithm gives a good algo within the limits given to it
func (*ConvolutionD) GetForwardAlgorithmV7 ¶
func (c *ConvolutionD) GetForwardAlgorithmV7( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, ) ([]ConvFwdAlgoPerformance, error)
GetForwardAlgorithmV7 will find the top performing algorithms and return them in ascending order, limited to the requested count, in ConvFwdAlgoPerformance structs. So if 4 were requested, the top 4 performers would be returned. Using this could possibly give the user cheat-level performance :-)
func (*ConvolutionD) GetForwardWorkspaceSize ¶
func (c *ConvolutionD) GetForwardWorkspaceSize( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, algo ConvFwdAlgo) (uint, error)
GetForwardWorkspaceSize is a helper function that returns the minimum size in bytes of the workspace that needs to be passed to the convolution for the given algo.
func (*ConvolutionD) GetOutputDims ¶
func (c *ConvolutionD) GetOutputDims(input *TensorD, filter *FilterD) ([]int32, error)
GetOutputDims is a helper function that gives the size of the output of a convolution forward pass. Each dimension of the (nbDims-2)-D images of the output tensor is computed as follows:
outputDim = 1 + ( inputDim + 2*pad - (((filterDim-1)*dilation)+1) )/convolutionStride
Note: if the input and filter are NHWC, cudnn treats the formats as NCHW and outputs NCHW; gocudnn will take that NCHW output and reformat it to an actual NHWC.
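The formula is easy to exercise on the host. A pure-Go worked example for a single spatial dimension:

    package main

    import "fmt"

    // outputDim implements the formula above for one spatial dimension.
    func outputDim(inputDim, pad, filterDim, dilation, stride int32) int32 {
        return 1 + (inputDim+2*pad-((filterDim-1)*dilation+1))/stride
    }

    func main() {
        // A 32-wide dimension with a 3-wide filter, pad 1, dilation 1 and
        // stride 1 keeps its size: 1 + (32 + 2 - 3)/1 = 32.
        fmt.Println(outputDim(32, 1, 3, 1, 1)) // 32
        // Stride 2 halves it: 1 + (32 + 2 - 3)/2 = 16.
        fmt.Println(outputDim(32, 1, 3, 1, 2)) // 16
    }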
func (*ConvolutionD) GetReorderType ¶
func (c *ConvolutionD) GetReorderType() (r Reorder, err error)
GetReorderType gets the reorder type
func (*ConvolutionD) Im2Col ¶
func (c *ConvolutionD) Im2Col( handle *Handle, xD *TensorD, x cutil.Mem, wD *FilterD, buffer cutil.Mem, ) error
Im2Col transforms the multi-dimensional tensors into 2D tensors to speed up computation at the cost of memory.
func (*ConvolutionD) Im2ColUS ¶
func (c *ConvolutionD) Im2ColUS( handle *Handle, xD *TensorD, x unsafe.Pointer, wD *FilterD, buffer unsafe.Pointer, ) error
Im2ColUS is like Im2Col but using unsafe.Pointer instead of cutil.Mem
func (*ConvolutionD) Set ¶
func (c *ConvolutionD) Set(mode ConvolutionMode, data DataType, pad, stride, dilation []int32) error
Set sets the convolution descriptor. A note on the filter layout format: if the format is set to CUDNN_TENSOR_NCHW, which is one of the enumerated values allowed by the cudnnTensorFormat_t descriptor, then the layout of the filter is as follows:
For N=4, i.e., for a 4D filter descriptor, the filter layout is in the form of KCRS (K represents the number of output feature maps, C the number of input feature maps, R the number of rows per filter, and S the number of columns per filter). For N=3, i.e., for a 3D filter descriptor, the number S (number of columns per filter) is omitted. For N=5 and greater, the layouts of the higher dimensions immediately follow RS. On the other hand, if this input is set to CUDNN_TENSOR_NHWC, then the layout of the filter is as follows: for N=4, i.e., for a 4D filter descriptor, the filter layout is in the form of KRSC. For N=3, the number S (number of columns per filter) is omitted and the layout of C immediately follows R. For N=5 and greater, the layouts of the higher dimensions are inserted between S and C. See also the description for cudnnTensorFormat_t. Note: The lengths of stride, pad, and dilation need to be len(tensordims)-2.
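A minimal sketch of building a descriptor with Set (gocudnn import assumed):

    // Sketch: a 2D cross-correlation in float math with pad 1, stride 1 and
    // dilation 1. The slice lengths are len(tensordims)-2, i.e. 2 for 4D tensors.
    func setupConv() (*gocudnn.ConvolutionD, error) {
        conv, err := gocudnn.CreateConvolutionDescriptor()
        if err != nil {
            return nil, err
        }
        var mode gocudnn.ConvolutionMode
        var dtype gocudnn.DataType
        err = conv.Set(mode.CrossCorrelation(), dtype.Float(),
            []int32{1, 1}, // pad
            []int32{1, 1}, // stride
            []int32{1, 1}) // dilation
        return conv, err
    }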
func (*ConvolutionD) SetGroupCount ¶
func (c *ConvolutionD) SetGroupCount(groupCount int32) error
SetGroupCount sets the Group Count
func (*ConvolutionD) SetMathType ¶
func (c *ConvolutionD) SetMathType(mathtype MathType) error
SetMathType sets the mathtype
func (*ConvolutionD) SetReorderType ¶
func (c *ConvolutionD) SetReorderType(r Reorder) error
SetReorderType sets the reorder type
func (*ConvolutionD) String ¶
func (c *ConvolutionD) String() string
type ConvolutionForwardPref ¶
type ConvolutionForwardPref C.cudnnConvolutionFwdPreference_t
ConvolutionForwardPref is used for forward preference flags, exposing them through methods
func (*ConvolutionForwardPref) NoWorkSpace ¶
func (c *ConvolutionForwardPref) NoWorkSpace() ConvolutionForwardPref
NoWorkSpace sets c to ConvolutionForwardPref( C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns value of c
func (*ConvolutionForwardPref) PreferFastest ¶
func (c *ConvolutionForwardPref) PreferFastest() ConvolutionForwardPref
PreferFastest sets c to ConvolutionForwardPref( C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns value of c
func (*ConvolutionForwardPref) SpecifyWorkSpaceLimit ¶
func (c *ConvolutionForwardPref) SpecifyWorkSpaceLimit() ConvolutionForwardPref
SpecifyWorkSpaceLimit sets c to ConvolutionForwardPref( C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT) and returns value of c
type ConvolutionMode ¶
type ConvolutionMode C.cudnnConvolutionMode_t
ConvolutionMode is the type to describe the convolution mode flags
func (*ConvolutionMode) Convolution ¶
func (c *ConvolutionMode) Convolution() ConvolutionMode
Convolution sets c to ConvolutionMode(C.CUDNN_CONVOLUTION) and returns that value
func (*ConvolutionMode) CrossCorrelation ¶
func (c *ConvolutionMode) CrossCorrelation() ConvolutionMode
CrossCorrelation sets c to ConvolutionMode(C.CUDNN_CROSS_CORRELATION) and returns that value
func (ConvolutionMode) String ¶
func (c ConvolutionMode) String() string
type DataType ¶
type DataType C.cudnnDataType_t
DataType is used for flags for the tensor layer structs
func (*DataType) Double ¶
Double sets d to DataType(C.CUDNN_DATA_DOUBLE) and returns the changed value
func (*DataType) Int8x32 ¶
Int8x32 sets d to DataType(C.CUDNN_DATA_INT8x32) and returns the changed value -- only supported by sm_72?.
func (*DataType) Int8x4 ¶
Int8x4 sets d to DataType(C.CUDNN_DATA_INT8x4) and returns the changed value -- only supported by sm_72?.
func (DataType) String ¶
String returns a human-readable string that can be printed for debugging.
type DeConvBwdDataAlgo ¶
type DeConvBwdDataAlgo C.cudnnConvolutionFwdAlgo_t
DeConvBwdDataAlgo is used for cudnnConvolutionFwdAlgo_t flags, exposing them through methods. Deconvolution uses the forward pass for its backward data pass.
func (DeConvBwdDataAlgo) Algo ¶
func (c DeConvBwdDataAlgo) Algo() Algorithm
Algo returns an Algorithm struct
func (*DeConvBwdDataAlgo) Count ¶
func (c *DeConvBwdDataAlgo) Count() DeConvBwdDataAlgo
Count sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_COUNT) and returns value of c
func (*DeConvBwdDataAlgo) Direct ¶
func (c *DeConvBwdDataAlgo) Direct() DeConvBwdDataAlgo
Direct sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_DIRECT) and returns value of c
func (*DeConvBwdDataAlgo) FFT ¶
func (c *DeConvBwdDataAlgo) FFT() DeConvBwdDataAlgo
FFT sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT) and returns value of c
func (*DeConvBwdDataAlgo) FFTTiling ¶
func (c *DeConvBwdDataAlgo) FFTTiling() DeConvBwdDataAlgo
FFTTiling sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING) and returns value of c
func (*DeConvBwdDataAlgo) Gemm ¶
func (c *DeConvBwdDataAlgo) Gemm() DeConvBwdDataAlgo
Gemm sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_GEMM) and returns value of c
func (*DeConvBwdDataAlgo) ImplicitGemm ¶
func (c *DeConvBwdDataAlgo) ImplicitGemm() DeConvBwdDataAlgo
ImplicitGemm sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM) and returns value of c
func (*DeConvBwdDataAlgo) ImplicitPrecompGemm ¶
func (c *DeConvBwdDataAlgo) ImplicitPrecompGemm() DeConvBwdDataAlgo
ImplicitPrecompGemm sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM) and returns value of c
func (DeConvBwdDataAlgo) String ¶
func (c DeConvBwdDataAlgo) String() string
func (*DeConvBwdDataAlgo) WinoGrad ¶
func (c *DeConvBwdDataAlgo) WinoGrad() DeConvBwdDataAlgo
WinoGrad sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD) and returns value of c
func (*DeConvBwdDataAlgo) WinoGradNonFused ¶
func (c *DeConvBwdDataAlgo) WinoGradNonFused() DeConvBwdDataAlgo
WinoGradNonFused sets c to DeConvBwdDataAlgo( C.CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED) and returns value of c
type DeConvBwdDataAlgoPerformance ¶
type DeConvBwdDataAlgoPerformance struct {
    Algo        DeConvBwdDataAlgo `json:"algo,omitempty"`
    Status      Status            `json:"status,omitempty"`
    Time        float32           `json:"time,omitempty"`
    Memory      uint              `json:"memory,omitempty"`
    Determinism Determinism       `json:"determinism,omitempty"`
    MathType    MathType          `json:"math_type,omitempty"`
}
DeConvBwdDataAlgoPerformance is a struct made for deconvolution performance results
func (DeConvBwdDataAlgoPerformance) String ¶
func (cb DeConvBwdDataAlgoPerformance) String() string
type DeConvBwdDataPref ¶
type DeConvBwdDataPref C.cudnnConvolutionFwdPreference_t
DeConvBwdDataPref is used for backward data preference flags, exposing them through methods
func (*DeConvBwdDataPref) NoWorkSpace ¶
func (c *DeConvBwdDataPref) NoWorkSpace() DeConvBwdDataPref
NoWorkSpace sets c to DeConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns value of c
func (*DeConvBwdDataPref) PreferFastest ¶
func (c *DeConvBwdDataPref) PreferFastest() DeConvBwdDataPref
PreferFastest sets c to DeConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns value of c
func (*DeConvBwdDataPref) SpecifyWorkSpaceLimit ¶
func (c *DeConvBwdDataPref) SpecifyWorkSpaceLimit() DeConvBwdDataPref
SpecifyWorkSpaceLimit sets c to DeConvBwdDataPref( C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT) and returns value of c
type DeConvBwdFiltAlgo ¶
type DeConvBwdFiltAlgo C.cudnnConvolutionBwdFilterAlgo_t
DeConvBwdFiltAlgo is used for backward filter algorithm flags, exposing them through methods
func (DeConvBwdFiltAlgo) Algo ¶
func (c DeConvBwdFiltAlgo) Algo() Algorithm
Algo returns an Algorithm Struct
func (*DeConvBwdFiltAlgo) Algo0 ¶
func (c *DeConvBwdFiltAlgo) Algo0() DeConvBwdFiltAlgo
Algo0 sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0) and returns value of c /* non-deterministic */
func (*DeConvBwdFiltAlgo) Algo1 ¶
func (c *DeConvBwdFiltAlgo) Algo1() DeConvBwdFiltAlgo
Algo1 sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1) and returns value of c
func (*DeConvBwdFiltAlgo) Algo3 ¶
func (c *DeConvBwdFiltAlgo) Algo3() DeConvBwdFiltAlgo
Algo3 sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3) and returns value of c
func (*DeConvBwdFiltAlgo) Count ¶
func (c *DeConvBwdFiltAlgo) Count() DeConvBwdFiltAlgo
Count sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_COUNT) and returns value of c
func (*DeConvBwdFiltAlgo) FFT ¶
func (c *DeConvBwdFiltAlgo) FFT() DeConvBwdFiltAlgo
FFT sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT) and returns value of c
func (*DeConvBwdFiltAlgo) FFTTiling ¶
func (c *DeConvBwdFiltAlgo) FFTTiling() DeConvBwdFiltAlgo
FFTTiling sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT_TILING) and returns value of c
func (DeConvBwdFiltAlgo) String ¶
func (c DeConvBwdFiltAlgo) String() string
func (*DeConvBwdFiltAlgo) Winograd ¶
func (c *DeConvBwdFiltAlgo) Winograd() DeConvBwdFiltAlgo
Winograd sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD) and returns value of c
func (*DeConvBwdFiltAlgo) WinogradNonFused ¶
func (c *DeConvBwdFiltAlgo) WinogradNonFused() DeConvBwdFiltAlgo
WinogradNonFused sets c to DeConvBwdFiltAlgo(C.CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD_NONFUSED) and returns value of c
type DeConvBwdFiltAlgoPerformance ¶
type DeConvBwdFiltAlgoPerformance struct {
    Algo        DeConvBwdFiltAlgo `json:"algo,omitempty"`
    Status      Status            `json:"status,omitempty"`
    Time        float32           `json:"time,omitempty"`
    Memory      uint              `json:"memory,omitempty"`
    Determinism Determinism       `json:"determinism,omitempty"`
    MathType    MathType          `json:"math_type,omitempty"`
}
DeConvBwdFiltAlgoPerformance is the struct returned by the algorithm-finding funcs
func (DeConvBwdFiltAlgoPerformance) String ¶
func (cb DeConvBwdFiltAlgoPerformance) String() string
type DeConvBwdFilterPref ¶
type DeConvBwdFilterPref C.cudnnConvolutionBwdFilterPreference_t
DeConvBwdFilterPref is used for flags for the backward filters, exposing them through methods
func (*DeConvBwdFilterPref) NoWorkSpace ¶
func (c *DeConvBwdFilterPref) NoWorkSpace() DeConvBwdFilterPref
NoWorkSpace sets c to DeConvBwdFilterPref( C.CUDNN_CONVOLUTION_BWD_FILTER_NO_WORKSPACE) and returns value of c
func (*DeConvBwdFilterPref) PreferFastest ¶
func (c *DeConvBwdFilterPref) PreferFastest() DeConvBwdFilterPref
PreferFastest sets c to DeConvBwdFilterPref(C.CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST) and returns the value of c
func (*DeConvBwdFilterPref) SpecifyWorkSpaceLimit ¶
func (c *DeConvBwdFilterPref) SpecifyWorkSpaceLimit() DeConvBwdFilterPref
SpecifyWorkSpaceLimit sets c to DeConvBwdFilterPref(C.CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT) and returns the value of c
type DeConvFwdAlgo ¶
type DeConvFwdAlgo C.cudnnConvolutionBwdDataAlgo_t
DeConvFwdAlgo is used for the forward algorithm flags, which are exposed through the type's methods. DeConvolution does the backward data pass of a convolution as its forward pass.
func (DeConvFwdAlgo) Algo ¶
func (c DeConvFwdAlgo) Algo() Algorithm
Algo returns an Algorithm Struct
func (*DeConvFwdAlgo) Algo0 ¶
func (c *DeConvFwdAlgo) Algo0() DeConvFwdAlgo
Algo0 sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_0) and returns the value of c /* non-deterministic */
func (*DeConvFwdAlgo) Algo1 ¶
func (c *DeConvFwdAlgo) Algo1() DeConvFwdAlgo
Algo1 sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_1) and returns the value of c
func (*DeConvFwdAlgo) Count ¶
func (c *DeConvFwdAlgo) Count() DeConvFwdAlgo
Count sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_COUNT) and returns the value of c
func (*DeConvFwdAlgo) FFT ¶
func (c *DeConvFwdAlgo) FFT() DeConvFwdAlgo
FFT sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT) and returns the value of c
func (*DeConvFwdAlgo) FFTTiling ¶
func (c *DeConvFwdAlgo) FFTTiling() DeConvFwdAlgo
FFTTiling sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING) and returns the value of c
func (DeConvFwdAlgo) String ¶
func (c DeConvFwdAlgo) String() string
func (*DeConvFwdAlgo) Winograd ¶
func (c *DeConvFwdAlgo) Winograd() DeConvFwdAlgo
Winograd sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD) and returns the value of c
func (*DeConvFwdAlgo) WinogradNonFused ¶
func (c *DeConvFwdAlgo) WinogradNonFused() DeConvFwdAlgo
WinogradNonFused sets c to DeConvFwdAlgo(C.CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED) and returns the value of c
type DeConvFwdAlgoPerformance ¶
type DeConvFwdAlgoPerformance struct {
	Algo        DeConvFwdAlgo `json:"algo,omitempty"`
	Status      Status        `json:"status,omitempty"`
	Time        float32       `json:"time,omitempty"`
	Memory      uint          `json:"memory,omitempty"`
	Determinism Determinism   `json:"determinism,omitempty"`
	MathType    MathType      `json:"math_type,omitempty"`
}
DeConvFwdAlgoPerformance is a struct that holds the measured performance of a forward algorithm
func (DeConvFwdAlgoPerformance) String ¶
func (cb DeConvFwdAlgoPerformance) String() string
type DeConvolutionD ¶
type DeConvolutionD struct {
// contains filtered or unexported fields
}
DeConvolutionD holds all the deconvolution descriptor information
func CreateDeConvolutionDescriptor ¶
func CreateDeConvolutionDescriptor() (*DeConvolutionD, error)
CreateDeConvolutionDescriptor creates a deconvolution descriptor
func (*DeConvolutionD) BackwardBias ¶
func (c *DeConvolutionD) BackwardBias( handle *Handle, alpha float64, dyD *TensorD, dy cutil.Mem, beta float64, dbD *TensorD, db cutil.Mem) error
BackwardBias is used to compute the bias gradient for batch convolution. db is returned.
func (*DeConvolutionD) BackwardBiasUS ¶
func (c *DeConvolutionD) BackwardBiasUS( handle *Handle, alpha float64, dyD *TensorD, dy unsafe.Pointer, beta float64, dbD *TensorD, db unsafe.Pointer) error
BackwardBiasUS is like BackwardBias but using unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) BackwardData ¶
func (c *DeConvolutionD) BackwardData( handle *Handle, alpha float64, wD *FilterD, w cutil.Mem, dyD *TensorD, dy cutil.Mem, algo DeConvBwdDataAlgo, wspace cutil.Mem, wspaceSIB uint, beta float64, dxD *TensorD, dx cutil.Mem) error
BackwardData performs the backward data pass for batch convolution
func (*DeConvolutionD) BackwardDataUS ¶
func (c *DeConvolutionD) BackwardDataUS( handle *Handle, alpha float64, wD *FilterD, w unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, algo DeConvBwdDataAlgo, wspace unsafe.Pointer, wspacesize uint, beta float64, dxD *TensorD, dx unsafe.Pointer) error
BackwardDataUS is like BackwardData but using unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) BackwardFilter ¶
func (c *DeConvolutionD) BackwardFilter( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, dyD *TensorD, dy cutil.Mem, algo DeConvBwdFiltAlgo, wspace cutil.Mem, wspacesize uint, beta float64, dwD *FilterD, dw cutil.Mem, ) error
BackwardFilter computes the backward filter pass of the deconvolution
func (*DeConvolutionD) BackwardFilterUS ¶
func (c *DeConvolutionD) BackwardFilterUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, algo DeConvBwdFiltAlgo, wspace unsafe.Pointer, wspacesize uint, beta float64, dwD *FilterD, dw unsafe.Pointer, ) error
BackwardFilterUS is like BackwardFilter but using unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) Destroy ¶
func (c *DeConvolutionD) Destroy() error
Destroy destroys the ConvolutionDescriptor. If GC is set then it only returns nil. Currently GC is always set, with no option to turn it off.
func (*DeConvolutionD) FindBackwardDataAlgorithm ¶
func (c *DeConvolutionD) FindBackwardDataAlgorithm( handle *Handle, w *FilterD, dy *TensorD, dx *TensorD, ) ([]DeConvBwdDataAlgoPerformance, error)
FindBackwardDataAlgorithm will find the top performing algorithms and return the best algorithms in ascending order.
func (*DeConvolutionD) FindBackwardDataAlgorithmEx ¶
func (c *DeConvolutionD) FindBackwardDataAlgorithmEx( handle *Handle, wD *FilterD, w cutil.Mem, dyD *TensorD, dy cutil.Mem, dxD *TensorD, dx cutil.Mem, wspace cutil.Mem, wspacesize uint) ([]DeConvBwdDataAlgoPerformance, error)
FindBackwardDataAlgorithmEx finds some algorithms using the provided device memory
func (*DeConvolutionD) FindBackwardDataAlgorithmExUS ¶
func (c *DeConvolutionD) FindBackwardDataAlgorithmExUS( handle *Handle, wD *FilterD, w unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, dxD *TensorD, dx unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint) ([]DeConvBwdDataAlgoPerformance, error)
FindBackwardDataAlgorithmExUS is just like FindBackwardDataAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) FindBackwardFilterAlgorithm ¶
func (c *DeConvolutionD) FindBackwardFilterAlgorithm( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ) ([]DeConvBwdFiltAlgoPerformance, error)
FindBackwardFilterAlgorithm will find the top performing algorithms and return the best algorithms in ascending order, limited to the number passed in requestedAlgoCount. So if 4 is passed in requestedAlgoCount, then it will return the top 4 performers in the DeConvBwdFiltAlgoPerformance struct. Using this could possibly give the user cheat-level performance :-) A filtering sketch follows below.
func (*DeConvolutionD) FindBackwardFilterAlgorithmEx ¶
func (c *DeConvolutionD) FindBackwardFilterAlgorithmEx( handle *Handle, xD *TensorD, x cutil.Mem, dyD *TensorD, dy cutil.Mem, dwD *FilterD, dw cutil.Mem, wspace cutil.Mem, wspacesize uint) ([]DeConvBwdFiltAlgoPerformance, error)
FindBackwardFilterAlgorithmEx finds some algorithms using the provided device memory
func (*DeConvolutionD) FindBackwardFilterAlgorithmExUS ¶
func (c *DeConvolutionD) FindBackwardFilterAlgorithmExUS( handle *Handle, xD *TensorD, x unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, dwD *FilterD, dw unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint) ([]DeConvBwdFiltAlgoPerformance, error)
FindBackwardFilterAlgorithmExUS is just like FindBackwardFilterAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) FindForwardAlgorithm ¶
func (c *DeConvolutionD) FindForwardAlgorithm( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, ) ([]DeConvFwdAlgoPerformance, error)
FindForwardAlgorithm will find the top performing algorithms and return the best algorithms in ascending order, limited to the number passed in requestedAlgoCount. So if 4 is passed in requestedAlgoCount, then it will return the top 4 performers in the DeConvFwdAlgoPerformance struct. Using this could possibly give the user cheat-level performance :-)
func (*DeConvolutionD) FindForwardAlgorithmEx ¶
func (c *DeConvolutionD) FindForwardAlgorithmEx( handle *Handle, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, yD *TensorD, y cutil.Mem, wspace cutil.Mem, wspaceSIBlimit uint) ([]DeConvFwdAlgoPerformance, error)
FindForwardAlgorithmEx finds some algorithms using the provided device memory
func (*DeConvolutionD) FindForwardAlgorithmExUS ¶
func (c *DeConvolutionD) FindForwardAlgorithmExUS( handle *Handle, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD *TensorD, y unsafe.Pointer, wspace unsafe.Pointer, wspaceSIBlimit uint) ([]DeConvFwdAlgoPerformance, error)
FindForwardAlgorithmExUS is like FindForwardAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) Forward ¶
func (c *DeConvolutionD) Forward( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, wD *FilterD, w cutil.Mem, algo DeConvFwdAlgo, wspace cutil.Mem, wspaceSIB uint, beta float64, yD *TensorD, y cutil.Mem) error
Forward does the forward deconvolution
This function computes the convolution data gradient of the tensor dy, where y is the output of the forward convolution in (*ConvolutionD)Forward(). It uses the specified algo, and returns the results in the output tensor dx. Scaling factors alpha and beta can be used to scale the computed result or accumulate with the current dx.
func (*DeConvolutionD) ForwardUS ¶
func (c *DeConvolutionD) ForwardUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, wD *FilterD, w unsafe.Pointer, algo DeConvFwdAlgo, wspace unsafe.Pointer, wspacesize uint, beta float64, yD *TensorD, y unsafe.Pointer) error
ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem
func (*DeConvolutionD) Get ¶
func (c *DeConvolutionD) Get() (mode ConvolutionMode, data DataType, pad []int32, stride []int32, dilation []int32, err error)
Get returns the values used to make the convolution descriptor
func (*DeConvolutionD) GetBackwardDataAlgorithm ¶
func (c *DeConvolutionD) GetBackwardDataAlgorithm( handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, pref DeConvBwdDataPref, wspaceSIBlimit uint) (DeConvBwdDataAlgo, error)
GetBackwardDataAlgorithm gets the fastest backwards data algorithm with parameters that are passed.
func (*DeConvolutionD) GetBackwardDataAlgorithmV7 ¶
func (c *DeConvolutionD) GetBackwardDataAlgorithmV7( handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, ) ([]DeConvBwdDataAlgoPerformance, error)
GetBackwardDataAlgorithmV7 serves as a heuristic for obtaining the best suited algorithm for the given layer specifications. It will return all algorithms (including MathType, where available) sorted by expected relative performance (based on an internal heuristic), with the fastest at index 0 of perfResults.
func (*DeConvolutionD) GetBackwardDataWorkspaceSize ¶
func (c *DeConvolutionD) GetBackwardDataWorkspaceSize( handle *Handle, wD *FilterD, dyD *TensorD, dxD *TensorD, algo DeConvBwdDataAlgo) (uint, error)
GetBackwardDataWorkspaceSize is a helper function that will return the minimum size in bytes of the workspace that needs to be passed to the convolution for the given algo.
func (*DeConvolutionD) GetBackwardFilterAlgorithm ¶
func (c *DeConvolutionD) GetBackwardFilterAlgorithm( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, pref DeConvBwdFilterPref, wsmemlimit uint) (DeConvBwdFiltAlgo, error)
GetBackwardFilterAlgorithm gives a good algo with the limits given to it
func (*DeConvolutionD) GetBackwardFilterAlgorithmV7 ¶
func (c *DeConvolutionD) GetBackwardFilterAlgorithmV7( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, ) ([]DeConvBwdFiltAlgoPerformance, error)
GetBackwardFilterAlgorithmV7 will find the top performing algorithms and return the best algorithms in ascending order, limited to the number passed in requestedAlgoCount. So if 4 is passed in requestedAlgoCount, then it will return the top 4 performers in the DeConvBwdFiltAlgoPerformance struct. Using this could possibly give the user cheat-level performance :-)
func (*DeConvolutionD) GetBackwardFilterWorkspaceSize ¶
func (c *DeConvolutionD) GetBackwardFilterWorkspaceSize( handle *Handle, xD *TensorD, dyD *TensorD, dwD *FilterD, algo DeConvBwdFiltAlgo) (uint, error)
GetBackwardFilterWorkspaceSize is a helper function that will return the minimum size in bytes of the workspace that needs to be passed to the convolution for the given algo.
func (*DeConvolutionD) GetBiasDims ¶
func (c *DeConvolutionD) GetBiasDims(w *FilterD) ([]int32, error)
GetBiasDims will return the bias dims for the deconvolution. Only supports NCHW and NHWC formats.
func (*DeConvolutionD) GetForwardAlgorithm ¶
func (c *DeConvolutionD) GetForwardAlgorithm( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, pref DeConvolutionForwardPref, wspaceSIBlimit uint) (DeConvFwdAlgo, error)
GetForwardAlgorithm gives a good algo with the limits given to it
func (*DeConvolutionD) GetForwardAlgorithmV7 ¶
func (c *DeConvolutionD) GetForwardAlgorithmV7( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, ) ([]DeConvFwdAlgoPerformance, error)
GetForwardAlgorithmV7 will find the top performing algorithms and return the best algorithms in ascending order, limited to the number passed in requestedAlgoCount. So if 4 is passed in requestedAlgoCount, then it will return the top 4 performers in the DeConvFwdAlgoPerformance struct. Using this could possibly give the user cheat-level performance :-)
func (*DeConvolutionD) GetForwardWorkspaceSize ¶
func (c *DeConvolutionD) GetForwardWorkspaceSize( handle *Handle, xD *TensorD, wD *FilterD, yD *TensorD, algo DeConvFwdAlgo) (uint, error)
GetForwardWorkspaceSize is a helper function that will return the minimum size in bytes of the workspace that needs to be passed to the convolution for the given algo.
func (*DeConvolutionD) GetOutputDims ¶
func (c *DeConvolutionD) GetOutputDims(input *TensorD, filter *FilterD) ([]int32, error)
GetOutputDims is a helper function that gives the size of the output of a DeConvolutionNDForward. Each dimension of the (nbDims-2)-D images of the output tensor is computed as follows:
outputDim = (inputDim-1)*convolutionStride -2*pad + (((filterDim-1)*dilation)+1)
DeConvolution works differently than a convolution.
In a normal convolution, the output channel count is the number of neurons the layer has, and each neuron's channel size is the input channel size.
For a deconvolution, the number of neurons is the input channel size, and each neuron's channel size is the output channel size.
func (*DeConvolutionD) GetReorderType ¶
func (c *DeConvolutionD) GetReorderType() (r Reorder, err error)
GetReorderType gets the reorder type
func (*DeConvolutionD) Set ¶
func (c *DeConvolutionD) Set(mode ConvolutionMode, data DataType, pad, stride, dilation []int32) error
Set sets the deconvolution descriptor.
From the cuDNN documentation on the type of the filter layout format: if this input is set to CUDNN_TENSOR_NCHW, which is one of the enumerated values allowed by cudnnTensorFormat_t, then the layout of the filter is as follows:
For N=4, i.e. for a 4D filter descriptor, the filter layout is in the form of KCRS (K represents the number of output feature maps, C the number of input feature maps, R the number of rows per filter, and S the number of columns per filter). For N=3, i.e. for a 3D filter descriptor, the number S (columns per filter) is omitted. For N=5 and greater, the layout of the higher dimensions immediately follows RS.
On the other hand, if this input is set to CUDNN_TENSOR_NHWC, then the layout of the filter is as follows: for N=4, i.e. for a 4D filter descriptor, the filter layout is in the form of KRSC. For N=3, the number S (columns per filter) is omitted and the layout of C immediately follows R. For N=5 and greater, the layout of the higher dimensions is inserted between S and C. See also the description for cudnnTensorFormat_t.
Note: the length of stride, pad, and dilation needs to be len(tensordims) - 2.
func (*DeConvolutionD) SetGroupCount ¶
func (c *DeConvolutionD) SetGroupCount(groupCount int32) error
SetGroupCount sets the Group Count
func (*DeConvolutionD) SetMathType ¶
func (c *DeConvolutionD) SetMathType(mathtype MathType) error
SetMathType sets the mathtype
func (*DeConvolutionD) SetReorderType ¶
func (c *DeConvolutionD) SetReorderType(r Reorder) error
SetReorderType sets the reorder type
func (*DeConvolutionD) String ¶
func (c *DeConvolutionD) String() string
String satisfies fmt Stringer interface.
type DeConvolutionForwardPref ¶
type DeConvolutionForwardPref C.cudnnConvolutionBwdDataPreference_t
DeConvolutionForwardPref is used for the deconvolution forward preference flags, which are exposed through the type's methods
func (*DeConvolutionForwardPref) NoWorkSpace ¶
func (c *DeConvolutionForwardPref) NoWorkSpace() DeConvolutionForwardPref
NoWorkSpace sets c to DeConvolutionForwardPref(C.CUDNN_CONVOLUTION_FWD_NO_WORKSPACE) and returns the value of c
func (*DeConvolutionForwardPref) PreferFastest ¶
func (c *DeConvolutionForwardPref) PreferFastest() DeConvolutionForwardPref
PreferFastest sets c to DeConvolutionForwardPref(C.CUDNN_CONVOLUTION_FWD_PREFER_FASTEST) and returns the value of c
func (*DeConvolutionForwardPref) SpecifyWorkSpaceLimit ¶
func (c *DeConvolutionForwardPref) SpecifyWorkSpaceLimit() DeConvolutionForwardPref
SpecifyWorkSpaceLimit sets c to DeConvolutionForwardPref(C.CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT) and returns the value of c
type Determinism ¶
type Determinism C.cudnnDeterminism_t
Determinism is the type for the determinism flags, which are set and read through the type's methods
func (*Determinism) Deterministic ¶
func (d *Determinism) Deterministic() Determinism
Deterministic sets d to Determinism(C.CUDNN_DETERMINISTIC) and returns the value
func (*Determinism) Non ¶
func (d *Determinism) Non() Determinism
Non sets d to Determinism(C.CUDNN_NON_DETERMINISTIC) and returns the value
func (Determinism) String ¶
func (d Determinism) String() string
String outputs a string of the type
type DirectionMode ¶
type DirectionMode C.cudnnDirectionMode_t
DirectionMode is used for flags, which are exposed through the type's methods
func (*DirectionMode) Bi ¶
func (r *DirectionMode) Bi() DirectionMode
Bi sets r to and returns DirectionMode(C.CUDNN_BIDIRECTIONAL)
func (DirectionMode) String ¶
func (r DirectionMode) String() string
func (*DirectionMode) Uni ¶
func (r *DirectionMode) Uni() DirectionMode
Uni sets r to and returns DirectionMode(C.CUDNN_UNIDIRECTIONAL)
type DivNormMode ¶
type DivNormMode C.cudnnDivNormMode_t
DivNormMode is used for C.cudnnDivNormMode_t flags
func (*DivNormMode) PrecomputedMeans ¶
func (d *DivNormMode) PrecomputedMeans() DivNormMode
PrecomputedMeans sets d to and returns DivNormMode(C.CUDNN_DIVNORM_PRECOMPUTED_MEANS)
func (DivNormMode) String ¶
func (d DivNormMode) String() string
type DropOutD ¶
type DropOutD struct {
// contains filtered or unexported fields
}
DropOutD holds the dropout descriptor
func CreateDropOutDescriptor ¶
CreateDropOutDescriptor creates a drop out descriptor to be set
func (*DropOutD) Backward ¶
func (d *DropOutD) Backward( handle *Handle, dyD *TensorD, dy cutil.Mem, dxD *TensorD, dx cutil.Mem, reserveSpace cutil.Mem, reservesize uint, ) error
Backward performs the dropout backward operation
Input/Output: dx,reserveSpace
func (*DropOutD) BackwardUS ¶
func (d *DropOutD) BackwardUS( handle *Handle, dyD *TensorD, dy unsafe.Pointer, dxD *TensorD, dx unsafe.Pointer, reserveSpace unsafe.Pointer, reservesize uint, ) error
BackwardUS is like Backward but uses unsafe.Pointer instead of cutil.Mem
func (*DropOutD) Destroy ¶
Destroy destroys the dropout descriptor unless the finalizer flag was set.
func (*DropOutD) Forward ¶
func (d *DropOutD) Forward( handle *Handle, xD *TensorD, x cutil.Mem, yD *TensorD, y cutil.Mem, reserveSpace cutil.Mem, reservesize uint, ) error
Forward performs the dropout forward operation
Input/Output: y,reserveSpace
func (*DropOutD) ForwardUS ¶
func (d *DropOutD) ForwardUS( handle *Handle, xD *TensorD, x unsafe.Pointer, yD *TensorD, y unsafe.Pointer, reserveSpace unsafe.Pointer, reservesize uint, ) error
ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem
func (*DropOutD) GetReserveSpaceSize ¶
GetReserveSpaceSize returns the size of the reserve space in bytes. The method calls a function that doesn't use the DropOutD, but the function is relevant to the DropOut operation.
func (*DropOutD) GetStateSize ¶
GetStateSize returns the state size in bytes. The method calls a function that doesn't use the DropOutD, but it is a dropout-type function, and is used to get the size that the cutil.Mem or unsafe.Pointer needs to be for the state.
func (*DropOutD) Restore ¶
func (d *DropOutD) Restore( handle *Handle, dropout float32, states cutil.Mem, bytes uint, seed uint64, ) error
Restore restores the descriptor to a previously saved-off state
func (*DropOutD) RestoreUS ¶
func (d *DropOutD) RestoreUS( handle *Handle, dropout float32, states unsafe.Pointer, bytes uint, seed uint64, ) error
RestoreUS is like Restore but uses unsafe.Pointer instead of cutil.Mem
type ErrQueryMode ¶
type ErrQueryMode C.cudnnErrQueryMode_t
ErrQueryMode flags select the different error-query modes, which are exposed through the type's methods
func (*ErrQueryMode) Blocking ¶
func (e *ErrQueryMode) Blocking() ErrQueryMode
Blocking sets e to and returns ErrQueryMode(C.CUDNN_ERRQUERY_BLOCKING)
func (*ErrQueryMode) NonBlocking ¶
func (e *ErrQueryMode) NonBlocking() ErrQueryMode
NonBlocking sets e to and returns ErrQueryMode(C.CUDNN_ERRQUERY_NONBLOCKING)
func (*ErrQueryMode) RawCode ¶
func (e *ErrQueryMode) RawCode() ErrQueryMode
RawCode sets e to and returns ErrQueryMode(C.CUDNN_ERRQUERY_RAWCODE)
type FilterD ¶
type FilterD struct {
// contains filtered or unexported fields
}
FilterD is the struct holding descriptor information for cudnnFilterDescriptor_t
func CreateFilterDescriptor ¶
CreateFilterDescriptor creates a filter descriptor
func (*FilterD) Destroy ¶
Destroy destroys the filter descriptor if GC is not set. If GC is set then it won't do anything.
func (*FilterD) Get ¶
func (f *FilterD) Get() (dtype DataType, frmt TensorFormat, shape []int32, err error)
Get returns the values that were used to set the FilterD
func (*FilterD) GetSizeInBytes ¶
GetSizeInBytes returns the size in bytes for the filter
func (*FilterD) ReorderFilterBias ¶
func (f *FilterD) ReorderFilterBias(h *Handle, r Reorder, filtersrc, reorderfilterdest cutil.Mem, reorderbias bool, biassrc, reorderbiasdest cutil.Mem) error
ReorderFilterBias reorders the filter and bias values. It can be used to enhance the inference time by separating the reordering operation from convolution.
For example, convolutions in a neural network of multiple layers can require reordering of kernels at every layer, which can take up a significant fraction of the total inference time. Using this function, the reordering can be done one time on the filter and bias data followed by the convolution operations at the multiple layers, thereby enhancing the inference time.
func (*FilterD) Set ¶
func (f *FilterD) Set(dtype DataType, format TensorFormat, shape []int32) error
Set sets the filter descriptor. Like with TensorD, the shape passed in is not cudnn-style: cudnn always takes the shape as NCHW and switches the dims and tensor strides in order to change the format, while gocudnn will reorder the dims into the layout cudnn expects when the format is NHWC.
Basic 4D filter:
The basic NCHW shape is:
	shape[0] = # of output feature maps
	shape[1] = # of input feature maps
	shape[2] = height of each filter
	shape[3] = width of each filter
The basic NHWC shape is:
	shape[0] = # of output feature maps
	shape[1] = height of each filter
	shape[2] = width of each filter
	shape[3] = # of input feature maps
Basic ND filter:
The basic NCHW shape is:
	shape[0] = # of output feature maps
	shape[1] = # of input feature maps
	shape[2...N-1] = feature dims
The basic NHWC shape is:
	shape[0] = # of output feature maps
	shape[1...N-2] = feature dims
	shape[N-1] = # of input feature maps
type FoldingDirection ¶
type FoldingDirection C.cudnnFoldingDirection_t
FoldingDirection is used as a flag for the TransformDescriptor; the flags are exposed through the type's methods.
func (*FoldingDirection) Fold ¶
func (f *FoldingDirection) Fold() FoldingDirection
Fold sets f to the fold direction and returns that value
func (FoldingDirection) String ¶
func (f FoldingDirection) String() string
String satisfies the stringer interface
func (*FoldingDirection) UnFold ¶
func (f *FoldingDirection) UnFold() FoldingDirection
UnFold sets f to the unfold direction and returns that value
type Handle ¶
type Handle struct {
// contains filtered or unexported fields
}
Handle is a struct containing a cudnnHandle_t which is basically a Pointer to a CUContext
func CreateHandle ¶
CreateHandle creates a handle; it's basically a context. usegogc is for future use. Right now the handle is always on the gc.
This function initializes the cuDNN library and creates a handle to an opaque structure holding the cuDNN library context. It allocates hardware resources on the host and device and must be called prior to making any other cuDNN library calls.
The cuDNN library handle is tied to the current CUDA device (context). To use the library on multiple devices, one cuDNN handle needs to be created for each device.
For a given device, multiple cuDNN handles with different configurations (e.g., different current CUDA streams) may be created. Because cudnnCreate allocates some internal resources, the release of those resources by calling cudnnDestroy will implicitly call cudaDeviceSynchronize; therefore, the recommended best practice is to call cudnnCreate/cudnnDestroy outside of performance-critical code paths.
For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.
func CreateHandleEX ¶
CreateHandleEX creates a handle like CreateHandle, but gocudnn functions that are passed the handle will pass their operations to the worker w. If w is nil the handle will function just like a handle created with CreateHandle().
func (*Handle) QueryRuntimeError ¶
func (handle *Handle) QueryRuntimeError(mode ErrQueryMode, tag *RuntimeTag) (Status, error)
QueryRuntimeError checks for a runtime error; see cudnnQueryRuntimeError in the cuDNN SDK documentation. tag should be nil.
type IndiciesType ¶
type IndiciesType C.cudnnIndicesType_t
IndiciesType is used for flags
func (IndiciesType) String ¶
func (i IndiciesType) String() string
String satisfies stringer interface
func (*IndiciesType) Type16Bit ¶
func (i *IndiciesType) Type16Bit() IndiciesType
Type16Bit sets i to and returns IndiciesType( C.CUDNN_16BIT_INDICES) flag
func (*IndiciesType) Type32Bit ¶
func (i *IndiciesType) Type32Bit() IndiciesType
Type32Bit sets i to and returns IndiciesType( C.CUDNN_32BIT_INDICES) flag
func (*IndiciesType) Type64Bit ¶
func (i *IndiciesType) Type64Bit() IndiciesType
Type64Bit sets i to and returns IndiciesType( C.CUDNN_64BIT_INDICES) flag
func (*IndiciesType) Type8Bit ¶
func (i *IndiciesType) Type8Bit() IndiciesType
Type8Bit sets i to and returns IndiciesType( C.CUDNN_8BIT_INDICES) flag
type LRND ¶
type LRND struct {
// contains filtered or unexported fields
}
LRND holds the LRN Descriptor
func CreateLRNDescriptor ¶
CreateLRNDescriptor creates an LRN descriptor
func (*LRND) Destroy ¶
Destroy destroys the descriptor if gc is not being used; if gc is on it will just return nil. Currently gc is always on.
func (*LRND) DivisiveNormalizationBackward ¶
func (l *LRND) DivisiveNormalizationBackward( handle *Handle, mode DivNormMode, alpha float64, xD *TensorD, x, means, dy, temp, temp2 cutil.Mem, beta float64, dXdMeansDesc *TensorD, dx, dMeans cutil.Mem, ) error
DivisiveNormalizationBackward performs the divisive normalization backward computation. Double parameters are cast to the tensor data type.
func (*LRND) DivisiveNormalizationBackwardUS ¶
func (l *LRND) DivisiveNormalizationBackwardUS( handle *Handle, mode DivNormMode, alpha float64, xD *TensorD, x, means, dy, temp, temp2 unsafe.Pointer, beta float64, dXdMeansDesc *TensorD, dx, dMeans unsafe.Pointer, ) error
DivisiveNormalizationBackwardUS is like DivisiveNormalizationBackward but using unsafe.Pointer instead of cutil.Mem
func (*LRND) DivisiveNormalizationForward ¶
func (l *LRND) DivisiveNormalizationForward( handle *Handle, mode DivNormMode, alpha float64, xD TensorD, x, means, temp, temp2 cutil.Mem, beta float64, yD TensorD, y cutil.Mem, ) error
DivisiveNormalizationForward LCN/divisive normalization functions: y = alpha * normalize(x) + beta * y
func (*LRND) DivisiveNormalizationForwardUS ¶
func (l *LRND) DivisiveNormalizationForwardUS( handle *Handle, mode DivNormMode, alpha float64, xD TensorD, x, means, temp, temp2 unsafe.Pointer, beta float64, yD TensorD, y unsafe.Pointer, ) error
DivisiveNormalizationForwardUS is like DivisiveNormalizationForward but using unsafe.Pointer instead of cutil.Mem
func (*LRND) LRNCrossChannelBackward ¶
func (l *LRND) LRNCrossChannelBackward( handle *Handle, mode LRNmode, alpha float64, yD *TensorD, y cutil.Mem, dyD *TensorD, dy cutil.Mem, xD *TensorD, x cutil.Mem, beta float64, dxD *TensorD, dx cutil.Mem, ) error
LRNCrossChannelBackward LRN cross-channel backward computation. Double parameters cast to tensor data type
func (*LRND) LRNCrossChannelBackwardUS ¶
func (l *LRND) LRNCrossChannelBackwardUS( handle *Handle, mode LRNmode, alpha float64, yD *TensorD, y unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, xD *TensorD, x unsafe.Pointer, beta float64, dxD *TensorD, dx unsafe.Pointer, ) error
LRNCrossChannelBackwardUS is like LRNCrossChannelBackward but using unsafe.Pointer instead of cutil.Mem
func (*LRND) LRNCrossChannelForward ¶
func (l *LRND) LRNCrossChannelForward( handle *Handle, mode LRNmode, alpha float64, xD *TensorD, x cutil.Mem, beta float64, yD *TensorD, y cutil.Mem, ) error
LRNCrossChannelForward LRN cross-channel forward computation. Double parameters cast to tensor data type
func (*LRND) LRNCrossChannelForwardUS ¶
func (l *LRND) LRNCrossChannelForwardUS( handle *Handle, mode LRNmode, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, yD *TensorD, y unsafe.Pointer, ) error
LRNCrossChannelForwardUS is like LRNCrossChannelForward but using unsafe.Pointer instead of cutil.Mem
type LRNmode ¶
type LRNmode C.cudnnLRNMode_t
LRNmode is used for the LRN mode flags
func (*LRNmode) CrossChanelDim1 ¶
CrossChanelDim1 sets l to and returns LRNmode( C.CUDNN_LRN_CROSS_CHANNEL_DIM1)
type MathType ¶
type MathType C.cudnnMathType_t
MathType are flags to set for cudnnMathType_t and can be called by types methods
func (*MathType) AllowConversion ¶
AllowConversion sets m to MathType(C.CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION) and returns the changed value
func (*MathType) Default ¶
Default sets m to MathType(C.CUDNN_DEFAULT_MATH) and returns changed value
func (*MathType) TensorOpMath ¶
TensorOpMath sets m to MathType(C.CUDNN_TENSOR_OP_MATH) and returns the changed value
type MultiHeadAttnWeightKind ¶
type MultiHeadAttnWeightKind C.cudnnMultiHeadAttnWeightKind_t
MultiHeadAttnWeightKind is a flag for the kind of weights used. Flags are exposed through the type's methods.
func (*MultiHeadAttnWeightKind) Keys ¶
func (m *MultiHeadAttnWeightKind) Keys() MultiHeadAttnWeightKind
Keys - sets value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_K_WEIGHTS) and returns that value. From cudnn.h -input projection weights for 'keys'
func (*MultiHeadAttnWeightKind) Output ¶
func (m *MultiHeadAttnWeightKind) Output() MultiHeadAttnWeightKind
Output - sets value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_O_WEIGHTS) and returns that value. From cudnn.h - output projection weights
func (*MultiHeadAttnWeightKind) Queries ¶
func (m *MultiHeadAttnWeightKind) Queries() MultiHeadAttnWeightKind
Queries - sets value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_Q_WEIGHTS) and returns that value. From cudnn.h - input projection weights for 'queries'
func (MultiHeadAttnWeightKind) String ¶
func (m MultiHeadAttnWeightKind) String() string
func (*MultiHeadAttnWeightKind) Values ¶
func (m *MultiHeadAttnWeightKind) Values() MultiHeadAttnWeightKind
Values - sets value to MultiHeadAttnWeightKind(C.CUDNN_MH_ATTN_V_WEIGHTS) and returns that value. From cudnn.h - input projection weights for 'values'
type NANProp ¶
type NANProp C.cudnnNanPropagation_t
NANProp is the type for C.cudnnNanPropagation_t flags, which are set and read through the type's methods
func (*NANProp) NotPropigate ¶
NotPropigate sets p to NANProp(C.CUDNN_NOT_PROPAGATE_NAN) and returns that value
type OPTensorD ¶
type OPTensorD struct {
// contains filtered or unexported fields
}
OPTensorD holds OP Tensor information
func CreateOpTensorDescriptor ¶
CreateOpTensorDescriptor creates and sets an OpTensor descriptor
func (*OPTensorD) Get ¶
func (t *OPTensorD) Get() (op OpTensorOp, dtype DataType, nan NANProp, err error)
Get returns the descriptor information with error
func (*OPTensorD) OpTensor ¶
func (t *OPTensorD) OpTensor( handle *Handle, alpha1 float64, aD *TensorD, A cutil.Mem, alpha2 float64, bD *TensorD, B cutil.Mem, beta float64, cD *TensorD, cmem cutil.Mem) error
OpTensor performs an operation on some tensors: C = op((alpha1 * A), (alpha2 * B)) + (beta * C)
func (*OPTensorD) OpTensorUS ¶
func (t *OPTensorD) OpTensorUS( handle *Handle, alpha1 float64, aD *TensorD, A unsafe.Pointer, alpha2 float64, bD *TensorD, B unsafe.Pointer, beta float64, cD *TensorD, cmem unsafe.Pointer) error
OpTensorUS is like OpTensor but uses unsafe.Pointer instead of cutil.Mem
type OpTensorOp ¶
type OpTensorOp C.cudnnOpTensorOp_t
OpTensorOp is used for flags for the Optensor functions
func (*OpTensorOp) Add ¶
func (o *OpTensorOp) Add() OpTensorOp
Add sets o to OpTensorOp(C.CUDNN_OP_TENSOR_ADD) and returns the new value
func (*OpTensorOp) Max ¶
func (o *OpTensorOp) Max() OpTensorOp
Max sets o to OpTensorOp(C.CUDNN_OP_TENSOR_MAX) and returns the new value
func (*OpTensorOp) Min ¶
func (o *OpTensorOp) Min() OpTensorOp
Min sets o to OpTensorOp(C.CUDNN_OP_TENSOR_MIN) and returns the new value
func (*OpTensorOp) Mul ¶
func (o *OpTensorOp) Mul() OpTensorOp
Mul sets o to OpTensorOp(C.CUDNN_OP_TENSOR_MUL) and returns the new value
func (*OpTensorOp) Not ¶
func (o *OpTensorOp) Not() OpTensorOp
Not sets o to OpTensorOp(C.CUDNN_OP_TENSOR_NOT) and returns the new value
func (*OpTensorOp) Sqrt ¶
func (o *OpTensorOp) Sqrt() OpTensorOp
Sqrt sets o to OpTensorOp(C.CUDNN_OP_TENSOR_SQRT) and returns the new value
func (OpTensorOp) String ¶
func (o OpTensorOp) String() string
type PersistentRNNPlan ¶
type PersistentRNNPlan struct {
// contains filtered or unexported fields
}
PersistentRNNPlan holds C.cudnnPersistentRNNPlan_t
func (*PersistentRNNPlan) DestroyPersistentRNNPlan ¶
func (p *PersistentRNNPlan) DestroyPersistentRNNPlan() error
DestroyPersistentRNNPlan destroys the C.cudnnPersistentRNNPlan_t in the PersistentRNNPlan struct
type PoolingD ¶
type PoolingD struct {
// contains filtered or unexported fields
}
PoolingD handles the pooling descriptor
func CreatePoolingDescriptor ¶
CreatePoolingDescriptor creates a pooling descriptor.
func (*PoolingD) Backward ¶
func (p *PoolingD) Backward( handle *Handle, alpha float64, yD *TensorD, y cutil.Mem, dyD *TensorD, dy cutil.Mem, xD *TensorD, x cutil.Mem, beta float64, dxD *TensorD, dx cutil.Mem, ) error
Backward does the backward pooling operation
func (*PoolingD) BackwardUS ¶
func (p *PoolingD) BackwardUS( handle *Handle, alpha float64, yD *TensorD, y unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, xD *TensorD, x unsafe.Pointer, beta float64, dxD *TensorD, dx unsafe.Pointer, ) error
BackwardUS is like Backward but uses unsafe.Pointer instead of cutil.Mem
func (*PoolingD) Destroy ¶
Destroy destroys the pooling descriptor.
Right now this is handled by the Go GC exclusively, but sometime in the future users of the package will be able to toggle it.
func (*PoolingD) Forward ¶
func (p *PoolingD) Forward( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, yD *TensorD, y cutil.Mem, ) error
Forward does the poolingForward operation
func (*PoolingD) ForwardUS ¶
func (p *PoolingD) ForwardUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, yD *TensorD, y unsafe.Pointer, ) error
ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem
func (*PoolingD) Get ¶
func (p *PoolingD) Get() (mode PoolingMode, nan NANProp, window, padding, stride []int32, err error)
Get gets the descriptor values for pooling
func (*PoolingD) GetOutputDims ¶
GetOutputDims will return the forward output dims from the pooling descriptor and the tensor passed in. For NHWC, gocudnn will take the cudnn dims (which are in NCHW) and convert them to NHWC.
type PoolingMode ¶
type PoolingMode C.cudnnPoolingMode_t
PoolingMode is used for flags in pooling
func (*PoolingMode) AverageCountExcludePadding ¶
func (p *PoolingMode) AverageCountExcludePadding() PoolingMode
AverageCountExcludePadding returns PoolingMode(C.CUDNN_POOLING_AVERAGE_COUNT_EXCLUDE_PADDING) flag
Values inside the pooling window are averaged. The number of elements used to calculate the average excludes spatial locations falling in the padding region.
func (*PoolingMode) AverageCountIncludePadding ¶
func (p *PoolingMode) AverageCountIncludePadding() PoolingMode
AverageCountIncludePadding returns PoolingMode(C.CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING) flag
Values inside the pooling window are averaged. The number of elements used to calculate the average includes spatial locations falling in the padding region.
func (*PoolingMode) Max ¶
func (p *PoolingMode) Max() PoolingMode
Max returns PoolingMode(C.CUDNN_POOLING_MAX) flag
The maximum value inside the pooling window is used.
func (*PoolingMode) MaxDeterministic ¶
func (p *PoolingMode) MaxDeterministic() PoolingMode
MaxDeterministic returns PoolingMode(C.CUDNN_POOLING_MAX_DETERMINISTIC) flag
The maximum value inside the pooling window is used. The algorithm used is deterministic.
func (PoolingMode) String ¶
func (p PoolingMode) String() string
type RNNAlgo ¶
type RNNAlgo C.cudnnRNNAlgo_t
RNNAlgo is used for flags and exposes the different flags through its methods
func (*RNNAlgo) PersistDynamic ¶
PersistDynamic sets r to and returns RNNAlgo( C.CUDNN_RNN_ALGO_PERSIST_DYNAMIC) flag
func (*RNNAlgo) PersistStatic ¶
PersistStatic sets r to and returns RNNAlgo( C.CUDNN_RNN_ALGO_PERSIST_STATIC) flag
type RNNBiasMode ¶
type RNNBiasMode C.cudnnRNNBiasMode_t
RNNBiasMode handles bias flags for RNN. Flags are exposed through the type's methods.
func (*RNNBiasMode) Double ¶
func (b *RNNBiasMode) Double() RNNBiasMode
Double sets b to and returns RNNBiasMode(C.CUDNN_RNN_DOUBLE_BIAS)
func (*RNNBiasMode) NoBias ¶
func (b *RNNBiasMode) NoBias() RNNBiasMode
NoBias sets b to and returns RNNBiasMode(C.CUDNN_RNN_NO_BIAS)
func (*RNNBiasMode) SingleINP ¶
func (b *RNNBiasMode) SingleINP() RNNBiasMode
SingleINP sets b to and returns RNNBiasMode(C.CUDNN_RNN_SINGLE_INP_BIAS)
func (*RNNBiasMode) SingleREC ¶
func (b *RNNBiasMode) SingleREC() RNNBiasMode
SingleREC sets b to and returns RNNBiasMode(C.CUDNN_RNN_SINGLE_REC_BIAS)
func (RNNBiasMode) String ¶
func (b RNNBiasMode) String() string
String satisfies the stringer interface
type RNNClipMode ¶
type RNNClipMode C.cudnnRNNClipMode_t
RNNClipMode is a flag for the clipmode for an RNN
func (*RNNClipMode) MinMax ¶
func (r *RNNClipMode) MinMax() RNNClipMode
MinMax sets r to and returns RNNClipMode(C.CUDNN_RNN_CLIP_MINMAX)
func (*RNNClipMode) None ¶
func (r *RNNClipMode) None() RNNClipMode
None sets r to and returns RNNClipMode(C.CUDNN_RNN_CLIP_NONE)
func (RNNClipMode) String ¶
func (r RNNClipMode) String() string
type RNND ¶
type RNND struct {
// contains filtered or unexported fields
}
RNND holds the RNN descriptor
func CreateRNNDescriptor ¶
CreateRNNDescriptor creates an RNND descriptor
func (*RNND) BackwardDataEx ¶
func (r *RNND) BackwardDataEx(h *Handle, yD *RNNDataD, y cutil.Mem, dyD *RNNDataD, dy cutil.Mem, dhyD *TensorD, dhy cutil.Mem, dcyD *TensorD, dcy cutil.Mem, wD *FilterD, w cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, dxD *RNNDataD, dx cutil.Mem, dhxD *TensorD, dhx cutil.Mem, dcxD *TensorD, dcx cutil.Mem, wspace cutil.Mem, wspacesib uint, rspace cutil.Mem, rspacesib uint) error
BackwardDataEx - Taken from cudnn documentation This routine is the extended version of the function cudnnRNNBackwardData. This function cudnnRNNBackwardDataEx allows the user to use unpacked (padded) layout for input y and output dx. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example, the nth sequence in the mini-batch, is composed of a valid segment specified by the seqLengthArray[n] in its corresponding RNNDataDescriptor; and a padding segment to make the combined sequence length equal to maxSeqLength.
With the unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNBackwardData, the sequences in the mini-batch need to be sorted in descending order according to length.
Parameters:
handle - Input. The handle passed to all cudnn funcs. Needs to be initialized before use.
yD -Input. A previously initialized RNN data descriptor.
Must match or be the exact same descriptor previously passed into ForwardTrainingEx.
y -Input. Data pointer to the GPU memory associated with the RNN data descriptor yD.
The vectors are expected to be laid out in memory according to the layout specified by yD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported. Must contain the exact same data previously produced by ForwardTrainingEx.
dyD -Input. A previously initialized RNN data descriptor.
The dataType, layout, maxSeqLength , batchSize, vectorSize and seqLengthArray need to match the yD previously passed to ForwardTrainingEx.
dy -Input.Data pointer to the GPU memory associated with the RNN data descriptor dyD.
The vectors are expected to be laid out in memory according to the layout specified by dyD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
dhyD -Input. A fully packed tensor descriptor describing the gradients at the final hidden state of the RNN.
The first dimension of the tensor depends on the direction argument passed to the (*RNND)Set(params) call used to initialize rnnDesc. Moreover: if direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to (*RNND)Set(params); if direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to (*RNND)Set(params).
The second dimension must match the batchSize parameter in xD.
The third dimension depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled. Moreover:
If RNN mode is CUDNN_LSTM and LSTM projection is enabled, the third dimension must match the recProjSize argument passed to the (*RNND)SetProjectionLayers(params) call used to set rnnDesc. Otherwise, the third dimension must match the hiddenSize argument passed to the (*RNND)Set(params) call used to initialize rnnDesc.
dhy - Input. Data pointer to GPU memory associated with the tensor descriptor dhyD. If a NULL pointer is passed, the gradients at the final hidden state of the network will be initialized to zero.
dcyD - Input. A fully packed tensor descriptor describing the gradients at the final cell state of the RNN. The first dimension of the tensor depends on the direction argument passed to the (*RNND)Set(params) call used to initialize rnnDesc. Moreover:
If direction is CUDNN_UNIDIRECTIONAL the first dimension should match the numLayers argument passed to (*RNND)Set(params). If direction is CUDNN_BIDIRECTIONAL the first dimension should match double the numLayers argument passed to (*RNND)Set(params). The second dimension must match the first dimension of the tensors described in xD.
The third dimension must match the hiddenSize argument passed to the (*RNND)Set(params) call used to initialize rnnDesc. The tensor must be fully packed.
dcy - Input. Data pointer to GPU memory associated with the tensor descriptor dcyD. If a NULL pointer is passed, the gradients at the final cell state of the network will be initialized to zero.
wD -Input. Handle to a previously initialized filter descriptor describing the weights for the RNN.
w -Input. Data pointer to GPU memory associated with the filter descriptor wD.
hxD -Input. A fully packed tensor descriptor describing the initial hidden state of the RNN. Must match or be the exact same descriptor previously passed into ForwardTrainingEx.
hx -Input. Data pointer to GPU memory associated with the tensor descriptor hxD. If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero. Must contain the exact same data previously passed into ForwardTrainingEx, or be NULL if NULL was previously passed to ForwardTrainingEx.
cxD - Input. A fully packed tensor descriptor describing the initial cell state for LSTM networks. Must match or be the exact same descriptor previously passed into ForwardTrainingEx.
cx -Input. Data pointer to GPU memory associated with the tensor descriptor cxD. If a NULL pointer is passed, the initial cell state of the network will be initialized to zero. Must contain the exact same data previously passed into ForwardTrainingEx, or be NULL if NULL was previously passed to ForwardTrainingEx.
dxD - Input. A previously initialized RNN data descriptor. The dataType, layout, maxSeqLength, batchSize, vectorSize and seqLengthArray need to match that of xD previously passed to ForwardTrainingEx.
dx -Output. Data pointer to the GPU memory associated with the RNN data descriptor dxD. The vectors are expected to be laid out in memory according to the layout specified by dxD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
dhxD -Input. A fully packed tensor descriptor describing the gradient at the initial hidden state of the RNN. The descriptor must be set exactly the same way as dhyD.
dhx- Output. Data pointer to GPU memory associated with the tensor descriptor dhxD. If a NULL pointer is passed, the gradient at the hidden input of the network will not be set.
dcxD-Input. A fully packed tensor descriptor describing the gradient at the initial cell state of the RNN. The descriptor must be set exactly the same way as dcyD.
dcx -Output. Data pointer to GPU memory associated with the tensor descriptor dcxD. If a NULL pointer is passed, the gradient at the cell input of the network will not be set.
wspace - Input. Data pointer to GPU memory to be used as a wspace for this call. wspacesib - Input. Specifies the size in bytes of the provided wspace.
rspace - Input/Output. Data pointer to GPU memory to be used as a reserve space for this call. rspacesib - Input. Specifies the size in bytes of the provided rspace.
func (*RNND) BackwardDataExUS ¶
func (r *RNND) BackwardDataExUS(h *Handle, yD *RNNDataD, y unsafe.Pointer, dyD *RNNDataD, dy unsafe.Pointer, dhyD *TensorD, dhy unsafe.Pointer, dcyD *TensorD, dcy unsafe.Pointer, wD *FilterD, w unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, dxD *RNNDataD, dx unsafe.Pointer, dhxD *TensorD, dhx unsafe.Pointer, dcxD *TensorD, dcx unsafe.Pointer, wspace unsafe.Pointer, wspacesib uint, rspace unsafe.Pointer, rspacesib uint) error
BackwardDataExUS is like BackwardDataEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) BackwardWeights ¶
func (r *RNND) BackwardWeights( handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, yD []*TensorD, y cutil.Mem, wspace cutil.Mem, wspacesize uint, dwD *FilterD, dw cutil.Mem, rspace cutil.Mem, rspacesize uint, ) error
BackwardWeights does the backward weight function
func (*RNND) BackwardWeightsEx ¶
func (r *RNND) BackwardWeightsEx(h *Handle, xD *RNNDataD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, yD *RNNDataD, y cutil.Mem, wspace cutil.Mem, wspacesib uint, dwD *FilterD, dw cutil.Mem, rspace cutil.Mem, rspacesib uint, ) error
BackwardWeightsEx - from the cudnn documentation: This routine is the extended version of the function cudnnRNNBackwardWeights. The function cudnnRNNBackwardWeightsEx allows the user to use unpacked (padded) layout for input x and output dw. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example, the nth sequence in the mini-batch, is composed of a valid segment specified by the seqLengthArray[n] in its corresponding RNNDataDescriptor, and a padding segment to make the combined sequence length equal to maxSeqLength. With the unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNBackwardWeights, the sequences in the mini-batch need to be sorted in descending order according to length.
Parameters:
handle - Input. Handle to a previously created cuDNN context.
xD - Input. A previously initialized RNN data descriptor. Must match or
be the exact same descriptor previously passed into ForwardTrainingEx.
x - Input. Data pointer to GPU memory associated with the tensor descriptors
in the array xD. Must contain the exact same data previously passed into ForwardTrainingEx.
hxD - Input. A fully packed tensor descriptor describing the initial hidden state of the RNN.
Must match or be the exact same descriptor previously passed into ForwardTrainingEx.
hx - Input. Data pointer to GPU memory associated with the tensor descriptor hxD.
If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero. Must contain the exact same data previously passed into ForwardTrainingEx, or be NULL if NULL was previously passed to ForwardTrainingEx.
yD - Input. A previously initialized RNN data descriptor.
Must match or be the exact same descriptor previously passed into ForwardTrainingEx.
y -Input. Data pointer to GPU memory associated with the output tensor descriptor yD.
Must contain the exact same data previously produced by ForwardTrainingEx.
wspace - Input. Data pointer to GPU memory to be used as a wspace for this call.
wspacesib - Input. Specifies the size in bytes of the provided wspace.
dwD- Input. Handle to a previously initialized filter descriptor describing the gradients of the weights for the RNN.
dw - Input/Output. Data pointer to GPU memory associated with the filter descriptor dwD.
rspace - Input. Data pointer to GPU memory to be used as a reserve space for this call.
rspacesib - Input. Specifies the size in bytes of the provided rspace
func (*RNND) BackwardWeightsExUS ¶
func (r *RNND) BackwardWeightsExUS(h *Handle, xD *RNNDataD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, yD *RNNDataD, y unsafe.Pointer, wspace unsafe.Pointer, wspacesib uint, dwD *FilterD, dw unsafe.Pointer, rspace unsafe.Pointer, rspacesib uint, ) error
BackwardWeightsExUS is like BackwardWeightsEx but with unsafe.Pointer instead of cutil.Mem
func (*RNND) BackwardWeightsUS ¶
func (r *RNND) BackwardWeightsUS( handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, yD []*TensorD, y unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint, dwD *FilterD, dw unsafe.Pointer, rspace unsafe.Pointer, rspacesize uint, ) error
BackwardWeightsUS is like BackwardWeights but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) Destroy ¶
Destroy destroys the descriptor. Right now this doesn't work because gocudnn uses Go's GC.
func (*RNND) FindRNNBackwardDataAlgorithmEx ¶
func (r *RNND) FindRNNBackwardDataAlgorithmEx( handle *Handle, yD []*TensorD, y cutil.Mem, dyD []*TensorD, dy cutil.Mem, dhyD *TensorD, dhy cutil.Mem, dcyD *TensorD, dcy cutil.Mem, wD *FilterD, w cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, dxD []*TensorD, dx cutil.Mem, dhxD *TensorD, dhx cutil.Mem, dcxD *TensorD, dcx cutil.Mem, findIntensity float32, wspace cutil.Mem, wspacesize uint, rspace cutil.Mem, rspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNBackwardDataAlgorithmEx finds a list of Algorithm for backprop. This passes around 26 parameters and pointers, so watch out.
func (*RNND) FindRNNBackwardDataAlgorithmExUS ¶
func (r *RNND) FindRNNBackwardDataAlgorithmExUS( handle *Handle, yD []*TensorD, y unsafe.Pointer, dyD []*TensorD, dy unsafe.Pointer, dhyD *TensorD, dhy unsafe.Pointer, dcyD *TensorD, dcy unsafe.Pointer, wD *FilterD, w unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, dxD []*TensorD, dx unsafe.Pointer, dhxD *TensorD, dhx unsafe.Pointer, dcxD *TensorD, dcx unsafe.Pointer, findIntensity float32, wspace unsafe.Pointer, wspacesize uint, rspace unsafe.Pointer, rspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNBackwardDataAlgorithmExUS is like FindRNNBackwardDataAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) FindRNNBackwardWeightsAlgorithmEx ¶
func (r *RNND) FindRNNBackwardWeightsAlgorithmEx( handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, yD []*TensorD, y cutil.Mem, findIntensity float32, wspace cutil.Mem, wspacesize uint, dwD *FilterD, dw cutil.Mem, rspace cutil.Mem, rspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNBackwardWeightsAlgorithmEx returns a list of Algorithm and their performance
func (*RNND) FindRNNBackwardWeightsAlgorithmExUS ¶
func (r *RNND) FindRNNBackwardWeightsAlgorithmExUS( handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, yD []*TensorD, y unsafe.Pointer, findIntensity float32, wspace unsafe.Pointer, wspacesize uint, dwD *FilterD, dw unsafe.Pointer, rspace unsafe.Pointer, rspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNBackwardWeightsAlgorithmExUS is like FindRNNBackwardWeightsAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) FindRNNForwardInferenceAlgorithmEx ¶
func (r *RNND) FindRNNForwardInferenceAlgorithmEx( handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, wD *FilterD, w cutil.Mem, yD []*TensorD, y cutil.Mem, hyD *TensorD, hy cutil.Mem, cyD *TensorD, cy cutil.Mem, findIntensity float32, wspace cutil.Mem, wspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNForwardInferenceAlgorithmEx finds the inference algorithmEx
func (*RNND) FindRNNForwardInferenceAlgorithmExUS ¶
func (r *RNND) FindRNNForwardInferenceAlgorithmExUS( handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD []*TensorD, y unsafe.Pointer, hyD *TensorD, hy unsafe.Pointer, cyD *TensorD, cy unsafe.Pointer, findIntensity float32, wspace unsafe.Pointer, wspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNForwardInferenceAlgorithmExUS is like FindRNNForwardInferenceAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) FindRNNForwardTrainingAlgorithmEx ¶
func (r *RNND) FindRNNForwardTrainingAlgorithmEx( handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, wD *FilterD, w cutil.Mem, yD []*TensorD, y cutil.Mem, hyD *TensorD, hy cutil.Mem, cyD *TensorD, cy cutil.Mem, findIntensity float32, reqAlgocount int32, wspace cutil.Mem, wspacesize uint, rspace cutil.Mem, rspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNForwardTrainingAlgorithmEx finds and orders the performance of rnn Algorithm for training, and returns that list with an error
func (*RNND) FindRNNForwardTrainingAlgorithmExUS ¶
func (r *RNND) FindRNNForwardTrainingAlgorithmExUS( handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD []*TensorD, y unsafe.Pointer, hyD *TensorD, hy unsafe.Pointer, cyD *TensorD, cy unsafe.Pointer, findIntensity float32, wspace unsafe.Pointer, wspacesize uint, rspace unsafe.Pointer, rspacesize uint, ) ([]AlgorithmPerformance, error)
FindRNNForwardTrainingAlgorithmExUS is like FindRNNForwardTrainingAlgorithmEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) ForwardInferenceEx ¶
func (r *RNND) ForwardInferenceEx( h *Handle, xD *RNNDataD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, wD *FilterD, w cutil.Mem, yD *RNNDataD, y cutil.Mem, hyD *TensorD, hy cutil.Mem, cyD *TensorD, cy cutil.Mem, wspace cutil.Mem, wspacesib uint, ) error
ForwardInferenceEx - from cudnn documentation: This routine is the extended version of the cudnnRNNForwardInference function. ForwardInferenceEx allows the user to use an unpacked (padded) layout for input x and output y. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example the nth sequence in the mini-batch, is composed of a valid segment, specified by seqLengthArray[n] in its corresponding RNNDataDescriptor, and a padding segment to make the combined sequence length equal to maxSeqLength.
With unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNForwardInference, the sequences in the mini-batch need to be sorted in descending order according to length.
Parameters ¶
handle - Input. Handle to a previously created cuDNN context.
xD - Input. A previously initialized RNN Data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray need to match that of yD.
x - Input. Data pointer to the GPU memory associated with the RNN data descriptor xD. The vectors are expected to be laid out in memory according to the layout specified by xD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
hxD - Input. A fully packed tensor descriptor describing the initial hidden state of the RNN. The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc:
- If direction is CUDNN_UNIDIRECTIONAL, the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
- If direction is CUDNN_BIDIRECTIONAL, the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter described in xD. The third dimension depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled. Specifically:
- If RNN mode is CUDNN_LSTM and LSTM projection is enabled, the third dimension must match the recProjSize argument passed to the cudnnSetRNNProjectionLayers call used to set rnnDesc.
- Otherwise, the third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc.
hx - Input. Data pointer to GPU memory associated with the tensor descriptor hxD. If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero.
cxD - Input. A fully packed tensor descriptor describing the initial cell state for LSTM networks. The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc:
- If direction is CUDNN_UNIDIRECTIONAL, the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
- If direction is CUDNN_BIDIRECTIONAL, the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter in xD. The third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc.
cx - Input. Data pointer to GPU memory associated with the tensor descriptor cxD. If a NULL pointer is passed, the initial cell state of the network will be initialized to zero.
wD - Input. Handle to a previously initialized filter descriptor describing the weights for the RNN.
w - Input. Data pointer to GPU memory associated with the filter descriptor wD.
yD - Input. A previously initialized RNN data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray must match that of xD. The parameter vectorSize depends on whether RNN mode is CUDNN_LSTM, whether LSTM projection is enabled, and whether the network is bidirectional. Specifically:
- If RNN mode is CUDNN_LSTM and LSTM projection is enabled, vectorSize must match the recProjSize argument passed to the cudnnSetRNNProjectionLayers call used to set rnnDesc. If the network is bidirectional, multiply that value by 2.
- Otherwise, vectorSize must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc. If the network is bidirectional, multiply that value by 2.
y - Output. Data pointer to the GPU memory associated with the RNN data descriptor yD.
The vectors are expected to be laid out in memory according to the layout specified by yD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
hyD - Input. A fully packed tensor descriptor describing the final hidden state of the RNN. The descriptor must be set exactly the same way as hxD.
hy - Output. Data pointer to GPU memory associated with the tensor descriptor hyD. If a NULL pointer is passed, the final hidden state of the network will not be saved.
cyD - Input. A fully packed tensor descriptor describing the final cell state for LSTM networks. The descriptor must be set exactly the same way as cxD.
cy - Output. Data pointer to GPU memory associated with the tensor descriptor cyD. If a NULL pointer is passed, the final cell state of the network will not be saved.
wspace - Input. Data pointer to GPU memory to be used as a workspace for this call.
wspacesib - Input. Specifies the size in bytes of the provided wspace.
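A minimal sketch of calling ForwardInferenceEx, assuming the handle h, the RNND r, the descriptors, the weight buffer, and the device memory below (all hypothetical names) were created and set elsewhere; nil may be passed wherever the parameter notes above allow a NULL pointer:

err := r.ForwardInferenceEx(h,
	xD, x,    // padded input sequences, laid out per xD
	hxD, nil, // nil initial hidden state => initialized to zero
	cxD, nil, // nil initial cell state => initialized to zero
	wD, w,    // RNN weights
	yD, y,    // padded output sequences
	hyD, nil, // nil => final hidden state not saved
	cyD, nil, // nil => final cell state not saved
	wspace, wspacesib)
if err != nil {
	panic(err)
}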
func (*RNND) ForwardInferenceExUS ¶
func (r *RNND) ForwardInferenceExUS( h *Handle, xD *RNNDataD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD *RNNDataD, y unsafe.Pointer, hyD *TensorD, hy unsafe.Pointer, cyD *TensorD, cy unsafe.Pointer, wspace unsafe.Pointer, wspacesib uint, ) error
ForwardInferenceExUS is like ForwardInferenceEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) ForwardTrainingEx ¶
func (r *RNND) ForwardTrainingEx(h *Handle, xD *RNNDataD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, wD *FilterD, w cutil.Mem, yD *RNNDataD, y cutil.Mem, hyD *TensorD, hy cutil.Mem, cyD *TensorD, cy cutil.Mem, wspace cutil.Mem, wspacesib uint, rspace cutil.Mem, rspacesib uint) error
ForwardTrainingEx - from cudnn documentation: This routine is the extended version of the cudnnRNNForwardTraining function. ForwardTrainingEx allows the user to use an unpacked (padded) layout for input x and output y. In the unpacked layout, each sequence in the mini-batch is considered to be of fixed length, specified by maxSeqLength in its corresponding RNNDataDescriptor. Each fixed-length sequence, for example the nth sequence in the mini-batch, is composed of a valid segment specified by seqLengthArray[n] in its corresponding RNNDataDescriptor, and a padding segment to make the combined sequence length equal to maxSeqLength. With the unpacked layout, both sequence major (i.e. time major) and batch major are supported. For backward compatibility, the packed sequence major layout is supported. However, similar to the non-extended function cudnnRNNForwardTraining, the sequences in the mini-batch need to be sorted in descending order according to length.
Parameters:
handle - Input. Handle to a previously created cuDNN context.
xD - Input. A previously initialized RNN Data descriptor. The dataType, layout, maxSeqLength , batchSize, and seqLengthArray need to match that of yD.
x - Input. Data pointer to the GPU memory associated with the RNN data descriptor xD.
The input vectors are expected to be laid out in memory according to the layout specified by xD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
hxD - Input. A fully packed tensor descriptor describing the initial hidden state of the RNN.
The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc:
- If direction is CUDNN_UNIDIRECTIONAL, the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
- If direction is CUDNN_BIDIRECTIONAL, the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter in xD. The third dimension depends on whether RNN mode is CUDNN_LSTM and whether LSTM projection is enabled:
- If RNN mode is CUDNN_LSTM and LSTM projection is enabled, the third dimension must match the recProjSize argument passed to the cudnnSetRNNProjectionLayers call used to set rnnDesc.
- Otherwise, the third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc.
hx - Input. Data pointer to GPU memory associated with the tensor descriptor hxD.
If a NULL pointer is passed, the initial hidden state of the network will be initialized to zero.
cxD - Input. A fully packed tensor descriptor describing the initial cell state for LSTM networks.
The first dimension of the tensor depends on the direction argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc:
- If direction is CUDNN_UNIDIRECTIONAL, the first dimension should match the numLayers argument passed to cudnnSetRNNDescriptor.
- If direction is CUDNN_BIDIRECTIONAL, the first dimension should match double the numLayers argument passed to cudnnSetRNNDescriptor.
The second dimension must match the batchSize parameter in xD. The third dimension must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc. The tensor must be fully packed.
cx - Input. Data pointer to GPU memory associated with the tensor descriptor cxD. If a NULL pointer
is passed, the initial cell state of the network will be initialized to zero.
wD - Input. Handle to a previously initialized filter descriptor describing the weights for the RNN.
w - Input. Data pointer to GPU memory associated with the filter descriptor wD.
yD - Input. A previously initialized RNN data descriptor. The dataType, layout, maxSeqLength, batchSize, and seqLengthArray need to match that of dyD and dxD (used later in the backward pass). The parameter vectorSize depends on whether RNN mode is CUDNN_LSTM, whether LSTM projection is enabled, and whether the network is bidirectional. Specifically:
- If RNN mode is CUDNN_LSTM and LSTM projection is enabled, vectorSize must match the recProjSize argument passed to the cudnnSetRNNProjectionLayers call used to set rnnDesc. If the network is bidirectional, multiply that value by 2.
- Otherwise, vectorSize must match the hiddenSize argument passed to the cudnnSetRNNDescriptor call used to initialize rnnDesc. If the network is bidirectional, multiply that value by 2.
y - Output. Data pointer to GPU memory associated with the RNN data descriptor yD.
The input vectors are expected to be laid out in memory according to the layout specified by yD. The elements in the tensor (including elements in the padding vector) must be densely packed, and no strides are supported.
hyD - Input. A fully packed tensor descriptor describing the final hidden state of the RNN. The descriptor must be set exactly the same as hxD.
hy - Output. Data pointer to GPU memory associated with the tensor descriptor hyD. If a NULL pointer is passed, the final hidden state of the network will not be saved.
cyD - Input. A fully packed tensor descriptor describing the final cell state for LSTM networks. The descriptor must be set exactly the same as cxD.
cy - Output. Data pointer to GPU memory associated with the tensor descriptor cyD. If a NULL pointer is passed, the final cell state of the network will not be saved.
wspace - Input. Data pointer to GPU memory to be used as a wspace for this call.
wspacesib - Input. Specifies the size in bytes of the provided wspace.
rspace - Input/Output. Data pointer to GPU memory to be used as a reserve space for this call.
rspacesib - Input. Specifies the size in bytes of the provided rspace.
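A minimal sketch of the training-time call, assuming the same hypothetical descriptors and buffers as in the inference sketch above, plus a reserve space; the reserve space written during this forward pass must be kept intact and handed to the matching backward calls:

err := r.ForwardTrainingEx(h,
	xD, x, hxD, hx, cxD, cx,
	wD, w,
	yD, y, hyD, hy, cyD, cy,
	wspace, wspacesib, // scratch space, reusable once the call completes
	rspace, rspacesib) // reserve space, must persist until the backward pass
if err != nil {
	panic(err)
}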
func (*RNND) ForwardTrainingExUS ¶
func (r *RNND) ForwardTrainingExUS(h *Handle, xD *RNNDataD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD *RNNDataD, y unsafe.Pointer, hyD *TensorD, hy unsafe.Pointer, cyD *TensorD, cy unsafe.Pointer, wspace unsafe.Pointer, wspacesib uint, rspace unsafe.Pointer, rspacesib uint) error
ForwardTrainingExUS is like ForwardTrainingEx but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) Get ¶
func (r *RNND) Get( handle *Handle, ) (int32, int32, *DropOutD, RNNInputMode, DirectionMode, RNNmode, RNNAlgo, DataType, error)
Get gets RNND values that were set
func (*RNND) GetBiasMode ¶
func (r *RNND) GetBiasMode() (bmode RNNBiasMode, err error)
GetBiasMode gets bias mode for descriptor
func (*RNND) GetClip ¶
func (r *RNND) GetClip(h *Handle) (mode RNNClipMode, nanprop NANProp, lclip, rclip float64, err error)
GetClip returns the clip settings for the descriptor
func (*RNND) GetLinLayerMatrixParams ¶
func (r *RNND) GetLinLayerMatrixParams( handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w cutil.Mem, linlayerID int32, ) (FilterD, unsafe.Pointer, error)
GetLinLayerMatrixParams gets the parameters of the layer matrix
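A hedged sketch of pulling one weight matrix out of the packed weight buffer, assuming r was already configured with Set and that w is the weight buffer described by wD (hypothetical names). In cudnn, pseudoLayer selects the layer/direction pair and linlayerID selects the individual gate matrix (for example, 0..7 for an LSTM):

mD, mPtr, err := r.GetLinLayerMatrixParams(handle, 0, xD, wD, w, 0)
if err != nil {
	panic(err)
}
_ = mD   // FilterD describing the dimensions of the selected matrix
_ = mPtr // unsafe.Pointer into w where that matrix begins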
func (*RNND) GetLinLayerMatrixParamsUS ¶
func (r *RNND) GetLinLayerMatrixParamsUS( handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w unsafe.Pointer, linlayerID int32, ) (FilterD, cutil.Mem, error)
GetLinLayerMatrixParamsUS is like GetLinLayerMatrixParams but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) GetPaddingMode ¶
func (r *RNND) GetPaddingMode() (mode RNNPaddingMode, err error)
GetPaddingMode gets padding mode for the descriptor
func (*RNND) GetParamsSIB ¶
GetParamsSIB gets the size in bytes of the RNN parameter (weight) space.
func (*RNND) GetProjectionLayers ¶
GetProjectionLayers gets the RNN projection layers.
func (*RNND) GetRNNLinLayerBiasParams ¶
func (r *RNND) GetRNNLinLayerBiasParams( handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w cutil.Mem, linlayerID int32, ) (BiasD *FilterD, Bias cutil.Mem, err error)
GetRNNLinLayerBiasParams gets the parameters of the layer bias
func (*RNND) GetRNNLinLayerBiasParamsUS ¶
func (r *RNND) GetRNNLinLayerBiasParamsUS( handle *Handle, pseudoLayer int32, xD *TensorD, wD *FilterD, w unsafe.Pointer, linlayerID int32, ) (BiasD *FilterD, Bias unsafe.Pointer, err error)
GetRNNLinLayerBiasParamsUS is like GetRNNLinLayerBiasParams but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) GetRNNMatrixMathType ¶
GetRNNMatrixMathType gets the math type for the descriptor
func (*RNND) GetReserveSIB ¶
GetReserveSIB gets the training reserve size
func (*RNND) GetWorkspaceSIB ¶
GetWorkspaceSIB gets the RNN workspace size in bytes.
func (*RNND) NewPersistentRNNPlan ¶
func (r *RNND) NewPersistentRNNPlan(minibatch int32, data DataType) (plan *PersistentRNNPlan, err error)
NewPersistentRNNPlan creates and sets a PersistentRNNPlan
func (*RNND) RNNBackwardData ¶
func (r *RNND) RNNBackwardData( handle *Handle, yD []*TensorD, y cutil.Mem, dyD []*TensorD, dy cutil.Mem, dhyD *TensorD, dhy cutil.Mem, dcyD *TensorD, dcy cutil.Mem, wD *FilterD, w cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, dxD []*TensorD, dx cutil.Mem, dhxD *TensorD, dhx cutil.Mem, dcxD *TensorD, dcx cutil.Mem, wspace cutil.Mem, wspacesize uint, rspace cutil.Mem, rspacesize uint, ) error
RNNBackwardData is the backward algo for an RNN
func (*RNND) RNNBackwardDataUS ¶
func (r *RNND) RNNBackwardDataUS( handle *Handle, yD []*TensorD, y unsafe.Pointer, dyD []*TensorD, dy unsafe.Pointer, dhyD *TensorD, dhy unsafe.Pointer, dcyD *TensorD, dcy unsafe.Pointer, wD *FilterD, w unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, dxD []*TensorD, dx unsafe.Pointer, dhxD *TensorD, dhx unsafe.Pointer, dcxD *TensorD, dcx unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint, rspace unsafe.Pointer, rspacesize uint, ) error
RNNBackwardDataUS is like RNNBackwardData but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) RNNForwardInference ¶
func (r *RNND) RNNForwardInference( handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, wD *FilterD, w cutil.Mem, yD []*TensorD, y cutil.Mem, hyD TensorD, hy cutil.Mem, cyD TensorD, cy cutil.Mem, wspace cutil.Mem, wspacesize uint, ) error
RNNForwardInference is the forward inference
func (*RNND) RNNForwardInferenceUS ¶
func (r *RNND) RNNForwardInferenceUS( handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD []*TensorD, y unsafe.Pointer, hyD TensorD, hy unsafe.Pointer, cyD TensorD, cy unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint, ) error
RNNForwardInferenceUS is like RNNForwardInference but uses unsafe.Pointer instead of cutil.Mem
func (*RNND) RNNForwardTraining ¶
func (r *RNND) RNNForwardTraining( handle *Handle, xD []*TensorD, x cutil.Mem, hxD *TensorD, hx cutil.Mem, cxD *TensorD, cx cutil.Mem, wD *FilterD, w cutil.Mem, yD []*TensorD, y cutil.Mem, hyD *TensorD, hy cutil.Mem, cyD *TensorD, cy cutil.Mem, wspace cutil.Mem, wspacesize uint, rspace cutil.Mem, rspacesize uint, ) error
RNNForwardTraining is the forward algo for an RNN
func (*RNND) RNNForwardTrainingUS ¶
func (r *RNND) RNNForwardTrainingUS( handle *Handle, xD []*TensorD, x unsafe.Pointer, hxD *TensorD, hx unsafe.Pointer, cxD *TensorD, cx unsafe.Pointer, wD *FilterD, w unsafe.Pointer, yD []*TensorD, y unsafe.Pointer, hyD *TensorD, hy unsafe.Pointer, cyD *TensorD, cy unsafe.Pointer, wspace unsafe.Pointer, wspacesize uint, rspace unsafe.Pointer, rspacesize uint, ) error
RNNForwardTrainingUS is like RNNForwardTraining but using unsafe.Pointer instead of cutil.Mem
func (*RNND) Set ¶
func (r *RNND) Set( handle *Handle, hiddenSize int32, numLayers int32, doD *DropOutD, inputmode RNNInputMode, direction DirectionMode, rnnmode RNNmode, rnnalg RNNAlgo, data DataType, ) error
Set sets up the RNN descriptor.
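A minimal sketch of configuring a descriptor, assuming r is the RNND and that handle and an already-set dropout descriptor doD exist (hypothetical names). Flag values whose methods are not shown in this section are left at their zero values as placeholders; real code should pick them through each flag type's methods:

var im gocudnn.RNNInputMode
var dir gocudnn.DirectionMode // placeholder zero value
var rm gocudnn.RNNmode        // placeholder zero value
var alg gocudnn.RNNAlgo       // placeholder zero value
var dt gocudnn.DataType
err := r.Set(handle, 512 /*hiddenSize*/, 2 /*numLayers*/, doD,
	im.Linear(), dir, rm, alg, dt.Float())
if err != nil {
	panic(err)
}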
func (*RNND) SetAlgorithmDescriptor ¶
func (r *RNND) SetAlgorithmDescriptor( handle *Handle, algo *AlgorithmD, ) error
SetAlgorithmDescriptor sets the RNN algorithm descriptor.
func (*RNND) SetBiasMode ¶
func (r *RNND) SetBiasMode(bmode RNNBiasMode) error
SetBiasMode sets the bias mode for descriptor
func (*RNND) SetPaddingMode ¶
func (r *RNND) SetPaddingMode(mode RNNPaddingMode) error
SetPaddingMode sets the padding mode with the flag passed.
func (*RNND) SetProjectionLayers ¶
SetProjectionLayers sets the RNN projection layers.
func (*RNND) SetRNNMatrixMathType ¶
SetRNNMatrixMathType sets the math type for the descriptor
type RNNDataD ¶
type RNNDataD struct {
// contains filtered or unexported fields
}
RNNDataD is an RNNDataDescriptor
func CreateRNNDataD ¶
CreateRNNDataD creates an RNNDataD through cudnn's cudnnCreateRNNDataDescriptor. The descriptor is registered with the runtime for garbage collection.
func (*RNNDataD) Destroy ¶
Destroy destroys the descriptor unless gogc is being used, in which case it will just return nil.
func (*RNNDataD) Get ¶
func (r *RNNDataD) Get() (dtype DataType, layout RNNDataLayout, maxSeqLength, vectorsize int32, seqLengthArray []int32, paddingsymbol float64, err error)
Get gets the parameters used in Set for RNNDataD
func (*RNNDataD) Set ¶
func (r *RNNDataD) Set(dtype DataType, layout RNNDataLayout, maxSeqLength, vectorsize int32, seqLengthArray []int32, paddingsymbol float64) error
Set sets the RNNDataD.
dtype - The datatype of the RNN data tensor. See cudnnDataType_t.
layout - The memory layout of the RNN data tensor.
maxSeqLength - The maximum sequence length within this RNN data tensor. In the unpacked (padded) layout, this should include the padding vectors in each sequence. In the packed (unpadded) layout, this should be equal to the greatest element in seqLengthArray.
vectorSize - The vector length (i.e. embedding size) of the input or output tensor at each timestep.
seqLengthArray - An integer array with batchSize number of elements. Describes the length (i.e. number of timesteps) of each sequence. Each element in seqLengthArray must be greater than 0 but less than or equal to maxSeqLength. In the packed layout, the elements should be sorted in descending order, similar to the layout required by the non-extended RNN compute functions.
paddingFill - For gocudnn, the value is automatically typecast into the correct datatype; just pass the value you want used as a float64. From the documentation: A user-defined symbol for filling the padding position in RNN output. This is only effective when the descriptor is describing the RNN output and the unpacked layout is specified. The symbol should be in host memory, and is interpreted as the same data type as that of the RNN data tensor.
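A minimal sketch of describing a padded mini-batch, assuming CreateRNNDataD returns (*RNNDataD, error) as its description suggests; the sizes are hypothetical:

var dt gocudnn.DataType
var layout gocudnn.RNNDataLayout
seqLens := []int32{5, 3, 2} // per-sequence lengths in the mini-batch
xD, err := gocudnn.CreateRNNDataD()
if err != nil {
	panic(err)
}
// Unpacked (padded), sequence-major layout; 0 fills the padding on output.
err = xD.Set(dt.Float(), layout.SeqMajorUnPacked(), 5 /*maxSeqLength*/, 16 /*vectorSize*/, seqLens, 0)
if err != nil {
	panic(err)
}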
type RNNDataLayout ¶
type RNNDataLayout C.cudnnRNNDataLayout_t
RNNDataLayout is used for data layout flags.
func (*RNNDataLayout) BatchMajorUnPacked ¶
func (r *RNNDataLayout) BatchMajorUnPacked() RNNDataLayout
BatchMajorUnPacked sets r to CUDNN_RNN_DATA_LAYOUT_BATCH_MAJOR_UNPACKED flag
func (*RNNDataLayout) SeqMajorPacked ¶
func (r *RNNDataLayout) SeqMajorPacked() RNNDataLayout
SeqMajorPacked sets r to CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_PACKED flag
func (*RNNDataLayout) SeqMajorUnPacked ¶
func (r *RNNDataLayout) SeqMajorUnPacked() RNNDataLayout
SeqMajorUnPacked sets r to and returns CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_UNPACKED flag
func (RNNDataLayout) String ¶
func (r RNNDataLayout) String() string
type RNNFlags ¶
type RNNFlags struct { Mode RNNmode Algo RNNAlgo Direction DirectionMode Input RNNInputMode }
RNNFlags holds all the RNN flags
type RNNInputMode ¶
type RNNInputMode C.cudnnRNNInputMode_t
RNNInputMode is used for flags and exposes the different flags through its methods
func (*RNNInputMode) Linear ¶
func (r *RNNInputMode) Linear() RNNInputMode
Linear sets r to and returns RNNInputMode(C.CUDNN_LINEAR_INPUT)
func (*RNNInputMode) Skip ¶
func (r *RNNInputMode) Skip() RNNInputMode
Skip sets r to and returns RNNInputMode(C.CUDNN_SKIP_INPUT)
func (RNNInputMode) String ¶
func (r RNNInputMode) String() string
type RNNPaddingMode ¶
type RNNPaddingMode C.cudnnRNNPaddingMode_t
RNNPaddingMode is the padding mode flag
func (*RNNPaddingMode) Disabled ¶
func (r *RNNPaddingMode) Disabled() RNNPaddingMode
Disabled sets r to and returns RNNPaddingMode(C.CUDNN_RNN_PADDED_IO_DISABLED)
func (*RNNPaddingMode) Enabled ¶
func (r *RNNPaddingMode) Enabled() RNNPaddingMode
Enabled sets r to and returns RNNPaddingMode(C.CUDNN_RNN_PADDED_IO_ENABLED)
func (RNNPaddingMode) String ¶
func (r RNNPaddingMode) String() string
type RNNmode ¶
type RNNmode C.cudnnRNNMode_t
RNNmode is used for flags, exposing the flags through its methods.
type ReduceTensorD ¶
type ReduceTensorD struct {
// contains filtered or unexported fields
}
ReduceTensorD is the struct that is used for reduce tensor ops
func CreateReduceTensorDescriptor ¶
func CreateReduceTensorDescriptor() (*ReduceTensorD, error)
CreateReduceTensorDescriptor creates an empty reduce tensor descriptor.
func (*ReduceTensorD) Destroy ¶
func (r *ReduceTensorD) Destroy() error
Destroy destroys the reducetensordescriptor
func (*ReduceTensorD) Get ¶
func (r *ReduceTensorD) Get() (reduceop ReduceTensorOp, datatype DataType, nanprop NANProp, reducetensorinds ReduceTensorIndices, indicietype IndiciesType, err error)
Get returns the values that were set for r by Set.
func (*ReduceTensorD) GetIndiciesSize ¶
func (r *ReduceTensorD) GetIndiciesSize( handle *Handle, aDesc, cDesc *TensorD) (uint, error)
GetIndiciesSize is a helper function that returns the minimum size in bytes of the index space to be passed to the reduction, given the input and output tensors.
func (*ReduceTensorD) GetWorkSpaceSize ¶
func (r *ReduceTensorD) GetWorkSpaceSize( handle *Handle, aDesc, cDesc *TensorD) (uint, error)
GetWorkSpaceSize is a helper function that returns the minimum size of the workspace to be passed to the reduction, given the input and output tensors.
func (*ReduceTensorD) ReduceTensorOp ¶
func (r *ReduceTensorD) ReduceTensorOp( handle *Handle, indices cutil.Mem, indiciessize uint, wspace cutil.Mem, wspacesize uint, alpha float64, aDesc *TensorD, A cutil.Mem, beta float64, cDesc *TensorD, Ce cutil.Mem) error
ReduceTensorOp performs the tensor operation: C = reduceop(alpha * A) + beta * C.
The NaN propagation enum applies only to the min and max reduce ops; the other reduce ops propagate NaN as usual. The indices space is ignored for reduce ops other than min or max.
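A minimal sketch of a max reduction C = max(A), assuming handle h, already-set tensor descriptors aD and cD (cD sized 1 along the reduced dims), and device buffers A, C, wspace, and indices allocated to the queried sizes (all hypothetical names). The NANProp and IndiciesType zero values are placeholders:

var op gocudnn.ReduceTensorOp
var dt gocudnn.DataType
var nan gocudnn.NANProp        // placeholder zero value
var itype gocudnn.IndiciesType // placeholder zero value
var inds gocudnn.ReduceTensorIndices
rD, err := gocudnn.CreateReduceTensorDescriptor()
if err != nil {
	panic(err)
}
if err = rD.Set(op.Max(), dt.Float(), nan, inds.FlattenedIndicies(), itype); err != nil {
	panic(err)
}
wsib, err := rD.GetWorkSpaceSize(h, aD, cD) // bytes to allocate for wspace
isib, err := rD.GetIndiciesSize(h, aD, cD)  // bytes to allocate for indices
// C = 1.0*max(A) + 0.0*C, with flattened indices written to the index buffer.
err = rD.ReduceTensorOp(h, indices, isib, wspace, wsib, 1, aD, A, 0, cD, C)
if err != nil {
	panic(err)
}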
func (*ReduceTensorD) ReduceTensorOpUS ¶
func (r *ReduceTensorD) ReduceTensorOpUS( handle *Handle, indices unsafe.Pointer, indiciessize uint, wspace unsafe.Pointer, wspacesize uint, alpha float64, aDesc *TensorD, A unsafe.Pointer, beta float64, cDesc *TensorD, Ce unsafe.Pointer) error
ReduceTensorOpUS is like ReduceTensorOp but uses unsafe.Pointer instead of cutil.Mem
func (*ReduceTensorD) Set ¶
func (r *ReduceTensorD) Set(reduceop ReduceTensorOp, datatype DataType, nanprop NANProp, reducetensorinds ReduceTensorIndices, indicietype IndiciesType) error
Set sets r with the values passed
func (*ReduceTensorD) String ¶
func (r *ReduceTensorD) String() string
String satisfies stringer interface
type ReduceTensorIndices ¶
type ReduceTensorIndices C.cudnnReduceTensorIndices_t
ReduceTensorIndices are used for flags exposed by type's methods
func (*ReduceTensorIndices) FlattenedIndicies ¶
func (r *ReduceTensorIndices) FlattenedIndicies() ReduceTensorIndices
FlattenedIndicies sets r to and returns ReduceTensorIndices(C.CUDNN_REDUCE_TENSOR_FLATTENED_INDICES)
func (*ReduceTensorIndices) NoIndices ¶
func (r *ReduceTensorIndices) NoIndices() ReduceTensorIndices
NoIndices sets r to and returns ReduceTensorIndices(C.CUDNN_REDUCE_TENSOR_NO_INDICES)
func (ReduceTensorIndices) String ¶
func (r ReduceTensorIndices) String() string
String satisfies stringer interface
type ReduceTensorOp ¶
type ReduceTensorOp C.cudnnReduceTensorOp_t
ReduceTensorOp is used for flags for the reduce tensor functions.
func (*ReduceTensorOp) Add ¶
func (r *ReduceTensorOp) Add() ReduceTensorOp
Add sets r to and returns reduceTensorAdd flag
func (*ReduceTensorOp) Amax ¶
func (r *ReduceTensorOp) Amax() ReduceTensorOp
Amax sets r to and returns reduceTensorAmax flag
func (*ReduceTensorOp) Avg ¶
func (r *ReduceTensorOp) Avg() ReduceTensorOp
Avg sets r to and returns reduceTensorAvg flag
func (*ReduceTensorOp) Max ¶
func (r *ReduceTensorOp) Max() ReduceTensorOp
Max sets r to and returns reduceTensorMax flag
func (*ReduceTensorOp) Min ¶
func (r *ReduceTensorOp) Min() ReduceTensorOp
Min sets r to and returns reduceTensorMin flag
func (*ReduceTensorOp) Mul ¶
func (r *ReduceTensorOp) Mul() ReduceTensorOp
Mul sets r to and returns reduceTensorMul flag
func (*ReduceTensorOp) MulNoZeros ¶
func (r *ReduceTensorOp) MulNoZeros() ReduceTensorOp
MulNoZeros sets r to and returns reduceTensorMulNoZeros flag
func (*ReduceTensorOp) Norm1 ¶
func (r *ReduceTensorOp) Norm1() ReduceTensorOp
Norm1 sets r to and returns reduceTensorNorm1 flag
func (*ReduceTensorOp) Norm2 ¶
func (r *ReduceTensorOp) Norm2() ReduceTensorOp
Norm2 sets r to and returns reduceTensorNorm2 flag
func (ReduceTensorOp) String ¶
func (r ReduceTensorOp) String() string
String satisfies stringer interface
type Reorder ¶
type Reorder C.cudnnReorderType_t
Reorder is a flag that is changed through its methods
type RuntimeTag ¶
type RuntimeTag C.cudnnRuntimeTag_t
RuntimeTag is a type that cudnn uses to check kernels to see if they are working correctly. Should be used with batch normalization.
type SamplerType ¶
type SamplerType C.cudnnSamplerType_t
SamplerType is used for flags
func (*SamplerType) Bilinear ¶
func (s *SamplerType) Bilinear() SamplerType
Bilinear sets s to SamplerType(C.CUDNN_SAMPLER_BILINEAR) and returns new value of s
func (SamplerType) String ¶
func (s SamplerType) String() string
type SeqDataAxis ¶
type SeqDataAxis C.cudnnSeqDataAxis_t
SeqDataAxis is a flag type that sets and returns SeqDataAxis flags through its methods. Caution: methods will also change the value of the variable that calls the method.
If you need to make a case switch, make another variable, call it flag, and use that.
func (*SeqDataAxis) Batch ¶
func (s *SeqDataAxis) Batch() SeqDataAxis
Batch -index in batch Method sets type to Batch and returns Batch value
func (*SeqDataAxis) Beam ¶
func (s *SeqDataAxis) Beam() SeqDataAxis
Beam -index in beam Method sets type to Beam and returns Beam value
func (*SeqDataAxis) Time ¶
func (s *SeqDataAxis) Time() SeqDataAxis
Time index in time. Method sets type to Time and returns Time value.
func (*SeqDataAxis) Vect ¶
func (s *SeqDataAxis) Vect() SeqDataAxis
Vect -index in Vector Method sets type to Vect and returns Vect value
type SeqDataD ¶
type SeqDataD struct {
// contains filtered or unexported fields
}
SeqDataD holds C.cudnnSeqDataDescriptor_t
func CreateSeqDataDescriptor ¶
CreateSeqDataDescriptor creates a new SeqDataD
func (*SeqDataD) Destroy ¶
Destroy will destroy the descriptor. For now, since everything is handled by the runtime, it does nothing.
func (*SeqDataD) Get ¶
func (s *SeqDataD) Get() (dtype DataType, dimsA []int32, axes []SeqDataAxis, seqLengthArray []int32, paddingfill float64, err error)
Get gets values used in setting up s
func (*SeqDataD) Set ¶
func (s *SeqDataD) Set(dtype DataType, dimsA []int32, axes []SeqDataAxis, seqLengthArray []int32, paddingfill float64) error
Set - from reading the documentation, this seems to be how you set it up, along with the possible workaround for gocudnn.
len(dimsA) and len(axes) need to equal 4. len(seqLengthArray) needs to be < dimsA[(*SeqDataAxis).Time()].
dimsA - contains the dims of the buffer that holds a batch of sequence samples. All values need to be positive.
- dimsA[(*SeqDataAxis).Time()] is the maximum allowed sequence length
- dimsA[(*SeqDataAxis).Batch()] is the maximum allowed batch size
- dimsA[(*SeqDataAxis).Beam()] is the number of beams in each sample
- dimsA[(*SeqDataAxis).Vect()] is the vector length
axes - the order in which the axes are laid out, from outermost to innermost. Kind of like an NCHW tensor, where N is the outer and W is the inner. Example (see the fuller sketch below):

var s SeqDataAxis
axes := []SeqDataAxis{s.Batch(), s.Time(), s.Beam(), s.Vect()}

seqLengthArray - array that holds the sequence lengths of each sequence.
paddingfill - points to a value, of dataType, that is used to fill up the buffer beyond the sequence length of each sequence. The only supported value for paddingfill is 0. paddingfill is auto-converted to the datatype it needs in the function.
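Putting that together, a minimal sketch with hypothetical sizes, assuming CreateSeqDataDescriptor returns (*SeqDataD, error):

var ax gocudnn.SeqDataAxis
var dt gocudnn.DataType
axes := []gocudnn.SeqDataAxis{ax.Batch(), ax.Time(), ax.Beam(), ax.Vect()}
dims := make([]int32, 4)
dims[ax.Time()] = 10 // maximum allowed sequence length
dims[ax.Batch()] = 4 // maximum allowed batch size
dims[ax.Beam()] = 1  // number of beams in each sample
dims[ax.Vect()] = 64 // vector (embedding) length
seqLens := []int32{9, 8, 8, 3} // per-sequence lengths
sd, err := gocudnn.CreateSeqDataDescriptor()
if err != nil {
	panic(err)
}
// 0 is the only supported padding fill; it is auto-converted to dt.
if err = sd.Set(dt.Float(), dims, axes, seqLens, 0); err != nil {
	panic(err)
}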
type SoftMaxAlgorithm ¶
type SoftMaxAlgorithm C.cudnnSoftmaxAlgorithm_t
SoftMaxAlgorithm is used for flags and are exposed through its methods
func (*SoftMaxAlgorithm) Accurate ¶
func (s *SoftMaxAlgorithm) Accurate() SoftMaxAlgorithm
Accurate changes s to and returns SoftMaxAlgorithm(C.CUDNN_SOFTMAX_ACCURATE)
func (*SoftMaxAlgorithm) Fast ¶
func (s *SoftMaxAlgorithm) Fast() SoftMaxAlgorithm
Fast changes s to and returns SoftMaxAlgorithm(C.CUDNN_SOFTMAX_FAST)
func (*SoftMaxAlgorithm) Log ¶
func (s *SoftMaxAlgorithm) Log() SoftMaxAlgorithm
Log changes s to and returns SoftMaxAlgorithm(C.CUDNN_SOFTMAX_LOG)
func (SoftMaxAlgorithm) String ¶
func (s SoftMaxAlgorithm) String() string
type SoftMaxD ¶
type SoftMaxD struct {
// contains filtered or unexported fields
}
SoftMaxD holds the soft max flags and soft max funcs
func CreateSoftMaxDescriptor ¶
func CreateSoftMaxDescriptor() *SoftMaxD
CreateSoftMaxDescriptor creates a gocudnn softmax descriptor. It is not part of cudnn, but I wanted to make the library a little more streamlined after using it for a while.
func (*SoftMaxD) Backward ¶
func (s *SoftMaxD) Backward( handle *Handle, alpha float64, yD *TensorD, y cutil.Mem, dyD *TensorD, dy cutil.Mem, beta float64, dxD *TensorD, dx cutil.Mem, ) error
Backward performs the backward softmax
Input/Output: dx
func (*SoftMaxD) BackwardUS ¶
func (s *SoftMaxD) BackwardUS( handle *Handle, alpha float64, yD *TensorD, y unsafe.Pointer, dyD *TensorD, dy unsafe.Pointer, beta float64, dxD *TensorD, dx unsafe.Pointer, ) error
BackwardUS is like Backward but uses unsafe.Pointer instead of cutil.Mem
func (*SoftMaxD) Forward ¶
func (s *SoftMaxD) Forward( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, yD *TensorD, y cutil.Mem) error
Forward performs forward softmax
Input/Output: y
func (*SoftMaxD) ForwardUS ¶
func (s *SoftMaxD) ForwardUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, yD *TensorD, y unsafe.Pointer) error
ForwardUS is like Forward but uses unsafe.Pointer instead of cutil.Mem
func (*SoftMaxD) Get ¶
func (s *SoftMaxD) Get() (algo SoftMaxAlgorithm, mode SoftMaxMode, err error)
Get gets the softmax descriptor values
func (*SoftMaxD) Set ¶
func (s *SoftMaxD) Set(algo SoftMaxAlgorithm, mode SoftMaxMode) error
Set sets the softmax algorithm and mode.
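A minimal sketch of a channel-wise softmax forward pass, assuming handle h, set tensor descriptors xD and yD, and device buffers x and y exist (hypothetical names):

var algo gocudnn.SoftMaxAlgorithm
var mode gocudnn.SoftMaxMode
sm := gocudnn.CreateSoftMaxDescriptor()
if err := sm.Set(algo.Accurate(), mode.Channel()); err != nil {
	panic(err)
}
// y = 1.0*softmax(x) + 0.0*y
if err := sm.Forward(h, 1, xD, x, 0, yD, y); err != nil {
	panic(err)
}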
type SoftMaxMode ¶
type SoftMaxMode C.cudnnSoftmaxMode_t
SoftMaxMode is used for softmaxmode flags and are exposed through its methods
func (*SoftMaxMode) Channel ¶
func (s *SoftMaxMode) Channel() SoftMaxMode
Channel changes s to SoftMaxMode(C.CUDNN_SOFTMAX_MODE_CHANNEL) and returns changed value
func (*SoftMaxMode) Instance ¶
func (s *SoftMaxMode) Instance() SoftMaxMode
Instance changes s to SoftMaxMode(C.CUDNN_SOFTMAX_MODE_INSTANCE) and returns changed value
func (SoftMaxMode) String ¶
func (s SoftMaxMode) String() string
type SpatialTransformerD ¶
type SpatialTransformerD struct {
// contains filtered or unexported fields
}
SpatialTransformerD holds the spatial transformer descriptor
func CreateSpatialTransformerDescriptor ¶
func CreateSpatialTransformerDescriptor() (*SpatialTransformerD, error)
CreateSpatialTransformerDescriptor creates the spatial transformer descriptor.
func (*SpatialTransformerD) Destroy ¶
func (s *SpatialTransformerD) Destroy() error
Destroy destroys the spatial transformer descriptor. If GC is enabled this function won't delete the transformer; it will only return nil. Since GC is automatically enabled, this function is not functional.
func (*SpatialTransformerD) GridGeneratorBackward ¶
func (s *SpatialTransformerD) GridGeneratorBackward( handle *Handle, grid cutil.Mem, theta cutil.Mem, ) error
GridGeneratorBackward - This function computes the gradient of a grid generation operation (the backward counterpart of GridGeneratorForward).
func (*SpatialTransformerD) GridGeneratorBackwardUS ¶
func (s *SpatialTransformerD) GridGeneratorBackwardUS( handle *Handle, grid unsafe.Pointer, theta unsafe.Pointer, ) error
GridGeneratorBackwardUS is like GridGeneratorBackward but uses unsafe.Pointer instead of cutil.Mem
func (*SpatialTransformerD) GridGeneratorForward ¶
func (s *SpatialTransformerD) GridGeneratorForward( handle *Handle, theta cutil.Mem, grid cutil.Mem, ) error
GridGeneratorForward generates a grid of coordinates in the input tensor corresponding to each pixel from the output tensor.
func (*SpatialTransformerD) GridGeneratorForwardUS ¶
func (s *SpatialTransformerD) GridGeneratorForwardUS( handle *Handle, theta unsafe.Pointer, grid unsafe.Pointer, ) error
GridGeneratorForwardUS is like GridGeneratorForward but uses unsafe.Pointer instead of cutil.Mem
func (*SpatialTransformerD) SamplerBackward ¶
func (s *SpatialTransformerD) SamplerBackward( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, beta float64, dxD *TensorD, dx cutil.Mem, alphaDgrid float64, dyD *TensorD, dy cutil.Mem, grid cutil.Mem, betaDgrid float64, dGrid cutil.Mem, ) error
SamplerBackward performs the spatial transformer sampler backward operation.
func (*SpatialTransformerD) SamplerBackwardUS ¶
func (s *SpatialTransformerD) SamplerBackwardUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, beta float64, dxD *TensorD, dx unsafe.Pointer, alphaDgrid float64, dyD *TensorD, dy unsafe.Pointer, grid unsafe.Pointer, betaDgrid float64, dGrid unsafe.Pointer, ) error
SamplerBackwardUS is like SamplerBackward but uses unsafe.Pointer instead of cutil.Mem
func (*SpatialTransformerD) SamplerForward ¶
func (s *SpatialTransformerD) SamplerForward( handle *Handle, alpha float64, xD *TensorD, x cutil.Mem, grid cutil.Mem, beta float64, yD *TensorD, y cutil.Mem, ) error
SamplerForward performs the spatial transformer sampler forward operation.
func (*SpatialTransformerD) SamplerForwardUS ¶
func (s *SpatialTransformerD) SamplerForwardUS( handle *Handle, alpha float64, xD *TensorD, x unsafe.Pointer, grid unsafe.Pointer, beta float64, yD *TensorD, y unsafe.Pointer, ) error
SamplerForwardUS is like SamplerForward but uses unsafe.Pointer instead of cutil.Mem
func (*SpatialTransformerD) Set ¶
func (s *SpatialTransformerD) Set(sampler SamplerType, data DataType, dimA []int32) error
Set sets the spatial transformer via an ND descriptor.
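A minimal sketch of the forward path, assuming the handle, the affine parameters theta, a grid buffer sized for the output, and set tensor descriptors with their device buffers all exist (hypothetical names; n, c, ht, wd are the output dims):

var sampler gocudnn.SamplerType
var dt gocudnn.DataType
st, err := gocudnn.CreateSpatialTransformerDescriptor()
if err != nil {
	panic(err)
}
if err = st.Set(sampler.Bilinear(), dt.Float(), []int32{n, c, ht, wd}); err != nil {
	panic(err)
}
// Generate sampling coordinates from theta, then sample x through them into y.
if err = st.GridGeneratorForward(handle, theta, grid); err != nil {
	panic(err)
}
if err = st.SamplerForward(handle, 1, xD, x, grid, 0, yD, y); err != nil {
	panic(err)
}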
type Status ¶
type Status C.cudnnStatus_t
Status is the status returned by cudnn.
const StatusSuccess Status = 0
StatusSuccess is the zero value of Status. None of the other Status flags are visible for now, outside of the Status.error() method.
func WrapErrorWithStatus ¶
WrapErrorWithStatus - if the error string contains a cudnnStatus_t string, it will return the Status and a nil error; if it doesn't, the Status will be the flag for CUDNN_STATUS_RUNTIME_FP_OVERFLOW and the returned error will not be nil.
type TensorD ¶
type TensorD struct {
// contains filtered or unexported fields
}
TensorD holds the cudnnTensorDescriptor, which is basically the tensor itself.
Example ¶
ExampleTensorD shows how to make a tensor.
package main

import (
	"fmt"
	"runtime"

	"github.com/dereklstinson/gocudnn/gocu"
	"github.com/dereklstinson/gocudnn/cudart"
	gocudnn "github.com/dereklstinson/gocudnn"
)

func main() {
	//Need to lock os thread.
	runtime.LockOSThread()
	check := func(e error) {
		if e != nil {
			panic(e)
		}
	}
	//Creating a blocking stream
	cs, err := cudart.CreateBlockingStream()
	check(err)
	//Create Device
	dev := cudart.CreateDevice(1)
	//Make an Allocator
	worker := gocu.NewWorker(dev)
	//cs could be nil. Check out the cudart package for more about streams.
	CudaMemManager, err := cudart.CreateMemManager(worker)
	check(err)
	//Tensor
	var tflg gocudnn.TensorFormat //Flag for tensor
	var dtflg gocudnn.DataType    //Flag for tensor
	xD, err := gocudnn.CreateTensorDescriptor()
	check(err)
	//Setting Tensor
	err = xD.Set(tflg.NCHW(), dtflg.Float(), []int32{20, 1, 1, 1}, nil)
	check(err)
	//Gets SIB for tensor memory on device
	xSIB, err := xD.GetSizeInBytes()
	check(err)
	//Allocating memory to device and returning pointer to device memory
	x, err := CudaMemManager.Malloc(xSIB)
	check(err)
	//Create some host mem to copy to cuda memory
	hostmem := make([]float32, xSIB/4)
	//You can fill it
	for i := range hostmem {
		hostmem[i] = float32(i)
	}
	//Convert the slice to GoMem
	hostptr, err := gocu.MakeGoMem(hostmem)
	check(err)
	//Copy hostmem to x
	CudaMemManager.Copy(x, hostptr, xSIB)
	//This mem manager syncs the cuda stream after every copy. You can make
	//your own custom one. This was a default one to help others get going.
	//Some "extra" functions beyond the api require an allocator.
	//If not using an allocator, sync the stream before changing the host mem
	//right after a mem copy. It could cause problems.
	err = cs.Sync()
	check(err)
	//Zero out the golang host mem.
	for i := range hostmem {
		hostmem[i] = float32(0)
	}
	//Do some tensor stuff; vals can be returned to host mem by doing another copy.
	err = CudaMemManager.Copy(hostptr, x, xSIB)
	check(err)
	fmt.Println(hostmem)
}
Output: [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
func CreateTensorDescriptor ¶
CreateTensorDescriptor creates an empty tensor descriptor
func (*TensorD) Destroy ¶
Destroy destroys the tensor. In the future I am going to add a GC setting that will enable or disable the GC. When the GC is disabled it will allow the user more control over memory. Right now it does nothing and returns nil.
func (*TensorD) Get ¶
func (t *TensorD) Get() (frmt TensorFormat, dtype DataType, shape []int32, stride []int32, err error)
Get returns the TensorFormat, DataType, shape dims, stride dims, and an error. For descriptors without stride it will still return junk info, so be mindful when you code.
func (*TensorD) GetSizeInBytes ¶
GetSizeInBytes returns the size in bytes and an error.
func (*TensorD) Set ¶
func (t *TensorD) Set(frmt TensorFormat, data DataType, shape, stride []int32) error
Set sets the tensor according to the values passed. This is all different from how cudnn does it. In cudnn, stride dictates the format of the tensor. Here it is different: if format is Unknown then strides will dictate the format. If NHWC is chosen then gocudnn will swap things around to make TensorD behave more like FilterD.
Basic 4D formats:
NCHW: shape[0] = # of batches, shape[1] = # of channels, shape[2] = height, shape[3] = width
NHWC: shape[0] = # of batches, shape[1] = height, shape[2] = width, shape[3] = # of channels
Strided: Strided is kind of hard to explain, so here is an example of how values would be placed:

n, c, h, w := 3, 3, 256, 256 // a batch of 3 rgb images of size 256x256
dims := []int{n, c, h, w}    // here we have the dims set
chw := c * h * w
hw := h * w
stride := []int{chw, hw, w, 1} // this is how stride is usually set

// If you wanted to get or place a value at a certain location:
func GetValue(tensor []float32, location, stride [4]int) float32 {
	l, s := location, stride
	// As you can see, the stride changes where you look in the tensor.
	return tensor[(l[0]*s[0])+(l[1]*s[1])+(l[2]*s[2])+(l[3]*s[3])]
}

Notes:
1) The total size of a tensor, including the potential padding between dimensions, is limited to 2 Giga-elements of type datatype. Tensors are restricted to having at least 4 dimensions and at most DimMax (a const with a value of 8 at the time of writing) dimensions. When working with lower dimensional data, it is recommended that the user create a 4D tensor and set the size along unused dimensions to 1.
2) Stride is ignored if frmt is set to anything other than frmt.Strided(), so it can be set to nil in that case.
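A minimal sketch contrasting the two ways of describing the batch from the stride example above, assuming xD came from CreateTensorDescriptor:

var frmt gocudnn.TensorFormat
var dt gocudnn.DataType
n, c, h, w := int32(3), int32(3), int32(256), int32(256)
// 1) Let the format dictate packed strides; stride can be nil.
err := xD.Set(frmt.NCHW(), dt.Float(), []int32{n, c, h, w}, nil)
if err != nil {
	panic(err)
}
// 2) Spell the strides out with the Strided format.
chw, hw := c*h*w, h*w
err = xD.Set(frmt.Strided(), dt.Float(), []int32{n, c, h, w}, []int32{chw, hw, w, 1})
if err != nil {
	panic(err)
}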
type TensorFormat ¶
type TensorFormat C.cudnnTensorFormat_t
TensorFormat is the type used for flags to set tensor format. The type contains methods that change the value of the type. Caution: methods will also change the value of the variable that calls the method.
If you need to make a case switch, make another variable, call it flag, and use that. Look at String.
These are semi-custom gocudnn flags. NCHW, NHWC, and NCHWvectC come from cudnn; gocudnn adds Strided and Unknown. Reasoning:
Strided - When the tensor is set with strides, there is no TensorFormat flag passed. Also, cudnnGetTensor4dDescriptor and cudnnGetTensorNdDescriptor don't return the tensor format, which is really annoying. gocudnn hides this flag in TensorD so that it can be returned with the tensor.
Unknown - Added because, at least with the new AttentionD in cudnn V7.5, cudnn will make a descriptor for you, and the tensor format is not known. Rather than assume, gocudnn marks it with this flag.
func (*TensorFormat) NCHW ¶
func (t *TensorFormat) NCHW() TensorFormat
NCHW returns TensorFormat(C.CUDNN_TENSOR_NCHW). Method sets type and returns new value.
func (*TensorFormat) NCHWvectC ¶
func (t *TensorFormat) NCHWvectC() TensorFormat
NCHWvectC returns TensorFormat(C.CUDNN_TENSOR_NCHW_VECT_C). Method sets type and returns new value.
func (*TensorFormat) NHWC ¶
func (t *TensorFormat) NHWC() TensorFormat
NHWC returns TensorFormat(C.CUDNN_TENSOR_NHWC). Method sets type and returns new value.
func (TensorFormat) String ¶
func (t TensorFormat) String() string
String will return a human readable string that can be printed for debugging.
func (*TensorFormat) Unknown ¶
func (t *TensorFormat) Unknown() TensorFormat
Unknown returns TensorFormat(128). This is a custom gocudnn flag; read the TensorFormat notes for an explanation. Method sets type and returns new value.
type TransformD ¶
type TransformD struct {
// contains filtered or unexported fields
}
TransformD holds the transform tensor descriptor
func CreateTransformDescriptor ¶
func CreateTransformDescriptor() (*TransformD, error)
CreateTransformDescriptor creates a transform descriptor
Needs to be Set with Set method.
func (*TransformD) Destroy ¶
func (t *TransformD) Destroy() error
Destroy will destroy the tensor if not using GC; if GC is used then it will do nothing.
func (*TransformD) Get ¶
func (t *TransformD) Get() (destFormat TensorFormat, padBefore, padAfter []int32, foldA []uint32, direction FoldingDirection, err error)
Get gets the values of the transform descriptor
func (*TransformD) InitDest ¶
func (t *TransformD) InitDest(src *TensorD) (dest *TensorD, destsib uint, err error)
InitDest initializes and returns a destination tensor descriptor dest for tensor transform operations, along with its size in bytes. The initialization is done with the parameters described in the transform descriptor t. Note: the returned tensor descriptor will be packed.
func (*TransformD) Set ¶
func (t *TransformD) Set(nbDims uint32, destFormat TensorFormat, padBefore, padAfter []int32, foldA []uint32, direction FoldingDirection) error
Set sets the TransformD
padBefore, padAfter, and foldA can be nil if not using any one of those. The custom flags gocudnn added for TensorFormat will cause an error.
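A minimal sketch of setting a transform and initializing its destination descriptor, assuming srcD is an already-set 4D TensorD (hypothetical) and leaving the FoldingDirection at its zero value as a placeholder:

var destFrmt gocudnn.TensorFormat
var dir gocudnn.FoldingDirection // placeholder zero value; choose via its flag methods
tD, err := gocudnn.CreateTransformDescriptor()
if err != nil {
	panic(err)
}
// padBefore, padAfter, and foldA are nil because none of those features are used here.
if err = tD.Set(4, destFrmt.NHWC(), nil, nil, nil, dir); err != nil {
	panic(err)
}
destD, destSIB, err := tD.InitDest(srcD) // packed destination descriptor and its size in bytes
if err != nil {
	panic(err)
}
_, _ = destD, destSIB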
func (*TransformD) String ¶
func (t *TransformD) String() string
func (*TransformD) TransformFilter ¶
func (t *TransformD) TransformFilter(h *Handle, alpha float64, srcD *FilterD, src cutil.Mem, beta float64, destD *FilterD, dest cutil.Mem) error
TransformFilter performs the transform on a filter.
type WgradMode ¶
type WgradMode C.cudnnWgradMode_t
WgradMode is used for flags and can be changed through methods
Source Files ¶
- cgoflags.go
- cudnn.go
- cudnnActivation.go
- cudnnAlgorithm.go
- cudnnAttention.go
- cudnnAttention_Seq.go
- cudnnBatchNorm.go
- cudnnBatchNormEx.go
- cudnnCTCLoss.go
- cudnnCallback.go
- cudnnConvolution.go
- cudnnConvolution_algofindbd.go
- cudnnConvolution_algofindbf.go
- cudnnConvolution_algofindfw.go
- cudnnDeConvolution_algofindbd.go
- cudnnDeConvolution_algofindbf.go
- cudnnDeConvolution_algofindfw.go
- cudnnDeconvolution.go
- cudnnDropOut.go
- cudnnFilter.go
- cudnnLRN.go
- cudnnLibproprtpe.go
- cudnnOpTensor.go
- cudnnPooling.go
- cudnnRNN.go
- cudnnRNN_Bias.go
- cudnnRNN_algofindbwd.go
- cudnnRNN_algofindbwf.go
- cudnnRNN_algofindfw.go
- cudnnRNN_clip.go
- cudnnRNN_data_padding.go
- cudnnReduce.go
- cudnnSoftMax.go
- cudnnSpatial.go
- cudnnStatus.go
- cudnnTensor.go
- cudnnTensorStringer.go
- cudnnTransform.go
- cudnnhandle.go
- runtimememmanage.go
- zhelperfuncs.go
Directories ¶
Path | Synopsis
---|---
cublas | Package cublas - Blas functions for cuda gpus.
crtutil | Package crtutil allows cudart to work with Go's io Reader and Writer interfaces.
gocu | Package gocu contains common interfaces to allow the different cuda packages/libraries to intermix with each other and with go.
xtra | Package xtra is just some functions that use cuda and kernels to make functions that I use that are useful in deep learning.