Documentation ¶
Index ¶
- Constants
- func CreateModule(k Kernel, dev cuda.Device) (*cuda.Module, error)
- type Kernel
- func AdaDelta() Kernel
- func AdaDeltaFP16() Kernel
- func AdaGrad() Kernel
- func AdaGradFP16() Kernel
- func Adam() Kernel
- func AdamFP16() Kernel
- func AllKerns() (all []Kernel)
- func ConcatBackwardNCHW() Kernel
- func ConcatBackwardNCHWFP16() Kernel
- func ConcatForwardNCHW() Kernel
- func ConcatForwardNCHWFP16() Kernel
- func L1L2() Kernel
- func L1L2FP16() Kernel
- func LeakyBackward() Kernel
- func LeakyBackwardAlpha() Kernel
- func LeakyBackwardAlphaBeta() Kernel
- func LeakyBackwardAlphaBetaFP16() Kernel
- func LeakyBackwardAlphaFP16() Kernel
- func LeakyBackwardFP16() Kernel
- func LeakyForward() Kernel
- func LeakyForwardAlpha() Kernel
- func LeakyForwardAlphaBeta() Kernel
- func LeakyForwardAlphaBetaFP16() Kernel
- func LeakyForwardAlphaFP16() Kernel
- func LeakyForwardFP16() Kernel
- func MSELoss() Kernel
- func MSELossFP16() Kernel
- func MSELossbyBatches() Kernel
- func MSELossbyBatchesFP16() Kernel
- func MakePlanarImageBatchesUint8() Kernel
- func NearestNeighborNCHW() Kernel
- func NearestNeighborNCHWBack() Kernel
- func NearestNeighborNCHWBackFP16() Kernel
- func NearestNeighborNCHWFP16() Kernel
- func NearestNeighborNHWC() Kernel
- func NearestNeighborNHWCBack() Kernel
- func NearestNeighborNHWCBackFP16() Kernel
- func NearestNeighborNHWCFP16() Kernel
- func PreluBackward() Kernel
- func PreluBackwardFP16() Kernel
- func PreluForward() Kernel
- func PreluForwardFP16() Kernel
- func Segment1stDim() Kernel
- func Segment1stDimFP16() Kernel
- func ShapeToBatch4DNHWC() Kernel
- func ShapeToBatch4DNHWCFP16() Kernel
- func ShapetoBatch4DNCHW() Kernel
- func ShapetoBatch4DNCHWFP16() Kernel
- func SwapEveryOther() Kernel
- func SwapEveryOtherFP16() Kernel
- func SwapUpperLower() Kernel
- func SwapUpperLowerFP16() Kernel
- func ThreshBackward() Kernel
- func ThreshBackwardFP16() Kernel
- func ThreshForward() Kernel
- func ThreshForwardFP16() Kernel
- func Transpose() Kernel
- func TransposeFP16() Kernel
Constants ¶
const Defines = `` /* 445-byte string literal not displayed */
Defines are used for all kernels
const Headers = `
#include <cuda.h>
#include <stdbool.h>
#include <cuda_fp16.h>
`
Headers are used for all kernels
Functions ¶
Types ¶
type Kernel ¶
Kernel is used to build kernels
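As a rough illustration of how the shared Headers and Defines strings could be combined with a single kernel, the sketch below joins the three pieces into one CUDA source string. The `kernel` struct is a local stand-in with the `Name` and `Code` fields that appear in this package's Kernel literals; the `assembleSource` helper and the ordering of the pieces are assumptions, not the package's actual build step.

```go
package main

import "strings"

// kernel is a hypothetical stand-in for the package's Kernel type.
// Only the Name and Code fields are modelled here.
type kernel struct {
	Name string
	Code string
}

// assembleSource joins the shared headers, the shared defines, and one
// kernel body into a single source string, one piece per line group.
// The ordering (headers, defines, code) is an assumption.
func assembleSource(headers, defines string, k kernel) string {
	var b strings.Builder
	b.WriteString(headers)
	b.WriteString("\n")
	b.WriteString(defines)
	b.WriteString("\n")
	b.WriteString(k.Code)
	b.WriteString("\n")
	return b.String()
}
```

A caller would pass the package's Headers and Defines constants plus the fields of a Kernel value, then hand the result to whatever compiles the module.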
func ConcatBackwardNCHW ¶
func ConcatBackwardNCHW() Kernel
ConcatBackwardNCHW is the backward pass of concat for NCHW tensors. It has not been tested.
func ConcatBackwardNCHWFP16 ¶
func ConcatBackwardNCHWFP16() Kernel
ConcatBackwardNCHWFP16 is the half-precision (FP16) version of ConcatBackwardNCHW.
func ConcatForwardNCHW ¶
func ConcatForwardNCHW() Kernel
Unfinished stub for an NHWC variant (the loop bodies are empty):

    func ConcatForwardNHWC() Kernel {
        return Kernel{
            Name: `ConcatForwardNHWC`,
            Code: `extern "C" __global__ void ConcatForwardNHWC(
                const int XThreads,
                const int YThreads,
                const int Batch,
                const int nsrcs,
                const float **srcs,
                const int **srcchansize,
                const int destchansize,
                float* Dest){
                CUDA_GRID_LOOP_AXIS(i, YThreads, y){
                    CUDA_GRID_LOOP_AXIS(j, XThreads, x){
                    }
                }
            }`,
        }
    }

ConcatForwardNCHW concatenates NCHW tensors. It has not been tested.
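To make the NCHW layout concrete, here is a minimal CPU reference sketch of a forward concat, assuming the concatenation runs along the channel axis and every source shares the same batch size and spatial size. The function name and parameters are hypothetical and do not mirror the kernel's argument list.

```go
package main

// concatForwardNCHW concatenates srcs along the channel axis.
// Source i holds batch*chans[i]*hw values in NCHW order, where hw is
// the number of elements in one H*W plane.
func concatForwardNCHW(srcs [][]float32, chans []int, batch, hw int) []float32 {
	total := 0 // channel count of the destination
	for _, c := range chans {
		total += c
	}
	dest := make([]float32, batch*total*hw)
	for n := 0; n < batch; n++ {
		offset := 0 // channel offset inside the destination for this source
		for s, src := range srcs {
			size := chans[s] * hw // elements one batch item contributes
			copy(dest[(n*total+offset)*hw:], src[n*size:(n+1)*size])
			offset += chans[s]
		}
	}
	return dest
}
```

Because channels are contiguous per batch item in NCHW, each source contributes one contiguous run per batch item, which is why a plain `copy` is enough here.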
func ConcatForwardNCHWFP16 ¶
func ConcatForwardNCHWFP16() Kernel
ConcatForwardNCHWFP16 is the half-precision (FP16) version of ConcatForwardNCHW.
func LeakyBackwardAlpha ¶
func LeakyBackwardAlpha() Kernel
LeakyBackwardAlpha is the backward function for LeakyForwardAlpha.
func LeakyBackwardAlphaBeta ¶
func LeakyBackwardAlphaBeta() Kernel
LeakyBackwardAlphaBeta is the backward function for LeakyForwardAlphaBeta.
func LeakyBackwardAlphaBetaFP16 ¶
func LeakyBackwardAlphaBetaFP16() Kernel
LeakyBackwardAlphaBetaFP16 is the backward function for LeakyForwardAlphaBetaFP16.
func LeakyBackwardAlphaFP16 ¶
func LeakyBackwardAlphaFP16() Kernel
LeakyBackwardAlphaFP16 is the backward function for LeakyForwardAlphaFP16.
func LeakyBackwardFP16 ¶
func LeakyBackwardFP16() Kernel
LeakyBackwardFP16 is the backward function for LeakyForwardFP16.
func LeakyForwardAlphaBeta ¶
func LeakyForwardAlphaBeta() Kernel
LeakyForwardAlphaBeta is the leaky activation
func LeakyForwardAlphaBetaFP16 ¶
func LeakyForwardAlphaBetaFP16() Kernel
LeakyForwardAlphaBetaFP16 is the leaky activation
func LeakyForwardAlphaFP16 ¶
func LeakyForwardAlphaFP16() Kernel
LeakyForwardAlphaFP16 is the half-precision (FP16) leaky activation
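For reference, the usual leaky activation and its gradient can be sketched on the CPU as below. This assumes the common definition (pass positive inputs through, scale the rest by a small coefficient); the `coef` parameter and both function names are hypothetical, and the alpha and beta scaling arguments of the Alpha and AlphaBeta variants are not modelled.

```go
package main

// leakyForward returns y where y[i] = x[i] for x[i] > 0
// and y[i] = coef*x[i] otherwise.
func leakyForward(x []float32, coef float32) []float32 {
	y := make([]float32, len(x))
	for i, v := range x {
		if v > 0 {
			y[i] = v
		} else {
			y[i] = coef * v
		}
	}
	return y
}

// leakyBackward returns dx where dx[i] = dy[i] for x[i] > 0
// and dx[i] = coef*dy[i] otherwise. x is the forward input.
func leakyBackward(x, dy []float32, coef float32) []float32 {
	dx := make([]float32, len(x))
	for i, v := range x {
		if v > 0 {
			dx[i] = dy[i]
		} else {
			dx[i] = coef * dy[i]
		}
	}
	return dx
}
```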
func MSELossFP16 ¶
func MSELossFP16() Kernel
MSELossFP16 computes the mean squared error loss in half precision (FP16)
func MSELossbyBatches ¶
func MSELossbyBatches() Kernel
MSELossbyBatches computes the mean squared error loss per batch. Good for GANs.
func MSELossbyBatchesFP16 ¶
func MSELossbyBatchesFP16() Kernel
MSELossbyBatchesFP16 computes the mean squared error loss per batch in half precision (FP16). Good for GANs.
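A per-batch loss means one scalar per batch item instead of a single scalar for the whole tensor. The CPU sketch below shows that shape of computation. It assumes the loss is the plain mean of squared differences over each batch item; the kernel may scale differently (for example by one half), and the function name is hypothetical.

```go
package main

// mseLossByBatches returns one mean squared error value per batch item.
// pred and target hold batch equally sized items stored back to back.
func mseLossByBatches(pred, target []float32, batch int) []float32 {
	per := len(pred) / batch // elements in one batch item
	out := make([]float32, batch)
	for n := 0; n < batch; n++ {
		var sum float32
		for i := n * per; i < (n+1)*per; i++ {
			d := pred[i] - target[i]
			sum += d * d
		}
		out[n] = sum / float32(per)
	}
	return out
}
```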
func MakePlanarImageBatchesUint8 ¶
func MakePlanarImageBatchesUint8() Kernel
MakePlanarImageBatchesUint8 requires every batch to have the same number of channels, and every channel to be the same size
func NearestNeighborNCHW ¶
func NearestNeighborNCHW() Kernel
NearestNeighborNCHW is a nearest neighbor resize function
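A nearest neighbor resize copies, for each destination pixel, the source pixel at the proportionally scaled coordinate. The sketch below is a CPU reference for the NCHW layout. It assumes the source coordinate is the floor of `dst * srcSize / dstSize`; the kernel's rounding rule may differ, and the function name and parameters are hypothetical.

```go
package main

// nearestNeighborNCHW resizes every (batch, channel) plane of src from
// sh x sw to dh x dw by picking the nearest source pixel (floor mapping).
func nearestNeighborNCHW(src []float32, n, c, sh, sw, dh, dw int) []float32 {
	dst := make([]float32, n*c*dh*dw)
	for p := 0; p < n*c; p++ { // p indexes one H*W plane
		for y := 0; y < dh; y++ {
			sy := y * sh / dh // source row for destination row y
			for x := 0; x < dw; x++ {
				sx := x * sw / dw // source column for destination column x
				dst[(p*dh+y)*dw+x] = src[(p*sh+sy)*sw+sx]
			}
		}
	}
	return dst
}
```

In NCHW the planes are contiguous, so batch and channel can be folded into the single plane index `p`; an NHWC version would instead keep the channel as the innermost index.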
func NearestNeighborNCHWBack ¶
func NearestNeighborNCHWBack() Kernel
NearestNeighborNCHWBack is the backward function for the NCHW nearest neighbor resize
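Assuming the Back kernels compute the gradient of the forward resize, each resized gradient value is added to the source pixel it was read from. The CPU sketch below shows that accumulation for NCHW with the same floor mapping as a forward pass would use; the function name, parameters, and rounding rule are assumptions.

```go
package main

// nearestNeighborNCHWBack accumulates the gradient dy (dh x dw planes)
// into a gradient for the original sh x sw planes. Every destination
// pixel adds its gradient to the source pixel it was copied from.
func nearestNeighborNCHWBack(dy []float32, n, c, sh, sw, dh, dw int) []float32 {
	dx := make([]float32, n*c*sh*sw)
	for p := 0; p < n*c; p++ { // p indexes one H*W plane
		for y := 0; y < dh; y++ {
			sy := y * sh / dh
			for x := 0; x < dw; x++ {
				sx := x * sw / dw
				dx[(p*sh+sy)*sw+sx] += dy[(p*dh+y)*dw+x]
			}
		}
	}
	return dx
}
```

On a GPU several threads can target the same source pixel, so a kernel would need atomic adds or a gather-style loop over the source pixels to avoid races.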
func NearestNeighborNCHWBackFP16 ¶
func NearestNeighborNCHWBackFP16() Kernel
NearestNeighborNCHWBackFP16 is the FP16 backward function for the NCHW nearest neighbor resize
func NearestNeighborNCHWFP16 ¶
func NearestNeighborNCHWFP16() Kernel
NearestNeighborNCHWFP16 is a nearest neighbor resize function
func NearestNeighborNHWC ¶
func NearestNeighborNHWC() Kernel
NearestNeighborNHWC is a nearest neighbor resize function
func NearestNeighborNHWCBack ¶
func NearestNeighborNHWCBack() Kernel
NearestNeighborNHWCBack is the backward function for the NHWC nearest neighbor resize
func NearestNeighborNHWCBackFP16 ¶
func NearestNeighborNHWCBackFP16() Kernel
NearestNeighborNHWCBackFP16 is the FP16 backward function for the NHWC nearest neighbor resize
func NearestNeighborNHWCFP16 ¶
func NearestNeighborNHWCFP16() Kernel
NearestNeighborNHWCFP16 is a nearest neighbor resize function
func PreluBackwardFP16 ¶
func PreluBackwardFP16() Kernel
PreluBackwardFP16 is the backward function for PreluForwardFP16.
func Segment1stDim ¶
func Segment1stDim() Kernel
Segment1stDim segments the first dim of a tensor. It is paired with code on the host.
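One plausible reading of segmenting the first dim is extracting a contiguous range of first-dimension entries into a smaller tensor. The CPU sketch below shows that reading only; the function name, the `start` and `count` parameters, and the interpretation itself are assumptions rather than the kernel's documented behavior.

```go
package main

// segment1stDim copies count consecutive first-dimension entries,
// beginning at index start, out of a tensor whose first dimension has
// dim0 entries. All trailing dimensions are kept whole.
func segment1stDim(src []float32, dim0, start, count int) []float32 {
	stride := len(src) / dim0 // elements per first-dimension entry
	out := make([]float32, count*stride)
	copy(out, src[start*stride:(start+count)*stride])
	return out
}
```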
func Segment1stDimFP16 ¶
func Segment1stDimFP16() Kernel
Segment1stDimFP16 segments the first dim of a tensor. It is paired with code on the host.
func ShapeToBatch4DNHWC ¶
func ShapeToBatch4DNHWC() Kernel
ShapeToBatch4DNHWC does a strided shape to batch. Make sure the values on the receiving end are set to zero when s2b is 0.
func ShapeToBatch4DNHWCFP16 ¶
func ShapeToBatch4DNHWCFP16() Kernel
ShapeToBatch4DNHWCFP16 does a strided shape to batch. Make sure the values on the receiving end are set to zero when s2b is 0.
func ShapetoBatch4DNCHW ¶
func ShapetoBatch4DNCHW() Kernel
ShapetoBatch4DNCHW does a strided shape to batch. Make sure the values on the receiving end are set to zero when s2b is 0.
func ShapetoBatch4DNCHWFP16 ¶
func ShapetoBatch4DNCHWFP16() Kernel
ShapetoBatch4DNCHWFP16 is like ShapetoBatch4DNCHW
func SwapEveryOther ¶
func SwapEveryOther() Kernel
SwapEveryOther swaps batches between 2 tensors, either the even ones or the odd ones. Both tensors have to be equal in size and dims. If even is > 0, it swaps the even batches. Make sure the labels are swapped on the host end.
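The swap described above can be sketched on the CPU as follows. The sketch assumes that "even" refers to zero-based batch indices 0, 2, 4, and so on, and that the `even` flag is an integer as in the kernel description; the function name and parameter list are hypothetical.

```go
package main

// swapEveryOther swaps every other batch item between a and b in place.
// If even > 0 the batch items at indices 0, 2, 4, ... are swapped,
// otherwise the items at indices 1, 3, 5, ... are swapped.
// a and b must have the same length and hold batch equally sized items.
func swapEveryOther(a, b []float32, batch, even int) {
	per := len(a) / batch // elements in one batch item
	start := 1
	if even > 0 {
		start = 0
	}
	for n := start; n < batch; n += 2 {
		for i := n * per; i < (n+1)*per; i++ {
			a[i], b[i] = b[i], a[i]
		}
	}
}
```

The labels that belong to the swapped batch items live on the host, so the host code has to apply the same index pattern to them.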
func SwapEveryOtherFP16 ¶
func SwapEveryOtherFP16() Kernel
SwapEveryOtherFP16 swaps batches between 2 tensors, either the even ones or the odd ones. Both tensors have to be equal in size and dims. If even is > 0, it swaps the even batches. Make sure the labels are swapped on the host end.
func SwapUpperLower ¶
func SwapUpperLower() Kernel
SwapUpperLower swaps either the upper or the lower batches. Right now, inverse doesn't do anything.
func SwapUpperLowerFP16 ¶
func SwapUpperLowerFP16() Kernel
SwapUpperLowerFP16 is like the FP32 version
func ThreshBackwardFP16 ¶
func ThreshBackwardFP16() Kernel
ThreshBackwardFP16 is the backward function for ThreshForwardFP16.
func ThreshForward ¶
func ThreshForward() Kernel
ThreshForward is fairly memory expensive, mostly because it is experimental. To test it, start the positive coefficients at uniform random numbers between 0.9 and 1.1 and the negcoefs between 0.01 and 0.2, or something along those lines. The threshold should perhaps be a uniform random number between -0.3 and 0.3.
func ThreshForwardFP16 ¶
func ThreshForwardFP16() Kernel
ThreshForwardFP16 is fairly memory expensive, mostly because it is experimental. To test it, start the positive coefficients at uniform random numbers between 0.9 and 1.1 and the negcoefs between 0.01 and 0.2, or something along those lines. The threshold should perhaps be a uniform random number between -0.3 and 0.3.