package activation

v1.0.0
Published: Feb 15, 2024 License: MIT Imports: 3 Imported by: 0

Documentation

Overview

Package activation contains the activation functions used by the layers of the neural network.

Activation functions are applied to the output of a layer (i.e. W*X + b) to enhance its prediction capability.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ActivationFunction

type ActivationFunction interface {
	Apply(float64) float64
	ApplyMatrix(Matrix[float64])

	Derivative(float64) float64
	DerivativeMatrix(Matrix[float64]) Matrix[float64]
}

ActivationFunction is the interface for activation functions applied to the output of an entire layer in a neural network.

Apply applies the function to a scalar. As some activation functions are vector functions, this method may return NaN as a placeholder.

ApplyMatrix applies the function to an n×1 matrix. Depending on the function, it is either applied to each element individually, which is equivalent to calling Apply() on each element, or to the matrix as a whole, using multiple elements to compute each value. This method modifies the given matrix in place, which may change the dimensions of the matrix.

Derivative produces the derivative of the activation function with respect to its input.

DerivativeMatrix produces a derivative matrix. As some activation functions are vector functions, this function may use the whole matrix.
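For element-wise activations, ApplyMatrix amounts to calling Apply() on every element. The following sketch illustrates that relationship using a plain []float64 as a stand-in for a column of Matrix[float64]; the stand-in and the tanh example are hypothetical illustrations, not part of this package.

package main

import (
	"fmt"
	"math"
)

// tanh is a hypothetical element-wise activation used only for illustration.
type tanh struct{}

func (tanh) Apply(x float64) float64      { return math.Tanh(x) }
func (tanh) Derivative(x float64) float64 { t := math.Tanh(x); return 1 - t*t }

// applyVector shows what an element-wise ApplyMatrix amounts to:
// Apply() called on each entry of the (stand-in) column vector, in place.
func applyVector(a tanh, v []float64) {
	for i := range v {
		v[i] = a.Apply(v[i])
	}
}

func main() {
	v := []float64{-1, 0, 1}
	applyVector(tanh{}, v)
	fmt.Println(v) // [-0.7615941559557649 0 0.7615941559557649]
}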

func DynamicActivation

func DynamicActivation(activationName string) (ActivationFunction, error)

DynamicActivation returns an activation function based on its name. This is identical to importing and initializing the activation function directly.
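A minimal usage sketch follows; the import path and the "sigmoid" name string are assumptions, so check the module path and the names DynamicActivation actually recognizes.

package main

import (
	"fmt"
	"log"

	"example.com/yourmodule/activation" // hypothetical import path
)

func main() {
	// "sigmoid" is an assumed name; see DynamicActivation for the accepted strings.
	act, err := activation.DynamicActivation("sigmoid")
	if err != nil {
		log.Fatal(err) // unknown names are reported via the error value
	}
	fmt.Println(act.Apply(0)) // Sigmoid(0) = 0.5
}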

type Linear

type Linear struct{}

Linear is a linear activation function, which does not modify the input.

Linear(x) = x
dLinear/dx = 1

func (Linear) Apply

func (l Linear) Apply(x float64) float64

func (Linear) ApplyMatrix

func (l Linear) ApplyMatrix(M Matrix[float64])

func (Linear) Derivative

func (l Linear) Derivative(x float64) float64

func (Linear) DerivativeMatrix

func (l Linear) DerivativeMatrix(m Matrix[float64]) Matrix[float64]

type ReLU

type ReLU struct{}

ReLU, or Rectified Linear Unit, is an activation function which works similarly to a biological neuron: it either suppresses the value completely or propagates it further.

ReLU(x) = max(x, 0)
dReLU/dx = 1 if x >= 0 else 0

Computing the derivative this way is not strictly correct mathematically, as ReLU is not differentiable at x = 0, but in practice either 1 or 0 is used at that point (1 in the case of this implementation).
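As a standalone illustration of the formulas above (a sketch of the math, not the package's source), the scalar methods behave like:

// reluApply mirrors ReLU(x) = max(x, 0).
func reluApply(x float64) float64 {
	if x > 0 {
		return x
	}
	return 0
}

// reluDerivative mirrors dReLU/dx, returning 1 at x = 0 as described above.
func reluDerivative(x float64) float64 {
	if x >= 0 {
		return 1
	}
	return 0
}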

func (ReLU) Apply

func (r ReLU) Apply(x float64) float64

func (ReLU) ApplyMatrix

func (r ReLU) ApplyMatrix(M Matrix[float64])

func (ReLU) Derivative

func (r ReLU) Derivative(x float64) float64

func (ReLU) DerivativeMatrix

func (r ReLU) DerivativeMatrix(M Matrix[float64]) Matrix[float64]

type SELU

type SELU struct{}

Scaled Exponential Linear Units, or SELUs, are activation functions that induce self-normalizing properties.

SELU(x) = λ*x if x >= 0 else λ*α*(exp(x) - 1)
dSELU/dx = λ if x >= 0 else λ*α*exp(x)

Where:

λ ≈ 1.0507, α ≈ 1.6733
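A standalone sketch of these formulas, using the rounded constants quoted above (it illustrates the math, not the package's source, and assumes import "math"):

const (
	seluLambda = 1.0507
	seluAlpha  = 1.6733
)

// seluApply mirrors SELU(x) = λ*x if x >= 0 else λ*α*(exp(x) - 1).
func seluApply(x float64) float64 {
	if x >= 0 {
		return seluLambda * x
	}
	return seluLambda * seluAlpha * (math.Exp(x) - 1)
}

// seluDerivative mirrors dSELU/dx = λ if x >= 0 else λ*α*exp(x).
func seluDerivative(x float64) float64 {
	if x >= 0 {
		return seluLambda
	}
	return seluLambda * seluAlpha * math.Exp(x)
}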

func (SELU) Apply

func (s SELU) Apply(x float64) float64

func (SELU) ApplyMatrix

func (s SELU) ApplyMatrix(m Matrix[float64])

func (SELU) Derivative

func (s SELU) Derivative(x float64) float64

func (SELU) DerivativeMatrix

func (s SELU) DerivativeMatrix(m Matrix[float64]) Matrix[float64]

type Sigmoid

type Sigmoid struct{}

Sigmoid is a continuous non-linear activation function which maps the real numbers into the (0, 1) range.

Sigmoid(x) = 1 / (1 + exp(-x))
dSigmoid/dx = Sigmoid(x) * (1 - Sigmoid(x))
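A standalone sketch of these formulas (an illustration of the math, not the package's source; assumes import "math"):

// sigmoidApply mirrors Sigmoid(x) = 1 / (1 + exp(-x)).
func sigmoidApply(x float64) float64 {
	return 1 / (1 + math.Exp(-x))
}

// sigmoidDerivative mirrors dSigmoid/dx = Sigmoid(x) * (1 - Sigmoid(x)).
func sigmoidDerivative(x float64) float64 {
	s := sigmoidApply(x)
	return s * (1 - s)
}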

func (Sigmoid) Apply

func (s Sigmoid) Apply(z float64) float64

func (Sigmoid) ApplyMatrix

func (s Sigmoid) ApplyMatrix(M matrix.Matrix[float64])

func (Sigmoid) Derivative

func (s Sigmoid) Derivative(x float64) float64

func (Sigmoid) DerivativeMatrix

func (s Sigmoid) DerivativeMatrix(M matrix.Matrix[float64]) matrix.Matrix[float64]

type Softmax

type Softmax struct{}

Softmax is a probability-generative vector activation function, i.e. it generates probabilities from the input matrix. The number of classes is the row count of the input matrix, and the columns are treated as separate outputs of the batch.

Softmax(x)_j = exp(x_j) / sum(exp(x_i), i ∈ [1, classCount])

As Softmax is a *vector* function, the Apply() and Derivative() methods return NaN.

Due to the implementation of the matrix type, and the fact that dSoftmax/dx is a Jacobian matrix (i.e. a 3D tensor for a list of vectors), DerivativeMatrix() is not implemented and has to be overridden.
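For reference, the Jacobian of Softmax for a single column has the well-known form

dSoftmax(x)_i/dx_j = Softmax(x)_i * (δ_ij - Softmax(x)_j)

The forward pass can be sketched for a single column as follows, using a []float64 as a stand-in for one column of the input matrix; the stand-in and the max-subtraction trick for numerical stability are assumptions of this sketch, not necessarily how the package computes it (assumes import "math").

// softmaxColumn computes Softmax over one column of values.
func softmaxColumn(x []float64) []float64 {
	// Subtracting the maximum before exponentiating does not change the
	// result but avoids overflow for large inputs.
	maxVal := x[0]
	for _, v := range x {
		if v > maxVal {
			maxVal = v
		}
	}
	out := make([]float64, len(x))
	sum := 0.0
	for i, v := range x {
		out[i] = math.Exp(v - maxVal)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}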

func (Softmax) Apply

func (s Softmax) Apply(z float64) float64

func (Softmax) ApplyMatrix

func (s Softmax) ApplyMatrix(M Matrix[float64])

func (Softmax) Derivative

func (s Softmax) Derivative(x float64) float64

func (Softmax) DerivativeMatrix

func (s Softmax) DerivativeMatrix(M Matrix[float64]) Matrix[float64]

type SoftmaxWithCCE

type SoftmaxWithCCE struct {
	Softmax
}

SoftmaxWithCCE is an activation function which is used only together with the categorical cross-entropy loss. The main difference between this function and Softmax is that it does not produce a proper derivative, but instead relies fully on the derivative of the categorical cross-entropy loss function.

IMPORTANT: should only be used with the CategoricalCrossEntropyLossWithSoftmax loss function!

As SoftmaxWithCCE is a *vector* function, the Apply() and Derivative() methods return NaN.
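This pairing works because of the standard simplification of the combined gradient: for a one-hot target y, the derivative of the categorical cross-entropy loss with respect to the pre-activation input x reduces to

d(CCE(Softmax(x), y))/dx = Softmax(x) - y

so the loss function can supply the entire gradient, and the activation's own derivative step becomes a pass-through.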

func (SoftmaxWithCCE) DerivativeMatrix

func (s SoftmaxWithCCE) DerivativeMatrix(M Matrix[float64]) Matrix[float64]
