
Weight

Weight is a neural network library written in Go that focuses on portability and ease of use.

The library has three main parts:

  • The Tensor struct
  • Interface definitions
  • Interface implementations

Apart from the Tensor struct, everything is an interface, so you can extend or replace the default implementations with your own.

Usage

Let's see how to classify digits from the MNIST dataset using this library:

The first thing we need is a layer. A layer takes a tensor and returns another one, usually by computing some mathematical operation. A network is a layer that contains other layers. The most basic type of network is one that activates its layers sequentially. Most problems are too complex for a single layer, so you will almost always use a network.

The network must take an input image of 28x28 pixels and return 10 values, one for each possible digit.

net, _ := layers.NewSequentialNet(
    layers.NewDenseLayer([]int{28, 28}, []int{30}),  //Dense layer with input size 28x28 and output size 30
    layers.NewSigmoidLayer(30),                      //As a sigmoid layer has the same input and output size, we just define one (30)
    layers.NewDenseLayer([]int{30}, []int{10}),
    layers.NewSoftmaxLayer(10),
)

We also need data, in the form of a struct that implements the DataSet interface. Weight includes some implementations for MNIST, CIFAR, etc., but you will probably need to implement this interface yourself to fit your data. See weight/loaders for example implementations.

Most loaders return a PairSet, which is simply two DataSets: one for training and one for testing.

//Parse the mnist files from the current folder and return a PairSet
data, _ := mnist.Open(".")
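
As a quick sanity check you can inspect the loaded data through the documented PairSet fields and DataSet getters. The sizes in the comments below are what MNIST should report, matching the network defined above:

//Inspect the loaded data. PairSet exposes a TrainSet and a TestSet.
fmt.Println(data.TrainSet.GetDataSize())    //input shape, should be [28 28]
fmt.Println(data.TrainSet.GetAnswersSize()) //answer shape, should be [10]
fmt.Println(data.TrainSet.GetSetSize())     //number of training examples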

What we want to do now is to train the network. We must first define two things: how to evaluate its output and what learning method to use.

To evaluate the network's output we must use a cost function. It is a function that takes the output of the network for a given data point and the correct answer. It outputs a value that represents how incorrect the output from the network was.

In this case we use a cross entropy cost function of size 10, the same as the output of the net.

costFunc := costs.NewCrossEntropyCostFunction(10)
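
For intuition, the cross entropy cost on plain float64 slices looks like the sketch below. This is only an illustration of the math; the library's cost functions operate on *tensor.Tensor values.

//Illustration only: cross entropy between a predicted distribution
//(e.g. the softmax output) and a one-hot encoded answer.
func crossEntropy(output, answer []float64) float64 {
	cost := 0.0
	for i := range output {
		if answer[i] > 0 {
			cost -= answer[i] * math.Log(output[i])
		}
	}
	return cost
}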

Weight uses gradient descent to train the network, but there are different methods and parameters to choose from. You can see an overview of them here: http://sebastianruder.com/optimizing-gradient-descent/

In this case we will use momentum with a learning rate that decays from 0.5 to 0.1 over 5 epochs (an epoch is one pass through all the data points in the train set).

config := training.LearningConfig{
    BatchSize:         16,
    Epochs:            5,
    Method:            training.Momentum,
    LearningRateStart: 0.5,
    LearningRateEnd:   0.1,
    Momentum:          0.9,
}
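
For reference, the momentum update described in the linked article boils down to the sketch below for a single parameter. The trainer applies this internally, so the snippet is only illustrative; how exactly the learning rate is interpolated between LearningRateStart and LearningRateEnd is not shown here.

//Illustration only: one momentum update step for a single parameter.
//The velocity accumulates a decaying average of past gradients.
func momentumStep(param, velocity, gradient, learningRate, momentum float64) (float64, float64) {
	velocity = momentum*velocity - learningRate*gradient
	param += velocity
	return param, velocity
}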

The only thing we need now is to create a trainer with all this information and start training.

//Create a trainer with the network, configuration, cost function and data.
trainer := training.NewBPTrainer(config, data, net, costFunc)

//Start training.
_ = trainer.Train()

It is important to test the network after training. The test set contains images that are not in the train set, so it shows how the network performs on images it has never seen. The method TestLayer returns the accuracy of the network from 0 to 1, with 1 being a perfect score.

//Get final accuracy on test data
accuracy, _ := weight.TestLayer(net, data.TestSet)

You can find and run the full example code in examples/readme. Training takes a few seconds and the final accuracy should be around 92%. This result is really bad for MNIST, but with a convolutional neural network we can achieve close to 99%.

Layers implemented

  • FFNet (Feed forward network)
  • Convolutional (3D)
  • Dense/Fully connected
  • Pool
  • ReLU
  • LeakyReLU
  • Sigmoid
  • Softmax

TODO

  • Add a way to save and load networks (marshaling)
  • Add GPU computations
  • Allow configuring parameter initialization
  • Compute backpropagation using col2im
  • More unit testing
  • Recurrent layers

Documentation


Functions

func TestLayer

func TestLayer(layer Layer, ds DataSet) (float64, error)

TestLayer returns the accuracy of a given Layer on a given DataSet
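
Conceptually, this accuracy check can be built from the Layer and DataSet interfaces documented below. The sketch is not the library's actual implementation, only an illustration:

//Sketch only: compute accuracy by activating the layer on every
//element of the data set and counting correct answers.
func testLayerSketch(layer Layer, ds DataSet) (float64, error) {
	ds.Reset()
	correct := 0
	total := ds.GetSetSize()
	for i := 0; i < total; i++ {
		input, answer, err := ds.GetNextSet()
		if err != nil {
			return 0, err
		}
		output, err := layer.Activate(input)
		if err != nil {
			return 0, err
		}
		if ds.IsAnswer(output, answer) {
			correct++
		}
	}
	return float64(correct) / float64(total), nil
}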

Types

type BPCostFunc

type BPCostFunc interface {
	CostFunc
	CreateSlave() BPCostFunc
	BackPropagate() *tensor.Tensor
}

BPCostFunc is a CostFunc that can backpropagate the error

type BPLearnerLayer

type BPLearnerLayer interface {
	Layer

	//BackPropagate updates the stored gradient for each parameter and returns the backpropagated gradient
	BackPropagate(err *tensor.Tensor) (*tensor.Tensor, error)

	//GetParamGradPointers returns a slice of pointers to parameters and a slice of pointers to gradients (in the same order) so gradient descent can update them.
	GetParamGradPointers() ([]*float64, []*float64)
}

BPLearnerLayer is a layer that can be trained using backpropagation

type CostFunc

type CostFunc interface {
	Cost(*tensor.Tensor, *tensor.Tensor) float64
}

CostFunc is an object capable of computing the loss value given a result and the correct answer

type DataSet

type DataSet interface {
	GetDataSize() []int
	GetAnswersSize() []int
	GetSetSize() int

	//GetNextSet returns an input, the desired answer, and an error
	GetNextSet() (*tensor.Tensor, *tensor.Tensor, error)

	//After a reset, GetNextSet returns the first element again
	Reset()

	Close()

	//IsAnswer returns true if the output is to be considered correct given the expected answer. For example, in classification data sets, this should return true if the maximum probability in the output and in the answer falls on the same label.
	IsAnswer(output *tensor.Tensor, answer *tensor.Tensor) bool
}

DataSet is an interface that returns neural net inputs and tells you if the outputs are correct.
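
For illustration, a minimal in-memory implementation for a classification problem could look like the sketch below. sliceDataSet and the argMax helper are hypothetical, not part of the library; the argmax comparison in IsAnswer is just the classification rule described above.

//Sketch only: an in-memory DataSet over pre-built tensors.
type sliceDataSet struct {
	inputs  []*tensor.Tensor //e.g. 28x28 images
	answers []*tensor.Tensor //e.g. one-hot vectors of size 10
	pos     int
}

func (ds *sliceDataSet) GetDataSize() []int    { return []int{28, 28} }
func (ds *sliceDataSet) GetAnswersSize() []int { return []int{10} }
func (ds *sliceDataSet) GetSetSize() int       { return len(ds.inputs) }

func (ds *sliceDataSet) GetNextSet() (*tensor.Tensor, *tensor.Tensor, error) {
	if ds.pos >= len(ds.inputs) {
		return nil, nil, errors.New("no more data")
	}
	in, ans := ds.inputs[ds.pos], ds.answers[ds.pos]
	ds.pos++
	return in, ans, nil
}

func (ds *sliceDataSet) Reset() { ds.pos = 0 }
func (ds *sliceDataSet) Close() {}

//IsAnswer: correct if the highest value falls on the same label.
//argMax is a hypothetical helper returning the index of the largest value.
func (ds *sliceDataSet) IsAnswer(output, answer *tensor.Tensor) bool {
	return argMax(output) == argMax(answer)
}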

type EnslaverLayer

type EnslaverLayer interface {
	//TODO maybe merge this with BPLearnerLayer
	Layer

	//A slave is a copy of the layer whose parameters (for example weights & biases) are pointers to the parameters of the original layer. This is used to learn a batch in parallel: the learning method should be able to call different slaves' Activate and BackPropagate methods in parallel. The parameter update is done synchronously, only on the base layer, using the gradients from all slave layers.
	CreateSlave() Layer
}

EnslaverLayer is a layer that can create slaves of itself

type Layer

type Layer interface {
	ID() string

	//Activate takes an input tensor and computes an output tensor given the parameters and configuration of the layer
	Activate(input *tensor.Tensor) (*tensor.Tensor, error)
	GetInputSize() []int
	GetOutputSize() []int
}

Layer is any object that accepts a tensor.Tensor and returns a tensor.Tensor. Each tensor.Tensor has a shape specified by GetSize. A Layer only accepts a specific shape; if a shape does not match its requirements, the Layer returns an error on Activate.
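
As an illustration, a minimal pass-through implementation of this interface could look like this (sketch only, not part of the library):

//Sketch only: an identity layer that returns its input unchanged.
type identityLayer struct {
	size []int
}

func (l *identityLayer) ID() string { return "identity" }

func (l *identityLayer) Activate(input *tensor.Tensor) (*tensor.Tensor, error) {
	//A real layer would check the input shape and compute a new tensor;
	//the identity layer simply passes the input through.
	return input, nil
}

func (l *identityLayer) GetInputSize() []int  { return l.size }
func (l *identityLayer) GetOutputSize() []int { return l.size }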

type PairSet

type PairSet struct {
	TrainSet DataSet
	TestSet  DataSet
}

PairSet binds a Train and a Test Set together.

func (*PairSet) Close

func (ps *PairSet) Close()

Close closes both sets

Directories

  • examples
  • loaders
