losses

package
v0.12.0
Published: Sep 23, 2024 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package losses has several standard losses that implement the train.LossFn interface. They can also be called separately by custom losses.

They all have the same signature that can be used by train.Trainer.
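
For example, a minimal sketch (illustrative, not from the package docs) of using one of them directly or wrapping it in a custom loss, assuming the usual `graph` and `losses` import paths:

	// Any loss below matches the LossFn signature, so it can be used directly
	// or wrapped by a custom loss that post-processes the per-example values.
	var lossFn losses.LossFn = losses.BinaryCrossentropyLogits

	func myLoss(labels, predictions []*graph.Node) *graph.Node {
		return graph.ReduceAllMean(losses.BinaryCrossentropyLogits(labels, predictions))
	}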

Index

Constants

const (
	Epsilon16 = 1e-4
	Epsilon32 = 1e-7
	Epsilon64 = 1e-8
)

Variables

var (
	// ParamHuberLossDelta is the name of the hyperparameter that defines the Huber loss delta.
	// See HuberLossBuilder.
	// It defaults to 1.0
	ParamHuberLossDelta = "huber_loss_delta"
)

Functions

func BinaryCrossentropy

func BinaryCrossentropy(labels, predictions []*Node) *Node

BinaryCrossentropy returns the cross-entropy loss between labels and predictions, for binary classification tasks.

labels and predictions must have the same shape. labels is converted to the predictions dtype; its values are expected to convert to 1.0 (for true) or 0.0 (for false), so booleans work, as does an int type with values 0 or 1.

It *does not* reduce-mean the losses: they are returned individually for each element of the batch and need to be reduced (usually with ReduceAllMean, but a sum also works) before being used for training.

If there is an extra `labels` `*Node` with the shape of `labels[0]` (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra `labels` `*Node` with booleans and the same dimensions as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a mask tensor to be applied to the losses.
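
For example, a minimal sketch (with illustrative names) that passes an optional weights tensor as the extra labels element and reduces the result before training:

	// A sketch: trueLabels, modelOutput and exampleWeights are all shaped [batch_size].
	func weightedBinaryLoss(trueLabels, exampleWeights, modelOutput *graph.Node) *graph.Node {
		perExample := losses.BinaryCrossentropy(
			[]*graph.Node{trueLabels, exampleWeights}, // The extra element is taken as weights.
			[]*graph.Node{modelOutput})
		return graph.ReduceAllMean(perExample) // Reduce before using for training.
	}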

func BinaryCrossentropyLogits

func BinaryCrossentropyLogits(labels, logits []*Node) *Node

BinaryCrossentropyLogits returns the cross-entropy loss between labels and `sigmoid(logits)`, for binary classification tasks. It assumes the predictions are given by `sigmoid(logits)`. This is a more numerically stable and faster implementation than actually taking the sigmoid of the logits and using the equivalent BinaryCrossentropy. labels and logits must have the same shape.

It *does not* reduce-mean the losses: they are returned individually for each element of the batch and need to be reduced (usually with ReduceAllMean, but a sum also works) before being used for training.

labels is converted to the predictions dtype; its values are expected to convert to 1.0 (for true) or 0.0 (for false), so booleans work, as does an int type with values 0 or 1.

See the mathematical derivation of the stable solution in https://www.tensorflow.org/api_docs/python/tf/nn/sigmoid_cross_entropy_with_logits

If there is an extra `labels` `*Node` with the shape of `labels[0]` (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra `labels` `*Node` with booleans and the same dimensions as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a mask tensor to be applied to the losses.
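
For reference, the stable formulation derived there computes, for logits `x` and labels `z`:

	max(x, 0) - x*z + log(1 + exp(-|x|))

which avoids overflowing `exp` and taking `log` of values near zero.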

func CategoricalCrossEntropy

func CategoricalCrossEntropy(labels, predictions []*Node) *Node

CategoricalCrossEntropy returns the cross-entropy loss of the predictions, given the labels. The labels are provided in "dense" format: they should have the exact same shape as predictions, and be set to 1 for the true (labeled) category and 0 for the others (one-hot encoding) -- or any other distribution that sums to 1. predictions should hold probabilities that must sum to 1.0.

It *does not* reduce-mean the losses: they are returned individually for each element of the batch and need to be reduced (usually with ReduceAllMean, but a sum also works) before being used for training.

If there is an extra `labels` `*Node` with the shape of the predictions without the last axis (usually simply `[batch_size]`), it is assumed to hold weights for the losses. If there is an extra `labels` `*Node` with booleans and the same dimensions as the predictions without the last axis (usually simply `[batch_size]`), it is assumed to be a mask.

func CategoricalCrossEntropyLogits

func CategoricalCrossEntropyLogits(labels, logits []*Node) *Node

CategoricalCrossEntropyLogits returns the cross-entropy loss of the logits, given the labels. The labels are provided in "dense" format: they should have the exact same shape as logits, and be set to 1 for the true (labeled) category and 0 for the others -- or any other distribution that sums to 1.

It *does not* reduce-mean the losses: they are returned individually for each element of the batch and need to be reduced (usually with ReduceAllMean, but a sum also works) before being used for training.

If there is an extra `labels` `*Node` with the shape of the logits without the last axis (usually simply `[batch_size]`), it is assumed to hold weights for the losses. If there is an extra `labels` `*Node` with booleans and the same dimensions as the logits without the last axis (usually simply `[batch_size]`), it is assumed to be a mask.

TODO: implement faster version with logits, see https://github.com/tensorflow/tensorflow/blob/359c3cdfc5fabac82b3c70b3b6de2b0a8c16874f/tensorflow/python/ops/nn_ops.py#L4051
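
A minimal sketch (illustrative names) for a 3-class problem, where labels and logits are both shaped `[batch_size, 3]` and each label row is one-hot, e.g. `[0, 1, 0]` for class 1:

	func denseCrossEntropyLoss(oneHotLabels, logits *graph.Node) *graph.Node {
		perExample := losses.CategoricalCrossEntropyLogits(
			[]*graph.Node{oneHotLabels}, []*graph.Node{logits})
		return graph.ReduceAllMean(perExample)
	}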

func CheckLabelsForWeightsAndMask added in v0.9.0

func CheckLabelsForWeightsAndMask(weightsShape shapes.Shape, labels []*Node) (weights, mask *Node)

CheckLabelsForWeightsAndMask checks for optional weights and mask tensors in the labels slice of tensors -- labels[0] is assumed to hold the actual labels, so it is not considered.

`weightsShape` is the expected shape for weights (if present) and the dimensions for a mask (if present), although a mask is assumed to be of dtype `Bool`.

If weights and masks are present, weights are converted to zero for masked out values (where mask is false).

If there is an extra `labels` `*Node` with the shape of `weightsShape`, it is assumed to be weights. If there is an extra `labels` `*Node` with booleans with the same dimension as `weightsShape`, it is assumed to be a mask.
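
For example, a minimal sketch of a custom loss that honors the same weights/mask convention. The element-wise graph ops used here (Sub, Square, Mul, Where, ZerosLike) are assumptions about the `graph` package and only illustrate the pattern:

	func weightedSquaredError(labels, predictions []*graph.Node) *graph.Node {
		perExample := graph.Square(graph.Sub(predictions[0], labels[0]))
		weights, mask := losses.CheckLabelsForWeightsAndMask(labels[0].Shape(), labels)
		if weights != nil {
			// If a mask was also given, weights are already zeroed where masked out.
			perExample = graph.Mul(perExample, weights)
		} else if mask != nil {
			perExample = graph.Where(mask, perExample, graph.ZerosLike(perExample))
		}
		return perExample
	}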

func MeanAbsoluteError added in v0.4.0

func MeanAbsoluteError(labels, predictions []*Node) (loss *Node)

MeanAbsoluteError returns the mean absolute error between labels and predictions. It uses only the first element of each.

labels and predictions must have the same shape.

If there is an extra `labels` `*Node` with the shape of `labels[0]` (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra `labels` `*Node` with booleans and the same dimensions as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a mask tensor to be applied to the losses.

func MeanSquaredError

func MeanSquaredError(labels, predictions []*Node) (loss *Node)

MeanSquaredError returns the mean squared error between labels and predictions.

labels and predictions must have the same shape.

If there is an extra element in the input labels with the shape of labels[0] (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra element in the input labels with booleans and the same dimensions as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a mask tensor to be applied to the losses.
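
For example, a minimal sketch (illustrative names) for a regression head where labels and predictions are both shaped `[batch_size, 1]`:

	func regressionLoss(targetValues, modelOutput *graph.Node) *graph.Node {
		return losses.MeanSquaredError(
			[]*graph.Node{targetValues}, []*graph.Node{modelOutput})
	}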

func SparseCategoricalCrossEntropyLogits

func SparseCategoricalCrossEntropyLogits(labels, logits []*Node) *Node

SparseCategoricalCrossEntropyLogits returns the cross-entropy loss of the logits, given the labels. The labels are provided in "sparse" format, that is, as integer class ids from 0 to the logits' last dimension minus 1. labels and logits must have the same rank, and the labels' last dimension must be 1.

It *does not* reduce-mean the losses: they are returned individually for each element of the batch and need to be reduced (usually with ReduceAllMean, but a sum also works) before being used for training.

If there is an extra `labels` `*Node` with the shape of the logits without the last axis, it is assumed to hold weights for the losses. If there is an extra `labels` `*Node` with booleans and the same dimensions as the logits without the last axis, it is assumed to be a mask.
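
A minimal sketch (illustrative names), where classIds holds integer class ids shaped `[batch_size, 1]` and logits is shaped `[batch_size, num_classes]`:

	func sparseCrossEntropyLoss(classIds, logits *graph.Node) *graph.Node {
		perExample := losses.SparseCategoricalCrossEntropyLogits(
			[]*graph.Node{classIds}, []*graph.Node{logits})
		return graph.ReduceAllMean(perExample)
	}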

Types

type LossFn added in v0.11.0

type LossFn func(labels, predictions []*Node) (loss *Node)

LossFn is the interface used by train.Trainer to train models.

It takes as inputs the labels and predictions:

  • labels comes from the dataset.
  • predictions comes from the model.
  • the returned loss will be reduced to a scalar with graph.ReduceAllMean by train.Trainer before being used for gradient descent. That means the loss function is free to return a loss per example or an already reduced scalar loss.

Most of the predefined losses in package `gomlx/ml/train/losses` assume labels and predictions are both of length one. For multi-head models, it's very easy to write a small custom LossFn that splits the slice and sends each label/prediction pair to a predefined loss, as in the sketch below.
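
A minimal sketch of such a multi-head LossFn (illustrative: head 0 is a regression output, head 1 a binary classification output given as logits):

	func multiHeadLoss(labels, predictions []*graph.Node) *graph.Node {
		regression := losses.MeanAbsoluteError(labels[:1], predictions[:1])
		classification := losses.BinaryCrossentropyLogits(labels[1:2], predictions[1:2])
		// Reduce each head separately, then combine into a single scalar loss.
		return graph.Add(
			graph.ReduceAllMean(regression),
			graph.ReduceAllMean(classification))
	}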

func MakeHuberLoss added in v0.11.0

func MakeHuberLoss(delta float64) LossFn

MakeHuberLoss returns a Huber loss function: it's similar to an L2 loss (MeanSquaredError) close to the target, and it becomes L1 (linear) away from the target.

The delta parameter configures the range where the loss behaves as L2: if the prediction is further than delta from the target, the loss becomes linear. It also defines the slope of the linear part. A good default value is 1.0.

For the returned loss function:

  • If there is an extra element in the input labels with the shape of labels[0] (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses.
  • If there is an extra element in the input labels with booleans and the same dimensions as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a mask tensor to be applied to the losses.
  • The loss is returned per element, and not automatically reduced. train.Trainer will by default take the mean of it.

See https://en.wikipedia.org/wiki/Huber_loss
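
A minimal usage sketch. For an error e = prediction - label, the Huber loss is 0.5*e^2 when |e| <= delta and delta*(|e| - 0.5*delta) otherwise:

	// Build a Huber loss with delta=1.0; it matches the LossFn signature.
	var huberLoss losses.LossFn = losses.MakeHuberLoss(1.0)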
