Documentation ¶
Overview ¶
Package losses provides several standard losses that implement the train.LossFn interface. They can also be called individually from custom losses.
They all share the same signature, so any of them can be used directly by train.Trainer.
Index ¶
- Constants
- Variables
- func BinaryCrossentropy(labels, predictions []*Node) *Node
- func BinaryCrossentropyLogits(labels, logits []*Node) *Node
- func CategoricalCrossEntropy(labels, predictions []*Node) *Node
- func CategoricalCrossEntropyLogits(labels, logits []*Node) *Node
- func CheckLabelsForWeightsAndMask(weightsShape shapes.Shape, labels []*Node) (weights, mask *Node)
- func MeanAbsoluteError(labels, predictions []*Node) (loss *Node)
- func MeanSquaredError(labels, predictions []*Node) (loss *Node)
- func SparseCategoricalCrossEntropyLogits(labels, logits []*Node) *Node
- type LossFn
Constants ¶
const (
	Epsilon16 = 1e-4
	Epsilon32 = 1e-7
	Epsilon64 = 1e-8
)
Variables ¶
var (
	// ParamHuberLossDelta is the name of the hyperparameter that defines the Huber loss delta.
	// See HuberLossBuilder.
	// It defaults to 1.0.
	ParamHuberLossDelta = "huber_loss_delta"
)
Functions ¶
func BinaryCrossentropy ¶
func BinaryCrossentropy(labels, predictions []*Node) *Node
BinaryCrossentropy returns the cross-entropy loss between labels and predictions, for binary classification tasks.
labels and predictions must have the same shape. labels is converted to the predictions dtype and is expected to convert to 1.0 (true) or 0.0 (false), so booleans work, as does any int type holding 0 or 1.
It *does not* reduce the losses: one value is returned per example in the batch, and they must be reduced (usually with ReduceAllMean, though a sum also works) before being used for training.
If there is an extra `labels` `*Node` with the same shape as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra `labels` `*Node` of booleans with the same dimensions as `labels[0]`, it is assumed to be a mask tensor to be applied to the losses.
func BinaryCrossentropyLogits ¶
func BinaryCrossentropyLogits(labels, logits []*Node) *Node
BinaryCrossentropyLogits returns the cross-entropy loss between labels and `sigmoid(logits)`, for binary classification tasks. This is a more numerically stable and faster implementation than actually taking the sigmoid of the logits and using the equivalent BinaryCrossentropy. labels and logits must have the same shape.
It *does not* reduce the losses: one value is returned per example in the batch, and they must be reduced (usually with ReduceAllMean, though a sum also works) before being used for training.
labels is converted to the predictions dtype and is expected to convert to 1.0 (true) or 0.0 (false), so booleans work, as does any int type holding 0 or 1.
See mathematical derivation of the stable solution in https://www.tensorflow.org/api_docs/python/tf/nn/sigmoid_cross_entropy_with_logits
If there is an extra `labels` `*Node` with the same shape as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra `labels` `*Node` of booleans with the same dimensions as `labels[0]`, it is assumed to be a mask tensor to be applied to the losses.
func CategoricalCrossEntropy ¶
func CategoricalCrossEntropy(labels, predictions []*Node) *Node
CategoricalCrossEntropy returns the cross-entropy loss of the predictions, given the labels. The labels are provided in "dense" format: they should have the exact same shape as predictions, set to 1 for the true (labeled) category and 0 for the others (one-hot encoding) -- or hold any other distribution that sums to 1. predictions should hold probabilities that sum to 1.0.
It *does not* reduce the losses: one value is returned per example in the batch, and they must be reduced (usually with ReduceAllMean, though a sum also works) before being used for training.
If there is an extra `labels` `*Node` with the shape of the predictions minus the last axis (usually simply `[batch_size]`), it is assumed to be a weights tensor for the losses. If there is an extra `labels` `*Node` of booleans with those same dimensions, it is assumed to be a mask.
func CategoricalCrossEntropyLogits ¶
func CategoricalCrossEntropyLogits(labels, logits []*Node) *Node
CategoricalCrossEntropyLogits returns the cross-entropy loss of the logits, given the labels. The labels are provided in "dense" format: they should have the exact same shape as logits, set to 1 for the true (labeled) category and 0 for the others -- or hold any other distribution that sums to 1.
It *does not* reduce the losses: one value is returned per example in the batch, and they must be reduced (usually with ReduceAllMean, though a sum also works) before being used for training.
If there is an extra `labels` `*Node` with the shape of logits minus the last axis (usually simply `[batch_size]`), it is assumed to be a weights tensor for the losses. If there is an extra `labels` `*Node` of booleans with those same dimensions, it is assumed to be a mask.
TODO: implement faster version with logits, see https://github.com/tensorflow/tensorflow/blob/359c3cdfc5fabac82b3c70b3b6de2b0a8c16874f/tensorflow/python/ops/nn_ops.py#L4051
func CheckLabelsForWeightsAndMask ¶ added in v0.9.0
func CheckLabelsForWeightsAndMask(weightsShape shapes.Shape, labels []*Node) (weights, mask *Node)
CheckLabelsForWeightsAndMask checks for optional weights and mask tensors in the labels slice -- it is assumed that labels[0] holds the actual labels, so it is not considered.
`weightsShape` is the expected shape for weights (if present) and the dimensions for a mask (if present), although a mask is assumed to be of dtype `Bool`.
If both weights and a mask are present, the weights are set to zero for masked-out values (where the mask is false).
If there is an extra `labels` `*Node` with the shape of `weightsShape`, it is assumed to be weights. If there is an extra `labels` `*Node` with booleans with the same dimension as `weightsShape`, it is assumed to be a mask.
func MeanAbsoluteError ¶ added in v0.4.0
func MeanAbsoluteError(labels, predictions []*Node) (loss *Node)
MeanAbsoluteError returns the mean absolute error between labels and predictions. It uses only the first element of each.
labels and predictions must have the same shape.
If there is an extra `labels` `*Node` with the same shape as `labels[0]` (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra `labels` `*Node` of booleans with the same dimensions as `labels[0]`, it is assumed to be a mask tensor to be applied to the losses.
func MeanSquaredError ¶
func MeanSquaredError(labels, predictions []*Node) (loss *Node)
MeanSquaredError returns the mean squared error between labels and predictions.
labels and predictions must have the same shape.
If there is an extra element in the input labels with the same shape as labels[0] (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses. If there is an extra element in the input labels with booleans and the same dimensions as `labels[0]`, it is assumed to be a mask tensor to be applied to the losses.
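The underlying math is the mean of the squared differences; a plain-Go sketch:

```go
package main

import "fmt"

// meanSquaredError is a plain-Go sketch of MSE: the mean of
// (label - prediction)^2 over the batch.
func meanSquaredError(labels, predictions []float64) float64 {
	sum := 0.0
	for i := range labels {
		d := labels[i] - predictions[i]
		sum += d * d
	}
	return sum / float64(len(labels))
}

func main() {
	labels := []float64{1, 2, 3}
	preds := []float64{1.5, 2, 2}
	// Squared errors are 0.25, 0, and 1, so the mean is 1.25/3.
	fmt.Println(meanSquaredError(labels, preds)) // ≈ 0.4167
}
```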
func SparseCategoricalCrossEntropyLogits ¶
func SparseCategoricalCrossEntropyLogits(labels, logits []*Node) *Node
SparseCategoricalCrossEntropyLogits returns the cross-entropy loss of the logits, given the labels. The labels are provided in "sparse" format, that is, as integers from 0 to the number of classes (the last logits dimension) minus 1. labels and logits must have the same rank, and the last dimension of labels must be 1.
It *does not* reduce the losses: one value is returned per example in the batch, and they must be reduced (usually with ReduceAllMean, though a sum also works) before being used for training.
If there is an extra `labels` `*Node` with the shape of logits minus the last axis, it is assumed to be a weights tensor for the losses. If there is an extra `labels` `*Node` of booleans with those same dimensions, it is assumed to be a mask.
Types ¶
type LossFn ¶ added in v0.11.0
type LossFn func(labels, predictions []*Node) (loss *Node)
LossFn is the interface used by train.Trainer to train models.
It takes as inputs the labels and predictions:
- labels comes from the dataset.
- predictions comes from the model.
- the returned loss will be reduced with graph.ReduceAllMean by train.Trainer to a scalar before being used for gradient descent. That means the loss function is free to return one loss per example or an already reduced scalar loss.
Most of the predefined losses in package `gomlx/ml/train/losses` assume labels and predictions are both of length one. For multi-head models, it's easy to write a small custom LossFn that splits the slices and sends each label/prediction pair to a predefined loss.
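The splitting pattern can be sketched as follows, using `[]float64` slices as stand-ins for `*Node` values (a runnable graph example would need the full gomlx setup; `pairwiseLoss` here is a hypothetical single-head loss):

```go
package main

import "fmt"

// lossFn mirrors the shape of train.LossFn, with []float64 standing in
// for *Node in this self-contained sketch.
type lossFn func(labels, predictions [][]float64) float64

// pairwiseLoss is a stand-in single-head loss (here, mean squared error)
// over one label/prediction pair.
func pairwiseLoss(label, prediction []float64) float64 {
	sum := 0.0
	for i := range label {
		d := label[i] - prediction[i]
		sum += d * d
	}
	return sum / float64(len(label))
}

// multiHeadLoss splits the slices and sends each label/prediction pair
// to a single-head loss, summing the results -- the pattern suggested
// for multi-head models.
func multiHeadLoss(labels, predictions [][]float64) float64 {
	total := 0.0
	for i := range predictions {
		total += pairwiseLoss(labels[i], predictions[i])
	}
	return total
}

func main() {
	var loss lossFn = multiHeadLoss
	labels := [][]float64{{1, 0}, {0.5}}     // one label per head
	preds := [][]float64{{0.9, 0.1}, {0.5}}  // one prediction per head
	fmt.Println(loss(labels, preds))         // per-head losses, summed
}
```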
func MakeHuberLoss ¶ added in v0.11.0
MakeHuberLoss returns a Huber loss function: it behaves like an L2 loss (MeanSquaredError) close to the target and becomes L1 (linear) away from it.
The delta parameter configures the range in which the loss behaves as L2: if the prediction is farther than delta from the label, the loss becomes linear. delta also defines the slope of the linear part. A good default value is 1.0.
For the returned loss function:
- If there is an extra element in the input labels with the same shape as labels[0] (usually simply `[batch_size]`), it is assumed to be a weights tensor to be applied to the losses.
- If there is an extra element in the input labels with booleans and the same dimensions as `labels[0]`, it is assumed to be a mask tensor to be applied to the losses.
- The loss is returned per element, and not automatically reduced. train.Trainer will by default take the mean of it.