Documentation
Overview
A recommender system package for Go.
Sbr implements cutting-edge sequence-based recommenders: for every user, we examine what they have interacted with up to now to predict what they are going to consume next.
Usage
You can fit a model on the Movielens 100K dataset in about 10 seconds using the following (taken from https://github.com/maciejkula/sbr-go/blob/master/examples/movielens/main.go):
package main

import (
	"fmt"
	"math/rand"

	sbr "github.com/maciejkula/sbr-go"
)

func main() {
	// Load the data.
	data, err := sbr.GetMovielens()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Loaded movielens data: %v users and %v items for a total of %v interactions\n",
		data.NumUsers(), data.NumItems(), data.Len())

	// Split into test and train.
	rng := rand.New(rand.NewSource(42))
	train, test := sbr.TrainTestSplit(data, 0.2, rng)
	fmt.Printf("Train len %v, test len %v\n", train.Len(), test.Len())

	// Instantiate the model.
	model := sbr.NewImplicitLSTMModel(train.NumItems())
	// Release the underlying model's memory once it is no longer in use.
	defer model.Free()

	// Set the hyperparameters.
	model.ItemEmbeddingDim = 32
	model.LearningRate = 0.16
	model.L2Penalty = 0.0004
	model.NumEpochs = 10
	model.NumThreads = 1

	// Set random seed.
	var randomSeed [16]byte
	for idx := range randomSeed {
		randomSeed[idx] = 42
	}
	model.RandomSeed = randomSeed

	// Fit the model.
	fmt.Printf("Fitting the model...\n")
	loss, err := model.Fit(&train)
	if err != nil {
		panic(err)
	}

	// And evaluate.
	fmt.Printf("Evaluating the model...\n")
	mrr, err := model.MRRScore(&test)
	if err != nil {
		panic(err)
	}
	fmt.Printf("Loss %v, MRR: %v\n", loss, mrr)
}
Installation
Run
go get github.com/maciejkula/sbr-go
followed by
make
in the installation directory. This will download the package's native dependencies. On both OSX and Linux, the resulting binaries are fully statically linked, and you can deploy them like any other Go binary.
Index
- Constants
- func TrainTestSplit(data *Interactions, testFraction float64, rng *rand.Rand) (Interactions, Interactions)
- type ImplicitLSTMModel
- func NewImplicitLSTMModel(numItems int) *ImplicitLSTMModel
- func (self *ImplicitLSTMModel) Fit(data *Interactions) (float32, error)
- func (self *ImplicitLSTMModel) Free()
- func (self *ImplicitLSTMModel) MRRScore(data *Interactions) (float32, error)
- func (self *ImplicitLSTMModel) MarshalBinary() ([]byte, error)
- func (self *ImplicitLSTMModel) Predict(interactionHistory []int, itemsToScore []int) ([]float32, error)
- func (self *ImplicitLSTMModel) UnmarshalBinary(data []byte) error
- type Indexer
- type Interactions
- func GetMovielens() (*Interactions, error)
- func NewInteractions(numUsers int, numItems int) Interactions
- func (self *Interactions) Append(userId int, itemId int, timestamp int)
- func (self *Interactions) NumItems() int
- func (self *Interactions) NumUsers() int
- type Loss
- type Optimizer
Constants
const (
	// Bayesian personalised ranking loss.
	BPR Loss = 0
	// Pairwise hinge loss.
	Hinge Loss = 1
	// WARP loss. More accurate in most cases than
	// the other loss functions at the expense of
	// fitting speed.
	WARP Loss = 2
	// ADAM optimizer.
	Adam Optimizer = 0
	// Adagrad optimizer.
	Adagrad Optimizer = 1
)
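To give a feel for what the BPR and Hinge constants select, the sketch below computes the textbook form of both pairwise losses for a single (positive, negative) score pair. The function names are hypothetical and the formulas are the standard definitions from the literature; the package's native implementation is not shown here and may differ in detail. WARP is left out because it additionally depends on a negative-sampling procedure.

```go
package main

import (
	"fmt"
	"math"
)

// bprLoss is the textbook Bayesian personalised ranking loss for one
// (positive, negative) score pair: -log(sigmoid(pos - neg)).
// Hypothetical helper, not part of the sbr API.
func bprLoss(pos, neg float64) float64 {
	// log(1 + exp(-x)) is the same quantity as -log(sigmoid(x)).
	return math.Log1p(math.Exp(-(pos - neg)))
}

// hingeLoss is the textbook pairwise hinge loss with a margin of 1:
// it is zero once the positive item outscores the negative by at least 1.
// Hypothetical helper, not part of the sbr API.
func hingeLoss(pos, neg float64) float64 {
	return math.Max(0, 1-(pos-neg))
}

func main() {
	// A well-ranked pair (positive scored above negative) and a badly ranked one.
	fmt.Printf("good pair: BPR %.4f, hinge %.4f\n", bprLoss(2, 0.5), hingeLoss(2, 0.5))
	fmt.Printf("bad pair:  BPR %.4f, hinge %.4f\n", bprLoss(0.5, 2), hingeLoss(0.5, 2))
}
```

Both losses shrink as the positive item's score pulls ahead of the negative item's score; the hinge loss stops rewarding the model once the margin is met, while the BPR loss keeps decreasing smoothly.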
Variables
This section is empty.
Functions
func TrainTestSplit
func TrainTestSplit(data *Interactions, testFraction float64, rng *rand.Rand) (Interactions, Interactions)
Split the interaction data into training and test sets. The data is split so that there is no overlap between users in training and test sets, making performance evaluation reflect the model's performance on entirely new users.
Returns a tuple of (training, test) data.
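The idea of a split with no user overlap can be sketched in plain Go. Everything below (the interaction type, splitByUser, seededRNG) is hypothetical and only illustrates the principle; it is an assumption that each user is assigned to one side by an independent random draw, and the package's own TrainTestSplit may choose users differently.

```go
package main

import (
	"fmt"
	"math/rand"
)

// interaction is a hypothetical stand-in for one (user, item, timestamp) triple.
type interaction struct {
	User, Item, Timestamp int
}

// seededRNG returns a deterministic random source for reproducible splits.
func seededRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// splitByUser sends each user to the test set with probability testFraction
// and keeps all of that user's interactions on the same side, so no user
// appears in both the training and the test data.
func splitByUser(data []interaction, testFraction float64, rng *rand.Rand) (train, test []interaction) {
	inTest := make(map[int]bool)
	for _, x := range data {
		// Decide once per user, the first time the user is seen.
		if _, seen := inTest[x.User]; !seen {
			inTest[x.User] = rng.Float64() < testFraction
		}
		if inTest[x.User] {
			test = append(test, x)
		} else {
			train = append(train, x)
		}
	}
	return train, test
}

func main() {
	data := []interaction{{0, 10, 1}, {0, 11, 2}, {1, 10, 3}, {2, 12, 4}, {1, 13, 5}}
	train, test := splitByUser(data, 0.2, seededRNG(42))
	fmt.Printf("train len %v, test len %v\n", len(train), len(test))
}
```

Because whole users are held out, the test score measures how well the model handles histories it has never trained on, rather than how well it memorised known users.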
Types
type ImplicitLSTMModel
type ImplicitLSTMModel struct {
	// Number of items in the model.
	NumItems int
	// Maximum sequence length to consider. Setting
	// this to lower values will yield models that
	// are faster to train and evaluate, but have
	// a shorter memory.
	MaxSequenceLength int
	// Dimension of item embeddings. Setting this to
	// higher values will yield models that are slower
	// to fit but are potentially more expressive (at
	// the risk of overfitting).
	ItemEmbeddingDim int
	// Initial learning rate.
	LearningRate float32
	// L2 penalty.
	L2Penalty float32
	// Whether the LSTM should use coupled forget and update
	// gates, yielding a model that's faster to train.
	Coupled bool
	// Number of threads to use for training.
	NumThreads int
	// Number of epochs to use for training. To run more epochs,
	// call the fit method multiple times.
	NumEpochs int
	// Type of loss function to use.
	Loss Loss
	// Optimizer to use.
	Optimizer  Optimizer
	RandomSeed [16]byte
	// contains filtered or unexported fields
}
An implicit-feedback LSTM-based sequence model.
func NewImplicitLSTMModel
func NewImplicitLSTMModel(numItems int) *ImplicitLSTMModel
Build a new model with the capacity to represent a certain number of items. To avoid leaking memory, the model must be freed using its Free method once it is no longer in use.
func (*ImplicitLSTMModel) Fit
func (self *ImplicitLSTMModel) Fit(data *Interactions) (float32, error)
Fit the model on the supplied data, returning the loss value after fitting. Calling this multiple times will resume training.
func (*ImplicitLSTMModel) Free
func (self *ImplicitLSTMModel) Free()
Free the memory associated with the underlying model.
Unlike other methods of the model, calling Free is _not_ thread safe. Use an external synchronisation method when freeing a model used from multiple goroutines.
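One possible form of external synchronisation is a read-write mutex: calls that only use the model take the read lock, and Free takes the write lock so it cannot overlap with them. The sketch below uses a hypothetical stand-in type (resource) in place of the real model, so every name in it is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// resource is a hypothetical stand-in for an object whose Free method
// must not run concurrently with its other methods.
type resource struct{}

func (r *resource) Score() float32 { return 1 }
func (r *resource) Free()          {}

var errFreed = errors.New("resource already freed")

// guarded wraps a resource so that Free cannot overlap with Score.
type guarded struct {
	mu  sync.RWMutex
	res *resource
}

// Score takes the read lock, so many goroutines can score at once.
func (g *guarded) Score() (float32, error) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	if g.res == nil {
		return 0, errFreed
	}
	return g.res.Score(), nil
}

// Free takes the write lock, which waits for in-flight Score calls to
// finish. Calling it more than once is harmless.
func (g *guarded) Free() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.res != nil {
		g.res.Free()
		g.res = nil
	}
}

func main() {
	g := &guarded{res: &resource{}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = g.Score()
		}()
	}
	wg.Wait()
	g.Free()
	_, err := g.Score()
	fmt.Println(err)
}
```

Setting the wrapped pointer to nil after freeing turns a use-after-free into an ordinary error value instead of undefined behaviour in native code.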
func (*ImplicitLSTMModel) MRRScore
func (self *ImplicitLSTMModel) MRRScore(data *Interactions) (float32, error)
Compute the mean reciprocal rank score of the model on supplied interaction data.
Higher MRR values reflect better predictive performance of the model. The score is calculated by taking all but the last interaction of each user as their history, then making predictions for the last item they are going to see.
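As a rough illustration of the metric itself, the sketch below computes a mean reciprocal rank from per-user score vectors and the index of each user's held-out item. The helper names are hypothetical, and the tie-handling rule (tied items do not count against the target) is an assumption of this sketch rather than a documented property of MRRScore.

```go
package main

import "fmt"

// reciprocalRank returns 1/rank of the target item, where rank 1 means no
// other item has a strictly higher score. Items tied with the target are
// not counted against it.
func reciprocalRank(scores []float32, target int) float64 {
	rank := 1
	for item, s := range scores {
		if item != target && s > scores[target] {
			rank++
		}
	}
	return 1 / float64(rank)
}

// meanReciprocalRank averages the reciprocal rank over users. scores[u]
// holds one score per item for user u, and targets[u] is the index of the
// item that user u actually interacted with last.
func meanReciprocalRank(scores [][]float32, targets []int) float64 {
	if len(scores) == 0 {
		return 0
	}
	total := 0.0
	for u := range scores {
		total += reciprocalRank(scores[u], targets[u])
	}
	return total / float64(len(scores))
}

func main() {
	scores := [][]float32{
		{0.1, 0.9, 0.5}, // the held-out item 1 is ranked first
		{0.3, 0.2, 0.8}, // the held-out item 0 is ranked second
	}
	targets := []int{1, 0}
	fmt.Println(meanReciprocalRank(scores, targets)) // prints 0.75
}
```

A score of 1 would mean the held-out item was always ranked first; a score near 0 means it was typically buried far down the ranking.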
func (*ImplicitLSTMModel) MarshalBinary
func (self *ImplicitLSTMModel) MarshalBinary() ([]byte, error)
Serialize the model into a byte slice. Satisfies the encoding.BinaryMarshaler interface.
func (*ImplicitLSTMModel) Predict
func (self *ImplicitLSTMModel) Predict(interactionHistory []int, itemsToScore []int) ([]float32, error)
Make predictions. Provides scores for itemsToScore for a user who has seen interactionHistory items. Items in the history argument should be arranged chronologically, from the earliest seen item to the latest seen item.
Returns a slice of scores for the supplied items, where a higher score indicates a better recommendation.
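Since the returned scores line up with itemsToScore, turning them into a recommendation list is a matter of sorting. The helper below is a hypothetical sketch (topK is not part of the package) that assumes items[i] corresponds to scores[i] and returns the k highest-scoring item ids.

```go
package main

import (
	"fmt"
	"sort"
)

// topK returns up to k item ids ordered from highest to lowest score.
// items[i] is assumed to correspond to scores[i]; ties keep input order.
func topK(items []int, scores []float32, k int) []int {
	// Sort positions rather than the inputs, leaving both slices untouched.
	order := make([]int, len(items))
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(a, b int) bool {
		return scores[order[a]] > scores[order[b]]
	})
	if k < 0 {
		k = 0
	}
	if k > len(order) {
		k = len(order)
	}
	top := make([]int, k)
	for i := 0; i < k; i++ {
		top[i] = items[order[i]]
	}
	return top
}

func main() {
	items := []int{10, 20, 30, 40}
	scores := []float32{0.2, 1.5, -0.3, 0.9}
	fmt.Println(topK(items, scores, 2)) // prints [20 40]
}
```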
func (*ImplicitLSTMModel) UnmarshalBinary
func (self *ImplicitLSTMModel) UnmarshalBinary(data []byte) error
Deserialize the model from a byte slice. Satisfies the encoding.BinaryUnmarshaler interface.
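To show what satisfying encoding.BinaryMarshaler and encoding.BinaryUnmarshaler looks like in practice, here is a round trip on a hypothetical stand-in type (weights). The byte layout used here (a length prefix followed by little-endian float32 values) is invented for the example; the model's real serialization format is not described in this documentation.

```go
package main

import (
	"bytes"
	"encoding"
	"encoding/binary"
	"errors"
	"fmt"
)

// weights is a hypothetical stand-in for a type with serializable state.
type weights struct {
	Values []float32
}

// Compile-time checks that *weights satisfies both standard interfaces.
var (
	_ encoding.BinaryMarshaler   = (*weights)(nil)
	_ encoding.BinaryUnmarshaler = (*weights)(nil)
)

// MarshalBinary writes a uint32 length followed by the values.
func (w *weights) MarshalBinary() ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, uint32(len(w.Values))); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, w.Values); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// UnmarshalBinary reads the layout written by MarshalBinary and rejects
// input whose length does not match the declared number of values.
func (w *weights) UnmarshalBinary(data []byte) error {
	r := bytes.NewReader(data)
	var n uint32
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return err
	}
	if uint64(n)*4 != uint64(r.Len()) {
		return errors.New("weights: payload length does not match header")
	}
	values := make([]float32, n)
	if err := binary.Read(r, binary.LittleEndian, values); err != nil {
		return err
	}
	w.Values = values
	return nil
}

func main() {
	src := &weights{Values: []float32{0.5, -1.25, 3}}
	blob, err := src.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var dst weights
	if err := dst.UnmarshalBinary(blob); err != nil {
		panic(err)
	}
	fmt.Println(len(blob), dst.Values)
}
```

Any type that implements this pair of methods can be written to a file or sent over the network as a plain []byte and restored later by code that only knows the two interfaces.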
type Indexer
type Indexer struct {
// contains filtered or unexported fields
}
Helper for translating user and item ids into contiguous indices.
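The Indexer's methods are not listed here, so the sketch below does not use its API. It only illustrates the general technique of translating arbitrary ids into contiguous indices, using a hypothetical idIndexer type that assigns indices in order of first appearance and can map them back.

```go
package main

import "fmt"

// idIndexer is a hypothetical helper that maps arbitrary string ids to
// contiguous integer indices, assigned in order of first appearance.
type idIndexer struct {
	toIndex map[string]int
	toID    []string
}

func newIDIndexer() *idIndexer {
	return &idIndexer{toIndex: make(map[string]int)}
}

// Index returns the index for id, assigning the next free one on first sight.
func (ix *idIndexer) Index(id string) int {
	if idx, ok := ix.toIndex[id]; ok {
		return idx
	}
	idx := len(ix.toID)
	ix.toIndex[id] = idx
	ix.toID = append(ix.toID, id)
	return idx
}

// ID translates an index back into the original id, if it is known.
func (ix *idIndexer) ID(idx int) (string, bool) {
	if idx < 0 || idx >= len(ix.toID) {
		return "", false
	}
	return ix.toID[idx], true
}

// Len reports how many distinct ids have been seen.
func (ix *idIndexer) Len() int { return len(ix.toID) }

func main() {
	items := newIDIndexer()
	for _, sku := range []string{"sku-981", "sku-17", "sku-981", "sku-3"} {
		fmt.Printf("%s -> %d\n", sku, items.Index(sku))
	}
	fmt.Println("distinct items:", items.Len())
}
```

Contiguous indices matter because a model built for numItems items can only address items numbered from 0 up to numItems-1, whatever ids the source data uses.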
type Interactions
type Interactions struct {
// contains filtered or unexported fields
}
Contains interactions for training the model.
func GetMovielens
func GetMovielens() (*Interactions, error)
Download and return the Movielens 100K dataset.
func NewInteractions
func NewInteractions(numUsers int, numItems int) Interactions
Construct new empty interactions.
func (*Interactions) Append
func (self *Interactions) Append(userId int, itemId int, timestamp int)
Add a (user, item, timestamp) triple to the dataset.
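One way to picture how timestamped triples relate to the chronologically ordered histories a sequence model works with is to group them by user and sort each group by time. The sketch below does this with hypothetical names (triple, sequencesByUser); it is an illustration of the data shape, not a description of how Interactions stores data internally.

```go
package main

import (
	"fmt"
	"sort"
)

// triple is a hypothetical stand-in for one (user, item, timestamp) record.
type triple struct {
	User, Item, Timestamp int
}

// sequencesByUser groups triples by user and orders each user's items
// from the earliest to the latest timestamp.
func sequencesByUser(data []triple) map[int][]int {
	byUser := make(map[int][]triple)
	for _, x := range data {
		byUser[x.User] = append(byUser[x.User], x)
	}
	seqs := make(map[int][]int, len(byUser))
	for user, xs := range byUser {
		// Stable sort keeps insertion order for equal timestamps.
		sort.SliceStable(xs, func(a, b int) bool {
			return xs[a].Timestamp < xs[b].Timestamp
		})
		items := make([]int, len(xs))
		for i, x := range xs {
			items[i] = x.Item
		}
		seqs[user] = items
	}
	return seqs
}

func main() {
	data := []triple{{0, 5, 30}, {1, 7, 10}, {0, 6, 10}, {0, 4, 20}}
	seqs := sequencesByUser(data)
	fmt.Println("user 0:", seqs[0])
	fmt.Println("user 1:", seqs[1])
}
```

With this shape, the order in which triples are appended does not matter: the timestamps alone determine each user's sequence.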
func (*Interactions) NumItems
func (self *Interactions) NumItems() int
Get the total number of distinct items in the data.
func (*Interactions) NumUsers
func (self *Interactions) NumUsers() int
Get the total number of distinct users in the data.