user_predict: Get personalised recommendations from a SAR model

Description

Get personalised recommendations from a SAR model

Usage

user_predict(
  object,
  userdata = NULL,
  k = 10,
  include_seed_items = FALSE,
  backfill = FALSE,
  reftime
)
set_sar_threads(n_threads)

Value

For user_predict, a data frame containing one row per user ID supplied (or if no IDs are supplied, exactly one row).

Arguments

object: A SAR model object.
userdata: A vector of user IDs, or a data frame containing user IDs and/or transactions. See below for the various ways to supply user information for predicting, and how they affect the results.
k: The number of recommendations to obtain.
include_seed_items: Whether items a user has already seen should be considered for recommendations.
backfill: Whether to backfill recommendations with popular items.
reftime: The reference time for discounting timestamps. If not supplied, defaults to the latest date in the training data and any new transactions supplied.
n_threads: For set_sar_threads, the number of threads to use. Defaults to half the number of logical cores.

Details

The SAR model can produce personalised recommendations for a user, given a history of their transactions. This history can come from the original training data, from new events, or from both, depending on the contents of the userdata argument:

1. A character vector of user IDs. In this case, personalised recommendations will be computed based on the transactions in the training data, ignoring any transaction event IDs or weights.
2. A data frame containing transaction item IDs, event types and/or weights, plus timestamps. In this case, all the transactions are assumed to be for a single (new) user. If the event types/weights are absent, all transactions are assigned equal weight.
3. A data frame containing user IDs and transaction details as in (2). In this case, the recommendations are based on both the training data for the given user(s), plus the new transaction details.

In SAR, the first step in obtaining personalised recommendations is to compute a user-to-item affinity matrix $A$. This is essentially a weighted crosstabulation with one row per unique user ID and one column per item ID. The cells in the crosstab are given by the formula $$\sum \mathrm{wt} \cdot 2^{-(t_0 - \mathrm{time}) / \mathrm{half\_life}}$$ where the sum runs over the user's transactions for that item, wt is obtained from the weight and event columns in the data, time is the transaction timestamp, and $t_0$ is the reference time (see reftime).
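As an illustration of the affinity formula, the following base R sketch builds a small affinity matrix by hand. The toy transactions, the column names (user, item, time, wt) and the 30-day half-life are assumptions made for this example; it is not the package's internal implementation.

```r
# Toy transaction log; all values and column names are illustrative only.
tx <- data.frame(
  user = c("u1", "u1", "u2", "u2", "u2"),
  item = c("a", "b", "a", "a", "c"),
  time = as.Date(c("2024-01-31", "2024-01-01", "2024-01-31",
                   "2024-01-16", "2024-01-01")),
  wt   = c(1, 1, 1, 2, 1)
)

half_life <- 30       # in days; an assumed value for this sketch
t0 <- max(tx$time)    # reference time: the latest timestamp in the data

# Age of each transaction relative to t0, in days.
age_days <- as.numeric(t0 - tx$time)

# Time-decayed weight: halves for every half_life days of age.
tx$decayed <- tx$wt * 2^(-age_days / half_life)

# Weighted crosstabulation: one row per user, one column per item.
A <- xtabs(decayed ~ user + item, data = tx)
```

A transaction made exactly one half-life before the reference time contributes half its weight, so here the affinity of u1 for item b is 0.5, while repeated transactions for the same item (u2 and item a) are summed.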

The product of this matrix with the item similarity matrix $S$ then gives a matrix of recommendation scores, with one row per user and one column per item. The recommendation scores are sorted, any items that the user has previously seen are optionally removed (see include_seed_items), and the top k items are returned as the recommendations.
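The scoring and ranking step can be sketched in base R as follows. The matrices are toy values rather than output from a fitted model, and top_k is a hypothetical helper written for this example; the package's own implementation may differ.

```r
# Toy inputs; all values are illustrative, not output from a fitted model.
items <- c("a", "b", "c", "d")
A <- matrix(c(1, 0.5, 0, 0,
              0, 0,   2, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("u1", "u2"), items))
S <- matrix(c(1.0, 0.2, 0.4, 0.1,
              0.2, 1.0, 0.3, 0.6,
              0.4, 0.3, 1.0, 0.5,
              0.1, 0.6, 0.5, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(items, items))

# Recommendation scores: one row per user, one column per item.
scores <- A %*% S

# Hypothetical helper: return the k best-scoring item IDs for each user.
top_k <- function(scores, A, k, include_seed_items = FALSE) {
  # Optionally exclude items the user already has an affinity for.
  if (!include_seed_items) scores[A > 0] <- -Inf
  users <- rownames(scores)
  lapply(setNames(users, users), function(u) {
    s <- scores[u, ]
    s <- s[is.finite(s)]
    names(sort(s, decreasing = TRUE))[seq_len(min(k, length(s)))]
  })
}

recs <- top_k(scores, A, k = 2)
```

In this toy case u1 has already seen items a and b, so the two recommendations are drawn from c and d only. Passing include_seed_items = TRUE to the helper would let the seen items compete as well.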

This second step, scoring and ranking, is the most computationally expensive part of the algorithm. SAR can execute it in multithreaded fashion, with the default number of threads being half the number of (logical) cores. Use the set_sar_threads function to set the number of threads to use.
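To see what half the logical cores amounts to on a given machine, a rough check can be made with the base parallel package. This is only a sketch: how SAR itself detects the core count is not stated here, and the fallback to one thread is an assumption of this example.

```r
# Number of logical cores reported by R; may be NA on some platforms.
n_logical <- parallel::detectCores(logical = TRUE)

# Half the logical cores, with an assumed floor of one thread.
n_threads <- if (is.na(n_logical)) 1L else max(1L, n_logical %/% 2L)

# With the SAR package loaded, this value could then be passed on:
# set_sar_threads(n_threads)
```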

Examples

data(ms_usage)
mod <- sar(ms_usage)

# item recommendations given a vector of user IDs
users <- unique(ms_usage$user)[1:5]
user_predict(mod, userdata=users)

# item recommendations given a set of user IDs and transactions (assumed to be new)
user_df <- subset(ms_usage, user %in% users)
user_predict(mod, userdata=user_df)

# item recommendations for a set of item IDs
items <- unique(ms_usage$item)[1:5]
item_predict(mod, items=items)

# setting the number of threads to use when computing recommendations
set_sar_threads(2)