Create an explainer object with Shapley weights for test data.

shapr(x, model, n_combinations = NULL, group = NULL)



Numeric matrix or data.frame/data.table. Contains the data used to estimate the (conditional) distributions for the features needed to properly estimate the conditional expectations in the Shapley formula.


The model whose predictions we want to explain. Run shapr:::get_supported_models() for a table of which models shapr supports natively.


Integer. The number of feature combinations to sample. If NULL, the exact method is used and all combinations are considered. The maximum number of combinations equals 2^ncol(x).


List. If NULL, regular feature-wise Shapley values are computed. If provided, group-wise Shapley values are computed, and group has length equal to the number of groups. Each list element is a character vector naming the features included in that group.
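
As a rough illustration of the last two arguments, the base R sketch below counts the combinations the exact method would consider and builds a group list in the expected shape. The feature names are borrowed from the example further down, and `check_groups` is a hypothetical helper that is not part of shapr. That every feature should appear in exactly one group is an assumption made here for illustration.

```r
# Feature names used only for illustration
features <- c("lstat", "rm", "dis", "indus")

# Number of combinations the exact method considers: 2^m
n_exact <- 2^length(features)

# A list with one character vector of feature names per group
group <- list(A = c("lstat", "rm"), B = c("dis", "indus"))

# With groups, combinations are formed over groups rather than features
n_group_exact <- 2^length(group)

# Hypothetical helper (not part of shapr): TRUE if every feature
# appears in exactly one group
check_groups <- function(group, features) {
  grouped <- unlist(group, use.names = FALSE)
  !anyDuplicated(grouped) && setequal(grouped, features)
}

check_groups(group, features)
```

Grouping four features into two groups reduces the number of combinations from 2^4 to 2^2, which is why the group example below reports far fewer rows than the feature-wise one.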


Named list that contains the following items:


Boolean. Equals TRUE if n_combinations = NULL or n_combinations >= 2^ncol(x), otherwise FALSE.


Positive integer. The number of columns in x.


Binary matrix. The number of rows equals the number of unique combinations, and the number of columns equals the total number of features. For example, with three features there are 2^3 = 8 unique combinations. If the entry in the i-th row and j-th column equals 1, the j-th feature is present in the i-th combination. Otherwise it equals 0.


Matrix. This matrix is equal to the matrix R_D in Equation 7 in the reference of explain. The Shapley values for a test observation equal the matrix-vector product of W and the contribution vector.


data.table. Returned object from feature_combinations


data.table. x converted to a data.table.


List. The updated_feature_list output from preprocess_data

In addition to the items above, model and n_combinations are also present in the returned object.
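
To make the binary combination matrix and the weight matrix W more concrete, here is a minimal base R sketch for three features. It assumes that R_D has the usual Kernel SHAP weighted least squares form, (Z' D Z)^{-1} Z' D, where Z is the combination matrix with a leading column of ones and D holds the Shapley kernel weights. The variable names, the stand-in weight of 1e6 for the empty and full combinations, and the additive toy contribution vector are all assumptions for illustration. shapr's internal computation may differ.

```r
m <- 3  # number of features in this toy example

# S: one row per combination, one column per feature (1 = present)
S <- as.matrix(expand.grid(rep(list(0:1), m)))
S <- S[order(rowSums(S)), , drop = FALSE]
colnames(S) <- paste0("x", seq_len(m))

# Shapley kernel weight per combination; the empty and full combinations
# get a large finite weight (1e6 is an assumed stand-in for infinity)
size <- rowSums(S)
w <- ifelse(size == 0 | size == m, 1e6,
            (m - 1) / (choose(m, size) * size * (m - size)))

# Weighted least squares solution matrix: (Z' D Z)^{-1} Z' D, D = diag(w)
Z <- cbind(1, S)
W <- solve(t(Z) %*% (w * Z)) %*% t(w * Z)

# Contribution vector of an additive toy model: v(S) = 10 + sum of effects
effects <- c(2, -1, 0.5)
v <- 10 + as.vector(S %*% effects)

# First entry is the baseline, the remaining entries are the Shapley values
phi <- as.vector(W %*% v)
```

For an additive contribution function like this one, the product recovers the baseline and the individual effects up to numerical error, which makes it a convenient sanity check for the construction.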


Nikolai Sellereite


if (requireNamespace("MASS", quietly = TRUE)) {
  # Load example data
  data("Boston", package = "MASS")
  df <- Boston

  # Example using the exact method
  x_var <- c("lstat", "rm", "dis", "indus")
  y_var <- "medv"
  df0 <- df[, x_var]
  model <- lm(medv ~ lstat + rm + dis + indus, data = df)
  explainer <- shapr(df0, model)

  print(nrow(explainer$X))
  # 16 (which equals 2^4)

  # Example using approximation
  y_var <- "medv"
  model <- lm(medv ~ ., data = df)
  explainer <- shapr(df, model, n_combinations = 1e3)

  print(nrow(explainer$X))

  # Example using approximation where n_combinations > 2^m
  x_var <- c("lstat", "rm", "dis", "indus")
  y_var <- "medv"
  model <- lm(medv ~ lstat + rm + dis + indus, data = df)
  explainer <- shapr(df0, model, n_combinations = 1e3)

  print(nrow(explainer$X))
  # 16 (which equals 2^4)

  # Example using groups
  group <- list(A = x_var[1:2], B = x_var[3:4])

  explainer_group <- shapr(df0, model, group = group)
  print(nrow(explainer_group$X))
  # 4 (which equals 2^(#groups))
}
#> [1] 16
#> Success with message:
#> The columns(s) medv is not used by the model and thus removed from the data.
#> [1] 572
#> Success with message:
#> n_combinations is larger than or equal to 2^m = 16. 
#> Using exact instead.
#> [1] 16
#> [1] 4