1 year ago

#375042

infinitefactors

Shapley values for glmnet using R's iml package

I'm unable to use the iml package in R to compute Shapley values for glmnet models.

The problem might be that glmnet() and predict.glmnet() expect matrices, while the x.interest argument of iml::Shapley$new() expects a data frame, so something may be getting converted incorrectly along the way. I'm not sure, though.

The most reasonable thing I've tried is below. Because of the following note in the iml::Predictor documentation, I make sure my prediction function returns estimated probabilities for both classes: "Note: In case of classification, the model should return one column per class with the class probability."

library(dplyr)
library(iml)
library(glmnet)
df <- filter(iris, Species != 'setosa')
X <- as.matrix(select(df, -Species))
y <- droplevels(df$Species)
fit <- glmnet(X, y, family = 'binomial', lambda = 0.03)

predfun <- function(model, newdata) {
  preds <- predict(model, as.matrix(newdata), type = 'response') # probabilities
  return(cbind(1 - preds, preds)) # for both classes 
}

# Pass data frames, as requested
mod <- Predictor$new(fit, as.data.frame(X), predict.function = predfun) 
shapley <- Shapley$new(mod, x.interest = as.data.frame(X[1, ]))

This gives me the following: Error in predict.glmnet(model, as.matrix(newdata), type = "response"): The number of variables in newx must be 4

I'm not sure what is being passed to predict.glmnet() that doesn't have four variables (from what I've tried, it doesn't seem to be related to the intercept). I've looked at the source code for Shapley$new() and stepped through a call with browser() for quite a while, but I wasn't able to come up with anything useful.
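
One thing that may be worth checking is the shape of the object passed as x.interest. The sketch below uses only base R and the same iris subset as above; the names bad and good are illustrative, not part of iml or glmnet. It shows that indexing a single row with X[1, ] drops the matrix to a named vector, so as.data.frame() produces a data frame with four rows and one column instead of one row and four columns. Whether this fully explains the error is an assumption that has not been verified against iml here.

```r
# Same data as in the question, built with base R only
df <- subset(iris, Species != "setosa")
X <- as.matrix(df[, names(df) != "Species"])

# X[1, ] drops to a named numeric vector of length 4,
# so as.data.frame() stacks it into a single column
bad <- as.data.frame(X[1, ])
dim(bad)   # 4 1

# drop = FALSE keeps the 1 x 4 matrix shape and the column names
good <- as.data.frame(X[1, , drop = FALSE])
dim(good)  # 1 4
colnames(good)
```

If this is the cause, passing x.interest = as.data.frame(X[1, , drop = FALSE]) to Shapley$new() would keep all four predictors as columns; that change is a suggestion under the assumption above, not a confirmed fix.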

Any ideas? Thank you!

r

glmnet

iml

shapley

0 Answers
