1 year ago

#380317


Khánh Nguyễn

Bootstrap function for dataframe - passing a function as an argument in R

I am trying to create a bootstrap function for my assignment. The requirement is as follows:

Compute the bootstrap standard error for:

- mean() and
- median() and
- the top quartile and
- max() and
- the standard deviation of the price.

One way to approach this is to define a new function for each. Another is to write a bootstrap_func function that takes an additional argument called fun, and then you call it bootstrap_func(B, v, median) to have the same effect as bootstrap_median. Implement this function bootstrap_func. Example call to this function: bootstrap_func(1000, vienna_data$price, mean).

Generalize the function further so that the second argument (v) can be a vector or a dataframe. Therefore, the third argument can be a function that takes a vector, such as mean, or a function that takes a dataframe and returns some number, such as a function that computes a linear model and returns the estimate of the linear model.

Use this new function to compute bootstrap estimators for the standard errors of some linear model coefficients on the vienna dataset, e.g. the effect of stars on prices. You have to define and name a function that returns the coefficient of the right linear model (say estimate_of_stars_on_prices <- ...), and pass this function as one of the arguments to bootstrap_func.

I wrote the bootstrap function for the vector case like this:

sim <- function(v) {
  sample(v, replace = TRUE)
}
bootstrap_func <- function(B, v, fun) {
  sd(replicate(B, fun(sim(v))))
}
quartile <- function(x) {quantile(x, 0.75)}

So I can call it like this:

bootstrap_func(100, hotels_vienna$price, mean)
bootstrap_func(100, hotels_vienna$price, quartile)
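
One way to sanity-check the vector version: for the mean, the bootstrap standard error should land close to sd(v) / sqrt(length(v)). Below is a minimal sketch of that check. The `prices` vector is synthetic data made up for illustration, not the hotels_vienna data, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed so the resampling is reproducible

# Hypothetical stand-in for hotels_vienna$price (synthetic data)
prices <- rnorm(200, mean = 120, sd = 40)

# Same definitions as in the question
sim <- function(v) {
  sample(v, replace = TRUE)
}
bootstrap_func <- function(B, v, fun) {
  sd(replicate(B, fun(sim(v))))
}

# Bootstrap standard error of the mean vs. the textbook formula
boot_se <- bootstrap_func(2000, prices, mean)
analytic_se <- sd(prices) / sqrt(length(prices))
```

If the two numbers differ by much more than a few percent, the resampling step is the first place to look.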

I think it works well enough, but I am having trouble generalizing it to also accept a dataframe and a function that returns the coefficient. My function to get the coefficient is:

coef <- function(v, y, x) {
  Y <- v[,y]
  X <- v[,x]
  lmm <- lm(Y ~ X, v)
  lmm$coefficients[[2]]
}
coef(hotels_vienna, 2, 12) # this works, col2 = price, col12= distance, result = -22.78177

This is my attempt at the generalized code:

df_bootstrap_func <- function(B, v, fun, ...) {
  new_v <- function(v) {sample(v, replace = TRUE)}
  sd(replicate(B, fun(new_v)))
}
df_bootstrap_func(100, hotels_vienna, coef) 
  # does not work; throws: Error in v[, y] : object of type 'closure' is not subsettable

I have tried multiple versions of df_bootstrap_func without success, so I think I need a new approach to the coefficient function. I would appreciate any input. TIA.
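
For reference, the error message points at a likely cause: inside df_bootstrap_func, new_v is defined as a function and then passed to fun without being called, so fun receives a closure rather than resampled data. Separately, sample() applied to a dataframe resamples its columns, not its rows. The following is one possible sketch of a generalized version under those two observations. The helper names `resample`, `slope_of`, and `estimate_of_distance_on_prices`, as well as the `toy` dataframe, are hypothetical and chosen for illustration. The coefficient function is deliberately not called `coef`, so that it does not mask stats::coef.

```r
# One bootstrap draw: resample rows of a data frame, or elements of a vector
resample <- function(v) {
  if (is.data.frame(v)) {
    v[sample(nrow(v), replace = TRUE), , drop = FALSE]
  } else {
    sample(v, replace = TRUE)
  }
}

# Extra arguments in ... are forwarded to fun. vapply with an explicit
# function(i) is used so that ... refers to bootstrap_func's own arguments.
bootstrap_func <- function(B, v, fun, ...) {
  sd(vapply(seq_len(B), function(i) fun(resample(v), ...), numeric(1)))
}

# Slope of a simple linear regression of column y on column x
slope_of <- function(df, y, x) {
  fit <- lm(df[[y]] ~ df[[x]])
  unname(coef(fit)[2])
}

# Synthetic data standing in for hotels_vienna (assumption, not the real data)
set.seed(1)
distance <- runif(150, min = 0, max = 5)
price <- 150 - 20 * distance + rnorm(150, sd = 15)
toy <- data.frame(price = price, distance = distance)

# Data frame case, passing the column names through ...
se_slope <- bootstrap_func(500, toy, slope_of, y = "price", x = "distance")

# Same thing with a named one-argument function, as the assignment asks
estimate_of_distance_on_prices <- function(df) slope_of(df, "price", "distance")
se_slope2 <- bootstrap_func(500, toy, estimate_of_distance_on_prices)

# Vector case still works with the same function
se_mean <- bootstrap_func(500, toy$price, mean)
```

Resampling whole rows keeps each price paired with its own distance, which is what the regression needs. Resampling the two columns independently would break that pairing.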

r

function

lm

statistics-bootstrap

coefficients

0 Answers
