1 year ago
#380317
Khánh Nguyễn
Bootstrap function for dataframe - passing a function as an argument in R
I am trying to create a bootstrap function for my assignment. The requirement is as follows:
Compute the bootstrap standard error for:
- mean()
- median()
- the top quartile
- max()
- the standard deviation of the price.
One way to approach this is to define a new function for each. Another is to write a bootstrap_func function that takes an additional argument called fun, and then you call it as bootstrap_func(B, v, median) to have the same effect as bootstrap_median. Implement this function bootstrap_func. Example call to this function: bootstrap_func(1000, vienna_data$price, mean).
Generalize the function further so that the second argument ($v$) can be a vector or a dataframe. Therefore, the third argument can be a function that takes a vector (such as mean) or a function that takes a dataframe and returns some number (such as a function that computes a linear model and returns the estimate of the linear model). Use this new function to compute bootstrap estimators for the standard errors of some linear model coefficients on the vienna dataset, e.g. the effect of stars on prices. You have to define and name a function that returns the coefficient of the right linear model (say estimate_of_stars_on_prices <- ...), and pass this function as one of the arguments to bootstrap_func.
I wrote the bootstrap function for the vector case like this:
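# draw one bootstrap sample (same length as v, with replacement)
sim <- function(v) {
  sample(v, replace = TRUE)
}
# standard deviation of fun over B bootstrap samples
bootstrap_func <- function(B, v, fun) {
  sd(replicate(B, fun(sim(v))))
}
# top quartile (75th percentile)
quartile <- function(x) {
  quantile(x, 0.75)
}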
So I can call it like this, for example:
bootstrap_func(100, hotels_vienna$price, mean)
bootstrap_func(100, hotels_vienna$price, quartile)
I think this works well enough, but I have trouble generalizing it to also accept a dataframe and a function that returns the coefficient. My function to get the coefficient is:
coef <- function(v, y, x) {
  Y <- v[,y]
  X <- v[,x]
  lmm <- lm(Y ~ X, v)
  lmm$coefficients[[2]]
}
coef(hotels_vienna, 2, 12) # this works, col2 = price, col12 = distance, result = -22.78177
This is my attempt at the generalized code:
df_bootstrap_func <- function(B, v, fun, ...) {
  new_v <- function(v) {sample(v, replace = TRUE)}
  sd(replicate(B, fun(new_v)))
}
df_bootstrap_func(100, hotels_vienna, coef)
# does not work, throws: Error in v[, y] : object of type 'closure' is not subsettable
I have tried multiple versions of df_bootstrap_func without success, so I think I need a new approach to the coefficient function. I appreciate any input. Thanks in advance.
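One possible direction, as a minimal sketch rather than a definitive answer. The attempt above passes the function new_v itself to fun instead of a resampled dataset, which is what produces the 'closure' error. Also, sample() applied to a dataframe resamples its columns, not its rows. The sketch below resamples rows when v is a dataframe and forwards any extra arguments to fun. The helper name resample, the synthetic toy_hotels data, and the column names price and stars are assumptions for illustration; they are not taken from the real hotels_vienna dataset.

```r
# Resample with replacement: rows for a dataframe, elements for a vector.
# (Hypothetical helper name; assumes a vector v has more than one element.)
resample <- function(v) {
  if (is.data.frame(v)) {
    v[sample(nrow(v), replace = TRUE), , drop = FALSE]
  } else {
    sample(v, replace = TRUE)
  }
}

# Bootstrap standard error of fun(v); extra arguments in ... are passed to fun.
# fun is assumed to return a single number.
bootstrap_func <- function(B, v, fun, ...) {
  estimates <- vapply(seq_len(B), function(i) fun(resample(v), ...), numeric(1))
  sd(estimates)
}

# Estimator that takes a dataframe and returns one coefficient.
# Assumes columns named price and stars exist in df.
estimate_of_stars_on_prices <- function(df) {
  fit <- lm(price ~ stars, data = df)
  stats::coef(fit)[["stars"]]
}

# Synthetic stand-in for the hotel data (made-up numbers, illustration only).
set.seed(1)
toy_hotels <- data.frame(stars = sample(1:5, 200, replace = TRUE))
toy_hotels$price <- 50 + 20 * toy_hotels$stars + rnorm(200, sd = 10)

bootstrap_func(200, toy_hotels$price, mean)                   # vector case
bootstrap_func(200, toy_hotels, estimate_of_stars_on_prices)  # dataframe case
```

With the ... forwarding, a function that needs extra arguments, such as the three-argument coefficient function above, could presumably be called as bootstrap_func(100, hotels_vienna, coef, 2, 12). The sketch uses vapply rather than replicate because replicate wraps its expression in its own function(...), so a ... written inside it would not refer to the outer function's extra arguments. Note also that a user-defined function named coef masks stats::coef, which is why the sketch calls stats::coef explicitly.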
r
function
lm
statistics-bootstrap
coefficients
0 Answers