Using the bootstrap for bias reduction
I came across this simple example in Horowitz (2001, p. 3174) that demonstrates that (in these specific circumstances at least), the bias-corrected bootstrap estimator has lower MSE by a large factor. The setup is as follows. We have a sample of 10 iid observations, where \( X_{i}\sim N(0,6)\). The goal is then to estimate \( \theta=\exp(\text{E}[X_{i}])\), for which the true value is \( \theta=1\). The plug-in estimator is \( \hat{\theta}=\exp\left(\frac{1}{10}\sum_{i=1}^{10}X_{i}\right)\).
Given a realized sample \( \mathbf{x}=(x_{1},\ldots,x_{n})\), the usual bootstrap estimates are obtained by resampling \( m\) times from \( \mathbf{x}\) with replacement, generating the bootstrap samples \( \mathbf{x}_{j}^{*}\) and the bootstrap estimates \( \hat{\theta}_{j}^{*}=\exp\left(\frac{1}{10}\sum_{i=1}^{10}x_{ij}^{*}\right)\). Let \( \hat{\theta}^{*}=\frac{1}{m}\sum_{j=1}^{m}\hat{\theta}_{j}^{*}\) be the average across all \( \hat{\theta}_{j}^{*}\). We can then estimate the bias as \( \widehat{\text{Bias}}[\hat{\theta}]=\hat{\theta}^{*}-\hat{\theta}\). In R code, this is:
```r
set.seed(1)
data = rnorm(10, 0, sqrt(6))
(thetahat = exp(mean(data)))
#> [1] 1.382411
bs = replicate(1000, {
  resample = sample(data, 10, replace = TRUE)
  exp(mean(resample))
})
(biashat = mean(bs) - thetahat)
#> [1] 0.2973734
```
The “debiased” estimate would hence be \( \hat{\theta}-\widehat{\text{Bias}}[\hat{\theta}]=2\hat{\theta}-\hat{\theta}^{*}\). For the concrete result, this is \( 1.382-0.297=1.085\), much closer to the true value \( \theta=1\).
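Plugging in the concrete numbers, both forms of the correction agree; a quick check (sketched in Python for illustration, with the values taken from the R output above):

```python
# Values taken from the R output above.
thetahat = 1.382411  # plug-in estimate exp(mean(data))
biashat = 0.2973734  # bootstrap bias estimate mean(bs) - thetahat
thetastar = thetahat + biashat  # bootstrap mean, since biashat = thetastar - thetahat

debiased_a = thetahat - biashat        # theta_hat - estimated bias
debiased_b = 2 * thetahat - thetastar  # equivalent form 2*theta_hat - theta_star
print(round(debiased_a, 3), round(debiased_b, 3))  # 1.085 1.085
```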
Because we control the data-generating process and know the true value of \( \theta\), we can repeat the above procedure any number of times and obtain approximations for the MSEs of \( \hat{\theta}\) and \( \hat{\theta}-\widehat{\text{Bias}}[\hat{\theta}]\). The following code accomplishes that for 100 repetitions:
```r
res = replicate(100, {
  data = rnorm(10, 0, sqrt(6))
  thetahat = exp(mean(data))
  bs = replicate(1000, {
    resample = sample(data, 10, replace = TRUE)
    exp(mean(resample))
  })
  debiased = 2 * thetahat - mean(bs)
  c(thetahat - 1, debiased - 1, (thetahat - 1)^2, (debiased - 1)^2)
})
apply(res, 1, mean)
#> [1] 0.37878143 0.04919049 1.10729457 0.47833810
```
By making use of the identity \( \text{MSE}[\cdot]=\text{Bias}^{2}[\cdot]+\text{Var}[\cdot]\), we obtain the following results:
| Estimator | MSE | Bias | Variance |
|---|---|---|---|
| \( \hat{\theta}\) | 1.107 | 0.379 | 0.964 |
| \( \hat{\theta}-\widehat{\text{Bias}}[\hat{\theta}]\) | 0.478 | 0.049 | 0.476 |
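The variance column follows directly from that identity: \( \text{Var} = \text{MSE} - \text{Bias}^{2}\). A minimal check (in Python for illustration, using the numbers printed by the simulation):

```python
# Row means printed by apply(res, 1, mean) above: bias then MSE for each estimator.
bias_plugin, bias_debiased = 0.37878143, 0.04919049
mse_plugin, mse_debiased = 1.10729457, 0.47833810

# Var = MSE - Bias^2, from the identity MSE = Bias^2 + Var.
var_plugin = mse_plugin - bias_plugin**2
var_debiased = mse_debiased - bias_debiased**2
print(round(var_plugin, 3), round(var_debiased, 3))  # 0.964 0.476
```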
Similar to the results reported in Horowitz (2001, p. 3175), there is a large reduction in both bias and MSE. Not reported by Horowitz, but also significant, is the reduction in variance. The true bias[^1] of \( \hat{\theta}\) is \( \exp(0.3) - 1 \approx 0.35 \), so the simulation estimate is not far off.
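The true bias can also be checked by brute force: simulate many samples of 10 draws from \( N(0,6)\) and average the resulting \( \hat{\theta}\). A small Monte Carlo sketch in Python (the repetition count is arbitrary):

```python
import math
import random

random.seed(0)

# Approximate E[theta_hat], where theta_hat = exp(mean of 10 draws from N(0, 6)).
reps = 100_000
total = 0.0
for _ in range(reps):
    xbar = sum(random.gauss(0, math.sqrt(6)) for _ in range(10)) / 10
    total += math.exp(xbar)

approx_bias = total / reps - 1  # true theta is 1
print(round(approx_bias, 2), round(math.exp(0.3) - 1, 2))
```

The Monte Carlo estimate should land close to the closed-form value \( \exp(0.3) - 1 \approx 0.35\).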
References
Horowitz, Joel L. 2001. “The Bootstrap.” In: Handbook of Econometrics, Volume 5, edited by J. J. Heckman and E. Leamer. Elsevier.

[^1]: Let \( Y = \frac{1}{10} \sum_{i=1}^{10} X_i\); then \( Y \sim N(0, 0.6)\), and \( \hat{\theta}= \exp(Y) \sim \text{LogNormal}(0, 0.6)\). A lognormal random variable with parameters \( \mu\) and \( \sigma^{2}\) has mean \( \exp\left(\mu + \frac{\sigma^2}{2}\right)\), hence \( \text{E}[\hat{\theta}] = \exp(0.3)\).