bandwidth
Bandwidth Selectors for Kernel Density Estimation
Description
Bandwidth selectors for Gaussian kernels in density.
Usage
bw.nrd0(x)

bw.nrd(x)

bw.ucv(x, nb = 1000, lower = 0.1 * hmax, upper = hmax,
       tol = 0.1 * lower)

bw.bcv(x, nb = 1000, lower = 0.1 * hmax, upper = hmax,
       tol = 0.1 * lower)

bw.SJ(x, nb = 1000, lower = 0.1 * hmax, upper = hmax,
      method = c("ste", "dpi"), tol = 0.1 * lower)
Arguments
x | numeric vector. |
nb | number of bins to use. |
lower, upper | range over which to minimize. The default is almost always satisfactory. hmax is calculated internally from a normal reference bandwidth. |
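method | either "ste" ("solve-the-equation") or "dpi" ("direct plug-in"). Can be abbreviated. |
tol | for method "ste", the convergence tolerance for uniroot. |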
Details
bw.nrd0 implements a rule of thumb for choosing the bandwidth of a Gaussian kernel density estimator. It defaults to 0.9 times the minimum of the standard deviation and the interquartile range divided by 1.34, multiplied by the sample size to the negative one-fifth power (Silverman's ‘rule of thumb’, Silverman (1986, page 48, eqn (3.31))), unless the quartiles coincide, in which case a positive result is guaranteed.
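As a hedged illustration, the rule described above can be reproduced by hand for data whose quartiles do not coincide. The names x, spread and h_manual are illustrative only, and precip is the built-in dataset used in the Examples.

```r
# Sketch: reproduce the bw.nrd0 rule of thumb by hand.
# Assumes the quartiles of x do not coincide, so no fallback is needed.
x <- precip

# Smaller of the standard deviation and the scaled interquartile range
spread <- min(sd(x), IQR(x) / 1.34)

# 0.9 * spread * n^(-1/5)
h_manual <- 0.9 * spread * length(x)^(-1/5)

# Compare with the built-in selector
all.equal(h_manual, bw.nrd0(x))
```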
bw.nrd is the more common variation given by Scott (1992), using factor 1.06.
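A minimal sketch of how the two rules of thumb relate. It assumes that both use the same spread measure and differ only in the leading constant, so that for data whose quartiles do not coincide the ratio of the two bandwidths is 1.06 / 0.9. The variable names are illustrative.

```r
# Sketch: compare the Scott (1.06) and Silverman (0.9) factors.
# Assumes the quartiles of x do not coincide.
x <- precip

ratio <- bw.nrd(x) / bw.nrd0(x)

# The ratio should match the ratio of the two constants
all.equal(ratio, 1.06 / 0.9)
```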
bw.ucv and bw.bcv implement unbiased and biased cross-validation respectively.
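The cross-validation selectors can be called directly, as in the following sketch. The search interval passed through lower and upper in the last call is an arbitrary illustrative choice, not a recommendation, and either selector may warn if the minimum falls at an end of the search range.

```r
# Sketch: cross-validation bandwidths for the built-in precip data.
x <- precip

h_ucv <- bw.ucv(x)  # unbiased cross-validation
h_bcv <- bw.bcv(x)  # biased cross-validation

# Same selector with a user-specified search interval
# (0.5 and 10 are arbitrary values chosen for illustration)
h_ucv_user <- bw.ucv(x, lower = 0.5, upper = 10)

c(ucv = h_ucv, bcv = h_bcv, ucv_user = h_ucv_user)
```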
bw.SJ implements the methods of Sheather & Jones (1991) to select the bandwidth using pilot estimation of derivatives.
The algorithm for method "ste" solves an equation (via uniroot) and, because of that, enlarges the interval c(lower, upper) when the boundaries were not user-specified and do not bracket the root.
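The two Sheather-Jones variants can be compared as in this sketch. The tol value in the last call is an arbitrary illustrative choice, and it is used here on the understanding that tol controls the root-finding tolerance for method "ste".

```r
# Sketch: the two bw.SJ methods on the built-in precip data.
x <- precip

h_ste <- bw.SJ(x, method = "ste")  # solves an equation via uniroot
h_dpi <- bw.SJ(x, method = "dpi")  # the alternative method

# Tighter tolerance for the root search (1e-4 is illustrative only)
h_ste_tight <- bw.SJ(x, method = "ste", tol = 1e-4)

c(ste = h_ste, dpi = h_dpi, ste_tight = h_ste_tight)
```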
The last three methods use all pairwise binned distances: they are of complexity O(n^2) up to n = nb/2 and O(n) thereafter. Because of the binning, the results differ slightly when x is translated or sign-flipped.
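The effect of binning can be inspected with a sketch like the one below, which shifts and sign-flips the data and reports the relative change in the selected bandwidth. Method "dpi" is used here only so that the root-finding tolerance of "ste" does not mix into the comparison. The shift of 100 is arbitrary, and the size of the differences depends on the data and on nb.

```r
# Sketch: sensitivity of a binned selector to translation and sign flip.
x <- precip

h0      <- bw.SJ(x, method = "dpi")
h_shift <- bw.SJ(x + 100, method = "dpi")  # translated data
h_flip  <- bw.SJ(-x, method = "dpi")       # sign-flipped data

c(original = h0, shifted = h_shift, flipped = h_flip)

# Relative differences from the original bandwidth
rel_diff <- abs(c(shifted = h_shift, flipped = h_flip) - h0) / h0
rel_diff
```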
Value
A bandwidth on a scale suitable for the bw argument of density.
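For instance, a selector's result can be passed to density as a number, or the selector can be named through a character string as in the Examples. The sketch below assumes density records the bandwidth it used in the bw component of its result. Variable names are illustrative.

```r
# Sketch: numeric bandwidth versus selector name in density().
x <- precip

h <- bw.SJ(x, method = "dpi")

d_num <- density(x, bw = h)         # bandwidth supplied as a number
d_chr <- density(x, bw = "SJ-dpi")  # bandwidth selected by name

# Both calls should end up using the same bandwidth
all.equal(d_num$bw, d_chr$bw)
```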
Note
Long vectors x are not supported, but neither are they by density and kernel density estimation; for more than a few thousand points a histogram would be preferred.
Author(s)
B. D. Ripley, taken from early versions of package MASS.
References
Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. New York: Wiley.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society series B, 53, 683–690. doi: 10.1111/j.2517-6161.1991.tb01857.x. https://www.jstor.org/stable/2345597.
Silverman, B. W. (1986). Density Estimation. London: Chapman and Hall.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Springer.
See Also
bandwidth.nrd, ucv, bcv and width.SJ in package MASS, which are all scaled to the width argument of density and so give answers four times as large.
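The factor of four can be checked for the normal reference rule with a short sketch. It assumes package MASS is installed and skips the comparison otherwise. Only bandwidth.nrd is compared here.

```r
# Sketch: MASS::bandwidth.nrd versus bw.nrd on the precip data.
x <- precip

if (requireNamespace("MASS", quietly = TRUE)) {
  h_stats <- bw.nrd(x)              # scaled for the bw argument
  h_mass  <- MASS::bandwidth.nrd(x) # scaled for the width argument

  # The MASS value should be four times the bw.nrd value
  print(all.equal(h_mass, 4 * h_stats))
}
```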
Examples
require(graphics)

plot(density(precip, n = 1000))
rug(precip)
lines(density(precip, bw = "nrd"), col = 2)
lines(density(precip, bw = "ucv"), col = 3)
lines(density(precip, bw = "bcv"), col = 4)
lines(density(precip, bw = "SJ-ste"), col = 5)
lines(density(precip, bw = "SJ-dpi"), col = 6)
legend(55, 0.035,
       legend = c("nrd0", "nrd", "ucv", "bcv", "SJ-ste", "SJ-dpi"),
       col = 1:6, lty = 1)
Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.