Abstract of Data-driven choice of the smoothing parametrization for multivariate kernel density estimators

José Enrique Chacón Durán

  • There are several levels of sophistication when specifying the bandwidth matrix $H$ to be used in a multivariate kernel density estimator in $\mathbb{R}^d$. The simplest one consists of taking $H$ to be a positive multiple of the identity matrix, $H = h^2 I$. This has the advantage of needing only one smoothing parameter $h$ to be specified, but also the serious disadvantage of applying the same degree of smoothing in each coordinate direction. A slightly more complicated parametrization comes from taking $H = \operatorname{diag}(h_1^2, \dots, h_d^2)$. Now $d$ smoothing parameters need to be chosen, but the gain is that a different smoothing level is allowed in each coordinate direction. Finally, the most general form for the bandwidth matrix arises from taking $H \in \mathcal{F}$, the class of symmetric, positive definite, $d \times d$ matrices. Whereas this parametrization is appropriate when the smoothing is intended for directions different from the coordinate axes, its complex form implies that $d(d+1)/2$ smoothing parameters must be specified for its use.

    The usual way of measuring whether a complex bandwidth matrix form offers a substantial gain over a simpler one is by means of the relative efficiency of the two smoothing parametrizations. If this efficiency is known, then it is possible to decide which parametrization to use according to a prespecified "efficiency threshold": if the relative efficiency of the simple parametrization is less than a fixed level (chosen by the researcher), then the kernel estimator with a more complex bandwidth matrix should be used.

    Unfortunately, the relative efficiency is usually unknown, since it depends on the unknown underlying density that we aim to estimate. In this paper we propose a method for estimating this relative efficiency and, therefore, a data-based method for choosing the smoothing parametrization to be used in a multivariate kernel density estimator. The procedure is fully illustrated by a simulation study and some real data examples.
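
As a rough illustration of the three bandwidth parametrizations and the threshold rule described above, the following Python sketch builds each form of H and applies the decision rule to a given efficiency value. All function names are hypothetical. Building the unconstrained H as L L^T from a lower-triangular factor is an assumption made here for convenience, not something the abstract prescribes. The sketch does not implement the paper's estimator of the relative efficiency: the efficiency is taken as an input.

```python
def scalar_bandwidth(h, d):
    """H = h^2 I: a single smoothing parameter h."""
    return [[h * h if i == j else 0.0 for j in range(d)] for i in range(d)]


def diagonal_bandwidth(hs):
    """H = diag(h_1^2, ..., h_d^2): d smoothing parameters."""
    d = len(hs)
    return [[hs[i] ** 2 if i == j else 0.0 for j in range(d)] for i in range(d)]


def full_bandwidth(lower):
    """H = L L^T for a lower-triangular L with positive diagonal.

    Assumption of this sketch: the factor L is one convenient way to
    guarantee a symmetric, positive definite H with d(d+1)/2 free entries.
    """
    d = len(lower)
    for i in range(d):
        if lower[i][i] <= 0:
            raise ValueError("diagonal entries of L must be positive")
    # Only entries with k <= min(i, j) contribute, since L is lower-triangular.
    return [
        [sum(lower[i][k] * lower[j][k] for k in range(min(i, j) + 1))
         for j in range(d)]
        for i in range(d)
    ]


def n_parameters(kind, d):
    """Number of smoothing parameters each parametrization requires."""
    return {"scalar": 1, "diagonal": d, "full": d * (d + 1) // 2}[kind]


def choose_parametrization(relative_efficiency_simple, threshold):
    """Threshold rule from the text: switch to the more complex bandwidth
    matrix when the simple parametrization's efficiency is below the level
    chosen by the researcher."""
    return "complex" if relative_efficiency_simple < threshold else "simple"
```

For instance, with d = 3 the unconstrained form needs `n_parameters("full", 3)` = 6 smoothing parameters against 1 for the scalar form. Under a hypothetical threshold of 0.9, an efficiency of 0.8 for the simple parametrization would lead to the more complex bandwidth matrix.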

