An Empirical Comparison of EM Initialization Methods and Model Choice Criteria for Mixtures of Skew-Normal Distributions

  • JOSÉ R. PEREIRA [1] ; LEYNE A. MARQUES [1] ; JOSÉ M. DA COSTA [1]
    1. [1] Universidade Federal do Amazonas, Brazil

  • Published in: Revista Colombiana de Estadística, ISSN-e 2389-8976, ISSN 0120-1751, Vol. 35, No. 3, 2012, pp. 457-478
  • Language: English
  • Parallel titles:
    • Una comparación empírica de algunos métodos de inicialización EM y criterios de selección de modelos para mezclas de distribuciones normales asimétricas
  • Abstract

      We investigate, via a simulation study, the performance of the EM algorithm for maximum likelihood estimation in finite mixtures of skew-normal distributions with component-specific parameters. The study considers the initialization method, the number of iterations needed to satisfy a fixed stopping rule, and the ability of some classical model choice criteria to estimate the correct number of mixture components. The results show that the algorithm produces quite reasonable estimates when the method of moments is used to obtain the starting points, and that combining it with the AIC, BIC, ICL or EDC criteria is a good alternative for estimating the number of components of the mixture. Exceptions occur in the estimation of the skewness parameters, notably when the sample size is relatively small, and in some classical problematic cases, such as when the mixture components are poorly separated.
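
      The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the Azzalini parameterization (location `xi`, scale `omega`, shape `lam`), moment matching applied to a single component's data, and the penalty `c_n = 0.2 * sqrt(n)` for EDC, which is one common choice and may differ from the one used in the article. ICL is omitted because it also needs the posterior membership probabilities. All function names are hypothetical.

```python
import math

def sn_pdf(x, xi, omega, lam):
    """Skew-normal density: (2/omega) * phi(z) * Phi(lam*z), z = (x - xi)/omega."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(lam * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * Phi

def mixture_loglik(xs, weights, params):
    """Log-likelihood of a finite mixture; params is a list of (xi, omega, lam)."""
    return sum(
        math.log(sum(w * sn_pdf(x, *p) for w, p in zip(weights, params)))
        for x in xs
    )

def moment_start(xs):
    """Moment-matching starting values (xi, omega, lam) for one component.

    Assumes xs is not constant (the sample variance must be positive).
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    skew = sum((x - mean) ** 3 for x in xs) / n / var ** 1.5
    # Skew-normal skewness is bounded (about 0.995 in absolute value),
    # so the sample value is clipped to keep the inversion well defined.
    skew = max(-0.99, min(0.99, skew))
    a = abs(skew) ** (2.0 / 3.0)
    b = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    delta = math.copysign(math.sqrt(math.pi / 2.0 * a / (a + b)), skew)
    omega = math.sqrt(var / (1.0 - 2.0 * delta ** 2 / math.pi))
    xi = mean - omega * delta * math.sqrt(2.0 / math.pi)
    lam = delta / math.sqrt(1.0 - delta ** 2)
    return xi, omega, lam

def criteria(loglik, g, n, c_n=None):
    """AIC, BIC and EDC for a g-component univariate skew-normal mixture."""
    # Free parameters: (xi, omega, lam) per component plus g - 1 weights.
    k = 4 * g - 1
    if c_n is None:
        c_n = 0.2 * math.sqrt(n)  # assumed EDC penalty, not taken from the article
    return {
        "AIC": -2.0 * loglik + 2.0 * k,
        "BIC": -2.0 * loglik + k * math.log(n),
        "EDC": -2.0 * loglik + k * c_n,
    }
```

      Under these conventions smaller criterion values are preferred, so one would fit the mixture for several values of `g` and keep the one that minimizes the chosen criterion.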

The article metadata were obtained from SciELO Colombia