
New insights into evaluation of regression models through a decomposition of the prediction errors: application to near-infrared spectral data

  • Authors: María Isabel Sánchez Rodríguez, Elena M. Sánchez López, José María Caridad y Ocerín, Alberto Marinas Aramendia, José María Marinas Rubio, Francisco José Urbano Navarro
  • Published in: Sort: Statistics and Operations Research Transactions, ISSN 1696-2281, Vol. 37, No. 1, 2013, pp. 57-78
  • Language: English
  • Abstract
    • This paper analyzes the performance of linear regression models taking into account usual criteria such as the number of principal components or latent factors, the goodness of fit or the predictive capability. Other comparison criteria, more common in an economic context, are also considered: the degree of multicollinearity and a decomposition of the mean squared error of the prediction which determines the nature, systematic or random, of the prediction errors. The applications use real data of extra-virgin oil obtained by near-infrared spectroscopy. The high dimensionality of the data is reduced by applying principal component analysis and partial least squares analysis. A possible improvement of these methods by using cluster analysis or the information of the relative maxima of the spectrum is investigated. Finally, obtained results are generalized via cross-validation and bootstrapping.
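
      The abstract does not state which decomposition of the mean squared error of prediction is used. As an illustration only, the sketch below assumes the Theil-style split that is common in econometrics: a bias term and a variance term (usually read as systematic error) plus a covariance term (usually read as random error). The function name `theil_mse_decomposition` and the dictionary keys are hypothetical and do not come from the paper.

```python
import numpy as np


def theil_mse_decomposition(actual, predicted):
    """Split the MSE of prediction into bias, variance and covariance terms.

    Assumed (Theil-style) identity, exact with population moments (ddof=0):
        MSE = (mean_p - mean_a)^2 + (s_p - s_a)^2 + 2 * (s_p * s_a - cov_ap)
    This is an illustrative choice, not necessarily the paper's formula.
    """
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    if a.shape != p.shape or a.size == 0:
        raise ValueError("actual and predicted must be non-empty and equally shaped")

    mse = np.mean((p - a) ** 2)
    s_a, s_p = a.std(), p.std()  # population standard deviations
    cov_ap = np.mean((a - a.mean()) * (p - p.mean()))

    bias = (p.mean() - a.mean()) ** 2       # systematic: shifted mean
    variance = (s_p - s_a) ** 2             # systematic: wrong spread
    covariance = 2.0 * (s_p * s_a - cov_ap)  # random: imperfect correlation

    # Shares of the MSE; all zero when the prediction is perfect.
    shares = [c / mse if mse > 0 else 0.0 for c in (bias, variance, covariance)]
    return {
        "mse": mse,
        "bias": bias,
        "variance": variance,
        "covariance": covariance,
        "bias_share": shares[0],
        "variance_share": shares[1],
        "covariance_share": shares[2],
    }


if __name__ == "__main__":
    # Made-up numbers: every prediction is offset by the same constant,
    # so the error is entirely attributed to the bias term.
    result = theil_mse_decomposition([1, 2, 3, 4], [1.5, 2.5, 3.5, 4.5])
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

      Under this assumed decomposition, a large bias or variance share suggests that the prediction errors are mostly systematic, while a dominant covariance share suggests that they are mostly random.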

  • References
    • Andersen, C. M. and Bro, R. (2010). Variable selection in regression-a tutorial. Journal of Chemometrics, 24, 728–737.
    • Andersson, M. (2009). A comparison of nine PLS1 algorithms. Journal of Chemometrics, 23, 518–529.
    • Baeten, V., Aparicio, R., Marigheto, N. and Wilson, R. (2003). Manual del aceite de oliva. AMV ediciones, Mundi-Prensa.
    • Barker, M. and Rayens, W. (2003). Partial least squares for discrimination. Journal of Chemometrics, 17, 166–173.
    • Berrueta, L. A., Alonso-Salces, R. M. and Héberger, K. (2007). Supervised pattern recognition in food analysis. Journal of Chromatography...
    • Burnham, K. P. and Anderson, D. R. (2004). Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods &...
    • Climaco-Pinto, R., Barros, A. S., Locquet, N., Schmidtke, L. and Rutledge, D. N. (2009). Improving the detection of significant factors using...
    • Dupuy, N., Duponchel, L., Huvenne, J. P., Sombret, B. and Legrand, P. (1996). Classification of edible fats and oils by principal component...
    • Essi, I. D., Chukuigwe, E. C. and Ojekudo, N. A. (2011). On multicollinearity in nonlinear econometric models with mis-specified error terms...
    • Frank, I. E. and Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35(2), 109–135.
    • Gowen, A. A., Downey, G., Esquerre, C. and O’Donnell, C. P. (2010). Preventing over-fitting in PLS calibration models of near-infrared (NIR)...
    • Greenberg, E. and Parks, R. P. (1997). A predictive approach to model selection and multicollinearity. Journal of Applied Econometrics, 12,...
    • Guillén, M. D. and Ruiz, A. (2003). Rapid simultaneous determination by proton NMR of unsaturation and composition of acyl groups in vegetable...
    • Guldberg, A., Kaas, E., Déqué, M., Yang, S. and Vester Thorsen, S. (2005). Reduction of systematic errors by empirical model correction:...
    • Gurdeniz, G. and Ozen, B. (2009). Detection of adulteration of extra-virgin oil by chemometric analysis of mid-infrared spectral data. Food...
    • Kasemsumran, S., Kang, N., Christy, A. and Ozaki, Y. (2005). Partial least squares processing of near-infrared spectra for discrimination and...
    • Li, B., Morris, J. and Martin, E. B. (2002). Model selection for partial least squares regression. Chemometrics and Intelligent Laboratory...
    • López-Negrete de la Fuente, R., García-Muñoz, S. and Biegler, L. T. (2010). An efficient nonlinear programming strategy for PCA models...
    • Mark, H. (1986). Comparative study of calibration methods for near-infrared reflectance analysis using a nested experimental design. Analytical...
    • Mark, H. and Workman, J. (1986). Effect of repack on calibrations produced for near-infrared reflectance analysis. Analytical Chemistry, 58,...
    • Mevik, B. H. and Cederkvist, H. R. (2004). Mean squared error of prediction (MSEP) estimates for principal component regression (PCR) and...
    • Mynbaev, K. T. (2011). Regressions with asymptotically collinear regressors. Econometrics Journal, 14, 304–320.
    • Nelson, P. R. C., MacGregor, J. F. and Taylor, P. A. (2006). The impact of missing measurements on PCA and PLS prediction and monitoring applications....
    • Öztürk, B., Yalçin, A. and Özdemir, D. (2010). Determination of olive oil adulteration with vegetable oils by near infrared spectroscopy...
    • Reinaldo, F. T., Martins, J. P. A. and Ferreira, M. M. C. (2008). Sorting variables using informative vectors as a strategy for feature selection...
    • Spanos, A. and McGuirk, A. (2002). The problem of near-multicollinearity revisited: erratic vs systematic volatility. Journal of Econometrics,...
    • Vasquez, V. R. and Whiting, W. B. (2006). Accounting for both random errors and systematic errors in uncertainty propagation analysis of computer...
    • Yamagata, T. (2006). The small sample performance of the Wald test in the sample selection model under the multicollinearity problem. Economics...
    • Yamamoto, H., Yamaji, H., Abe, Y., Harada, K., Waluyo, D., Fukusaki, E., Kondo, A., Ohno, H. and Fukuda, H. (2009). Dimensionality reduction...
    • Zwanenburg, G., Hoefsloot, H. C. J., Westerhuis, J. A., Jansen, J. J. and Smilde, A. K. (2011). ANOVA-principal component analysis and ANOVA-simultaneous...
