
A comparative clustering model that considers false positives and false negatives in some socioeconomic applications

  • D.E. Urueta-Hinojosa [1] ; P. Lara-Velázquez [1] ; M.A. Gutiérrez-Andrade [1] ; S.G. De-los-Cobos-Silva [1] ; E.A. Rincón-García [1] ; R.A. Mora-Gutiérrez [1]
    1. [1] Universidad Autónoma Metropolitana-Iztapalapa, Ciudad de México
  • Source: Fuzzy economic review, ISSN 1136-0593, Vol. 25, No. 2, 2020
  • Language: English
  • DOI: 10.25102/fer.2020.02.03
  • Abstract
    • Unsupervised learning enables classifier models to be built quickly and inexpensively compared with supervised approaches because the labeling task is eliminated. However, the quality of a classifier is usually assessed by accuracy alone, which treats all incorrect predictions as if they had the same importance, when in reality the consequences of diagnosing a healthy patient as sick (Type I error) and of diagnosing a sick patient as healthy (Type II error) are different. For this reason, depending on the application, it is preferable to avoid a specific type of error, even if accuracy decreases. The present work presents a model based on clustering methods that take Type I and Type II errors into account to solve medical and business instances using three techniques: k-means, Spectral, and Gauss. Based on representative and well-studied datasets for socioeconomic applications, the results show that the accuracy of a model is not a conclusive parameter and that, to make a decision, it is necessary to focus on the errors in the confusion matrix, which take on a different meaning and significance in each specific instance. The results and analysis are discussed to determine the best model for each case study. Finally, conclusions and limitations are analyzed.
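
The abstract's central claim, that accuracy alone can hide which kind of error a model makes, can be illustrated with a minimal sketch. The labels below are invented for illustration and are not the authors' data or method; the function names `confusion_counts` and `summarize` are hypothetical, and the positive class (`1`) is assumed to mean "sick".

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count the four cells of a binary confusion matrix."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1
        elif pred == positive:
            counts["fp"] += 1  # healthy predicted as sick: Type I error
        elif truth == positive:
            counts["fn"] += 1  # sick predicted as healthy: Type II error
        else:
            counts["tn"] += 1
    return counts


def summarize(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy together with Type I and Type II error rates."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    negatives = c["fp"] + c["tn"]
    positives = c["fn"] + c["tp"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "type_i_rate": c["fp"] / negatives if negatives else 0.0,
        "type_ii_rate": c["fn"] / positives if positives else 0.0,
    }


# Invented example: 5 sick (1) and 5 healthy (0) patients.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
model_a = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]  # two false positives, no false negatives
model_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # two false negatives, no false positives

summary_a = summarize(y_true, model_a)
summary_b = summarize(y_true, model_b)
```

Both hypothetical models reach the same accuracy, yet one makes only Type I errors and the other only Type II errors, so a choice between them depends on which error is costlier in the application. With clustering output, cluster identifiers would first have to be mapped to class labels before such a comparison is made.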

