Data Stream Classification based on an Associative Classifier

  • Authors: Karen Pamela López-Medina, Abril Valeria Uriarte-Arcia, Cornelio Yáñez Márquez
  • Source: Computación y Sistemas (CyS), ISSN 1405-5546, ISSN-e 2007-9737, Vol. 28, No. 2, 2024, pp. 387-400
  • Language: English
  • DOI: 10.13053/cys-28-2-4737
  • Summary
    • Abstract: Currently, the diversity of sources generating data in a massive online manner causes data streams to become part of many real-world applications. Learning from a data stream is a very challenging task due to the non-stationary nature of this type of data. Characteristics such as infinite length, concept drift, concept evolution and recurrent concepts are the most common problems that need to be addressed by data stream learning algorithms. In this work, an algorithm for data stream classification based on an associative classifier is presented. This proposal combines a clustering algorithm and the Naïve Associative Classifier for Online Data (NACOD) to address this problem. A set of micro-clusters (MCs), a data structure that summarizes the information of the current data, is used instead of storing the whole data. The MCs are continually updated with the arriving data, either to create new MCs or to update existing ones. The added MCs help to deal with concept drift. To assess the performance of the proposed model, experiments were carried out on 3 data sets commonly used to evaluate data stream classification algorithms: KDD Cup 1999, Forest Cover Type and Statlog (Shuttle). Our model achieved higher accuracies than those achieved with algorithms such as the data stream versions of Naïve Bayes and Hoeffding Tree. The average accuracies achieved were 100 % for KDD Cup 1999, 99.01 % for Statlog (Shuttle) and 70.44 % for Forest Cover Type.
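
The abstract describes micro-clusters that are either updated or newly created as data arrive, but it does not give their internal structure. The following is only a minimal sketch of that update-or-create idea, under assumptions that do not come from the abstract: each micro-cluster keeps a count, a per-feature sum and a per-feature sum of squares (in the style of the cluster feature vectors of Aggarwal et al., 2003), points are compared with Euclidean distance, and a hypothetical fixed threshold `max_distance` decides between absorbing a point and opening a new micro-cluster. The names `MicroCluster` and `update_summary` are illustrative, and the NACOD classification step is not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Running summary of the points absorbed so far (assumed layout)."""
    label: str
    n: int
    linear_sum: list   # per-feature sum of the absorbed points
    squared_sum: list  # per-feature sum of squares

    @classmethod
    def from_point(cls, x, label):
        return cls(label, 1, list(x), [v * v for v in x])

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, x):
        # Incremental update: the raw point itself is never stored.
        self.n += 1
        for i, v in enumerate(x):
            self.linear_sum[i] += v
            self.squared_sum[i] += v * v


def update_summary(clusters, x, label, max_distance):
    """Absorb x into the nearest same-label micro-cluster, or create a new one."""
    best, best_d = None, math.inf
    for mc in clusters:
        if mc.label != label:
            continue
        d = math.dist(mc.centroid(), x)
        if d < best_d:
            best, best_d = mc, d
    if best is not None and best_d <= max_distance:
        best.absorb(x)
        return best
    new_mc = MicroCluster.from_point(x, label)  # new region or new class
    clusters.append(new_mc)
    return new_mc


# Toy stream of (features, label) pairs; the values are made up for illustration.
clusters = []
stream = [((0.0, 0.0), "a"), ((0.2, 0.0), "a"), ((5.0, 5.0), "a"), ((0.1, 0.1), "b")]
for x, y in stream:
    update_summary(clusters, x, y, max_distance=1.0)
```

In this toy run the second point is absorbed by the first micro-cluster, while the distant third point and the first point of class "b" each open a new one. Opening new micro-clusters for data that no longer fit the existing summaries is the mechanism the abstract credits with handling concept drift.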

  • References
    • Aggarwal, C. C., Yu, P. S., Han, J., Wang, J. (2003). A framework for clustering evolving data streams. VLDB Conference. 2003.
    • Bianchini, D., De-Antonellis, V., Garda, M. (2023). A big data exploration approach to exploit in-vehicle data for smart road maintenance....
    • Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B. (2010). MOA: Massive online analysis. The Journal of Machine Learning Research. 11. 1601
    • Brzezinski, D., Stefanowski, J. (2017). Prequential AUC: Properties of the area under the ROC curve for data streams with concept drift. Knowledge...
    • Chen, D., Yang, Q., Liu, J., Zeng, Z. (2020). Selective prototype-based learning on concept-drifting data streams. Information Sciences. 516....
    • Chen, F. (2023). Anomaly recognition method of network media large data stream based on feature learning. The International Conference on...
    • Dawid, A. P. (1984). Present position and potential developments: Some personal views statistical theory the prequential approach. Journal...
    • de-Faria, E. R., Ponce-de-Leon-Ferreira-Carvalho, A. C., Gama, J. (2016). MINAS: Multiclass learning algorithm for novelty detection in data...
    • Domingos, P., Hulten, G. (2000). Mining high-speed data streams. sixth ACM SIGKDD international conference on Knowledge discovery and data...
    • Duda, R. O., Hart, P. E., Stork, D. G. (2000). Nonparametric Techniques. Pattern classification. 177
    • Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A. (2014). A survey on concept drift adaptation. ACM computing surveys (CSUR)....
    • Halder, B., Hasan, K. A., Amagasa, T., Ahmed, M. M. (2023). Autonomic active learning strategy using cluster-based ensemble classifier for concept...
    • Hewage, U. H. W. A., Sinha, R., Naeem, M. A. (2023). Privacy-preserving data (stream) mining techniques and their impact on data mining accuracy:...
    • Islam, M. Z., Lin, Y., Vokkarane, V. M., Yu, N. (2023). Robust learning-based real-time load estimation using sparsely deployed smart meters...
    • Jasiński, M., Woźniak, M. (2022). Employing convolutional neural networks for continual learning. International Conference on Artificial...
    • Korycki, Ł., Krawczyk, B. (2023). Adversarial concept drift detection under poisoning attacks for robust data stream mining. Machine Learning....
    • MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. 1. Fifth Berkeley Symposium on Mathematical...
    • Mahajan, E., Mahajan, H., Kumar, S. (2024). EnsMulHateCyb: Multilingual hate speech and cyberbully detection in online social media. Expert...
    • Mehmood, H., Khalid, A., Kostakos, P., Gilman, E., Pirttikangas, S. (2024). A novel edge architecture and solution for detecting concept drift...
    • Mohanapriya, K., Sangavi, N., Kanimozhi, A., Kiruthika, V. R., Dhivya, P. (2023). Optimized feed forward neural network for fake and clone account...
    • Pala, O. (2024). Assessment of the social progress on European Union by logarithmic decomposition of criteria importance. Expert Systems...
    • Souza, V. M., Silva, D. F., Batista, G. E., Gama, J. (2015). Classification of evolving data streams with infinitely delayed labels. 14th International...
    • Suryawanshi, S., Goswami, A., Patil, P. D., Mishra, V. (2022). Adaptive windowing based recurrent neural network for drift adaption in non-stationary...
    • Tao, Z., Huang, S., Wang, G. (2023). Prototypes sampling mechanism for class incremental learning. IEEE Access.
    • Vendramin, L., Campello, R. J., Hruschka, E. R. (2010). Relative clustering validity criteria: A comparative overview. Statistical analysis...
    • Villuendas-Rey, Y., Hernandez-Castaño, J. A., Camacho-Nieto, O., Yañez-Marquez, C., Lopez-Yañez, I. (2019). NACOD: A naïve associative classifier...
    • Villuendas-Rey, Y., Rey-Benguría, C. F., Ferreira-Santiago, A., Camacho-Nieto, O., Yáñez-Márquez, C. (2017). The naïve associative classifier...
    • Wu, Y. M., Chen, L. S., Li, S. B., Chen, J. D. (2021). An adaptive algorithm for dealing with data stream evolution and singularity. Information...
    • Yao, B., Ling, G., Liu, F., Ge, M. F. (2023). Multi-source variational mode transfer learning for enhanced PM2.5 concentration forecasting at...
The article metadata was obtained from SciELO México.