A framework for dissimilarity-based partitioning clustering of categorical time series

  • Authors: Manuel García-Magariños, José Vilar
  • Source: Data mining and knowledge discovery, ISSN 1384-5810, Vol. 29, No. 2, 2015, pp. 466-502
  • Language: English
  • Full text not available
  • Abstract
    • A new framework for clustering categorical time series is proposed. In our approach, a dissimilarity-based partitioning method is considered. We suggest measuring the dissimilarity between two categorical time series by assessing both closeness of raw categorical values and proximity between dynamic behaviours. For the latter, a particular index computing the temporal correlation for categorical-valued sequences is introduced. The dissimilarity measure is then used to perform clustering by considering a modified version of the $k$-modes algorithm specifically designed to provide a better characterization of the clusters. Furthermore, the problem of determining the number of clusters in this framework is analyzed by comparing a range of procedures, including a prediction-based resampling method properly adjusted to deal with our dissimilarity. Several graphical devices to interpret and visualize the temporal pattern of each cluster are also provided. Performance of this clustering methodology is studied on different simulated scenarios and its effectiveness is concluded by comparison with alternative approaches. Real data use is illustrated by analyzing navigation patterns of users visiting a specific news web site.

