Research on incremental clustering algorithm for big data

Xiaoqing Yang

Open Access

Applied Mathematics and Nonlinear Sciences

Volume 8 (2023): Issue 2 (July 2023)

Page range: 169 - 180

Received: May 30, 2022

Accepted: Jun 16, 2022

DOI: https://doi.org/10.2478/amns.2021.2.00256

Keywords
big data, incremental, clustering algorithm, K-means clustering algorithm, Kalman filter algorithm

This work is licensed under the Creative Commons Attribution 4.0 International License.

As data sets keep growing, clustering, a key step in data mining, has important practical significance. Current clustering algorithms are time-consuming and produce high clustering errors when dealing with massive, dynamic big data. To address these problems, an incremental clustering algorithm is proposed with big data as the research object. An exploration of the attribute characteristics of big data summarises four of them: scale, diversity, high speed and value. For large-scale data streams that have multiple attributes and are acquired one point at a time, the method for setting the category centre points of the K-means clustering algorithm is optimised, the K-means clustering algorithm is combined with the Kalman filter algorithm, and the Mahalanobis distance is used in place of the original measure of distance between data point pairs, yielding an incremental clustering algorithm suitable for big data. Five data sets are selected for an example analysis to verify the algorithm. The results indicate that the proposed algorithm has clear advantages in incremental clustering of big data, together with efficient and stable computing performance, and that it meets the expected design requirements and goals.
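The abstract does not state the update equations, so the following is only a minimal sketch of one way the three ingredients named above (K-means centre points, a Kalman filter and the Mahalanobis distance) might fit together on a stream of points. It is not the paper's method. The names `KalmanCentre` and `incremental_kmeans`, the scalar variances `p0`, `r` and `q`, the seeding of centres from the first `k` points, and the fixed inverse covariance matrix `inv_cov` are all assumptions made for illustration.

```python
import math


def mahalanobis(x, c, inv_cov):
    """Mahalanobis distance between x and c for a given inverse covariance."""
    d = [xi - ci for xi, ci in zip(x, c)]
    n = len(d)
    # Quadratic form d^T * inv_cov * d
    quad = sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(quad, 0.0))


class KalmanCentre:
    """A cluster centre tracked as a static state with a scalar Kalman gain.

    Assumption: one shared scalar variance per centre, not a full matrix.
    """

    def __init__(self, point, p0=1.0, r=1.0, q=0.0):
        self.c = list(point)  # current estimate of the centre
        self.p = p0           # variance of the estimate (assumed scalar)
        self.r = r            # measurement noise variance (assumed)
        self.q = q            # process noise variance (assumed)

    def update(self, x):
        p_pred = self.p + self.q        # predict step for a static state
        k = p_pred / (p_pred + self.r)  # Kalman gain
        self.c = [ci + k * (xi - ci) for ci, xi in zip(self.c, x)]
        self.p = (1.0 - k) * p_pred     # posterior variance


def incremental_kmeans(stream, k, inv_cov):
    """Process points one at a time; the first k points seed the centres."""
    centres, labels = [], []
    for x in stream:
        if len(centres) < k:
            centres.append(KalmanCentre(x))
            labels.append(len(centres) - 1)
            continue
        dists = [mahalanobis(x, ctr.c, inv_cov) for ctr in centres]
        best = dists.index(min(dists))
        centres[best].update(x)  # only the winning centre moves
        labels.append(best)
    return centres, labels


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.0), (5.2, 5.0), (0.1, 0.1)]
    centres, labels = incremental_kmeans(points, k=2, inv_cov=identity)
    print(labels)
    print([ctr.c for ctr in centres])
```

With `q = 0` and `p0 = r`, the Kalman gain shrinks as 1/2, 1/3, and so on, so each centre reduces to the running mean of the points assigned to it. A positive `q` keeps the gain from vanishing, which lets a centre follow a drifting stream. Passing the identity matrix as `inv_cov` makes the Mahalanobis distance coincide with the Euclidean distance.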

eISSN: 2444-8656
Language: English

Publication timeframe: Volume Open
Journal Subjects: Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics

Published Online: Dec 23, 2022

© 2023 Xiaoqing Yang, published by Sciendo
