Cluster Analysis on Dengue Incidence and Weather Data Using K-Medoids and Fuzzy C-Means Clustering Algorithms (Case Study: Spread of Dengue in the DKI Jakarta Province)


  • Cindy, Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia
  • Cynthia, Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia
  • Valentino Vito, Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia
  • Devvi Sarwinda, Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia
  • Bevina Desjwiandra Handari, Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia
  • Gatot Fatwanto Hertono, Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia



Dengue, Dynamic Time Warping distance, Fuzzy C-Means Clustering, K-Medoids Clustering, time-series clustering


In Indonesia, Dengue incidence has tended to increase from year to year, although it has fluctuated in recent years. The potential for Dengue outbreaks in DKI Jakarta, the capital city, deserves serious attention. Weather factors are suspected of being associated with the incidence of Dengue in Indonesia. This research used weather and Dengue incidence data for five regions of DKI Jakarta, Indonesia, from December 30, 2008, to January 2, 2017. The study applied K-Medoids and Fuzzy C-Means Clustering to both time-series and non-time-series data. The clustering results for the non-time-series data showed a positive correlation between the number of Dengue incidents and both average relative humidity and amount of rainfall, whereas Dengue incidence and average temperature were negatively correlated. Moreover, the clustering results for the time-series data showed that, among the weather variables, rainfall patterns most closely resembled those of Dengue incidence. Therefore, rainfall can be used to estimate Dengue incidence. Both results suggest that the government could use weather data to predict possible spikes in Dengue incidence, especially at the onset of the rainy season, and to alert the public to the greater probability of a Dengue outbreak.

