Healthcare Data Mining: Predicting Hospital Length of Stay of Dengue Patients

Iwan Inrawan Wiratmadja, Siti Yaumi Salamah, Rajesri Govindaraju


Dengue is regarded as the most important mosquito-borne viral disease. In recent years, dengue has emerged as a public health burden in Southeast Asia and other tropical countries. When dengue re-emerges as an epidemic, hospitals must handle fluctuations in patient flow while maintaining their performance. This research applied a data mining technique to build a model that predicts in-patient hospital length of stay at the time of admission, which can support effective decision-making and may lead to better clinical and resource management in hospitals. Using a decision tree classifier built with the C4.5 algorithm, an accuracy of 71.57% and an area under the receiver operating characteristic (ROC) curve of 0.761 were obtained. The decision tree showed that only 7 of the 21 input attributes affect the length of stay prediction for dengue patients. The attribute with the highest impact was monocytes, followed by diastolic blood pressure, hematocrit, leucocytes, systolic blood pressure, comorbidity score, and lymphocytes. A prototype prediction system based on the resulting model was also developed.
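As a rough illustration of how a C4.5-style tree ranks attributes such as monocytes, the sketch below computes the gain ratio, the split criterion C4.5 is known for, for a binary split on one numeric attribute. The function names, the "short"/"long" class labels, and the toy values are hypothetical and are not taken from the study's data; the full algorithm's handling of missing values, pruning, and its additional candidate filtering is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` vs. `value > threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    info_gain = entropy(labels) - (weights[0] * entropy(left)
                                   + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Return the observed value that maximises gain ratio, or None if no split exists."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split the data
    if not candidates:
        return None
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


# Hypothetical toy data: an invented lab value and an invented stay category.
toy_values = [2, 3, 4, 5, 8, 9, 10, 12]
toy_labels = ["short"] * 4 + ["long"] * 4
threshold = best_threshold(toy_values, toy_labels)
```

An attribute whose best split yields a higher gain ratio is placed closer to the root of the tree, which is one way to read the statement that some attributes have more impact on the prediction than others.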


data mining; decision tree; dengue; hospital; length of stay; prediction






