An Intelligent System for Predicting Breast Cancer (ISPBC) using a Novel Feature Selection Technique
DOI:
https://doi.org/10.5614/itbj.ict.res.appl.2025.19.2.2
Keywords:
breast cancer, enriched feature set, heuristic search techniques, intelligent system, random forest, stochastic hill climbing
Abstract
Breast cancer (BC) is becoming a global epidemic that largely affects women, and the number of cases keeps climbing steadily. Early detection systems that alert patients to the disease are therefore essential: they allow treatment of this life-threatening illness to start sooner, so that patients may be cured or live longer. To this end, an intelligent expert system named Intelligent System for Predicting Breast Cancer (ISPBC) was developed in this study. The proposed system uses a novel feature selection technique, Enriched Feature Set (EFS), which combines heuristic search techniques with stochastic hill climbing to identify the most significant features. Decision Tree and Random Forest classifiers are then used for diagnosis, distinguishing malignant from benign cases. The model's performance was evaluated with tenfold cross-validation using accuracy, precision, and recall. To assess its efficacy, ISPBC was compared with the base classifiers and with models published in the literature. ISPBC attained a maximum accuracy of 96.09%.
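The abstract does not spell out how EFS works internally, so the following is only a minimal sketch of the general idea of stochastic hill climbing over feature subsets, not the authors' EFS procedure. It uses a simple first-choice variant: one randomly chosen feature is flipped in or out of the subset, and the change is kept only if the score improves. The function name `stochastic_hill_climb`, the `score` callback, and the toy scoring function are all hypothetical and introduced for illustration; in a setting like the one described above, `score` could plausibly return the tenfold cross-validated accuracy of a Decision Tree or Random Forest trained on the selected columns.

```python
import random
from typing import Callable, List, Sequence


def stochastic_hill_climb(
    n_features: int,
    score: Callable[[Sequence[bool]], float],
    n_iter: int = 200,
    seed: int = 0,
) -> List[bool]:
    """Search feature subsets by flipping one randomly chosen feature at a time.

    `score` is a hypothetical callback that rates a boolean feature mask;
    higher is better.
    """
    rng = random.Random(seed)

    # Start from a random, non-empty subset of features.
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    if not any(mask):
        mask[rng.randrange(n_features)] = True
    best = score(mask)

    for _ in range(n_iter):
        # Draw a random neighbour: the current subset with one feature toggled.
        i = rng.randrange(n_features)
        candidate = mask.copy()
        candidate[i] = not candidate[i]
        if not any(candidate):
            continue  # an empty feature set cannot be evaluated

        # Keep the neighbour only if it strictly improves the score.
        s = score(candidate)
        if s > best:
            mask, best = candidate, s
    return mask


# Toy stand-in for a classifier score (an assumption, not real data):
# features 0, 2 and 5 are "informative", every other feature adds a small penalty.
INFORMATIVE = {0, 2, 5}


def toy_score(mask: Sequence[bool]) -> float:
    hits = sum(1 for i, on in enumerate(mask) if on and i in INFORMATIVE)
    noise = sum(1 for i, on in enumerate(mask) if on and i not in INFORMATIVE)
    return hits - 0.1 * noise


selected = stochastic_hill_climb(n_features=8, score=toy_score)
print([i for i, on in enumerate(selected) if on])
```

Passing the evaluation in as a callback keeps the search independent of any particular classifier, so the same loop could wrap whichever base model and validation scheme are being compared.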