Foundations of Domain-specific Large Language Models for Islamic Studies: A Comprehensive Review
DOI: https://doi.org/10.5614/itbj.ict.res.appl.2025.19.1.4

Keywords: bias mitigation, ethical AI, fiqh, Islamic studies, large language models, natural language processing, transformer architecture

Abstract
Large language models (LLMs) have undergone rapid evolution and are highly effective in tasks such as text generation, question answering, and context-driven analysis. However, the unique requirements of Islamic studies, where textual authenticity, diverse jurisprudential interpretations, and deep semantic nuances are critical, present challenges for general LLMs. This article reviews the evolution of neural language models by comparing the historical progression of general LLMs with emerging Islamic-specific LLMs. We discuss the technical foundations of modern Transformer architectures and examine how recent advancements, such as GPT-4, DeepSeek, and Mistral, have expanded LLM capabilities. The paper also highlights the limitations of standard evaluation metrics like perplexity and BLEU in capturing doctrinal, ethical, and interpretative accuracy. To address these gaps, we propose specialized evaluation metrics to assess doctrinal correctness, internal consistency, and overall reliability. Finally, we outline a research roadmap aimed at developing robust, ethically aligned, and jurisprudentially precise Islamic LLMs.
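The abstract's point about surface-level metrics can be made concrete with a minimal sketch of BLEU-1 (modified unigram precision with a brevity penalty, after Papineni et al., 2002). The example sentences below are hypothetical illustrations, not quotations from the paper: a candidate answer that reverses a jurisprudential ruling by swapping a single word still scores nearly as high as a faithful paraphrase, which is precisely the gap the proposed doctrinal-correctness metrics are meant to close.

```python
import math
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """BLEU-1: clipped unigram precision times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    ref_counts = Counter(ref)
    # Clip each candidate word's count by its count in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

reference = "fasting is obligatory during ramadan for every adult muslim"
faithful  = "fasting during ramadan is obligatory for every adult muslim"
divergent = "fasting is optional during ramadan for every adult muslim"

print(round(bleu1(faithful, reference), 2))   # same words, reordered
print(round(bleu1(divergent, reference), 2))  # ruling reversed, one word changed
```

The doctrinally divergent answer loses only one matched unigram out of nine, so BLEU cannot distinguish a stylistic variant from a substantive error in a fiqh ruling.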