Individual Expert Selection and Ranking of Scientific Articles Using Document Length

Fadly Akbar Saputra, Taufik Djatna, Laksana Tri Handoko


Individual expert selection and ranking is a challenging research topic that has received a lot of attention in recent years because of its importance for referencing experts in particular domains and for research fund allocation and management. In this work, scientific articles were used as the most common source for ranking expertise in particular domains. Previous studies only considered title and abstract content using language modeling. This study used the whole content of scientific documents obtained from Aminer citation data. A modified weighted language model (MWLM) is proposed that combines document length and number of citations as the prior document probability to improve precision. In addition, the author’s dominance in a single document is computed using the Learning-to-Rank (L2R) method. Evaluation using p@n, MAP, MRR, r-prec, and bpref showed an improvement in precision. MWLM improved on the weighted language model (WLM) in p@n (4%), MAP (22.5%), and bpref (1.7%). MWLM also improved on a model that used author dominance in MAP (4.3%), r-prec (8.2%), and bpref (2.1%).
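For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal Python sketch of their standard textbook definitions for a single query: p@n, average precision (averaged over queries to give MAP), reciprocal rank (averaged to give MRR), and r-prec. It is not the authors' evaluation code; the function names and the toy candidate identifiers are assumptions made for illustration, and bpref is left out for brevity.

```python
from typing import Sequence, Set


def precision_at_n(ranked: Sequence[str], relevant: Set[str], n: int) -> float:
    """Fraction of the top-n ranked candidates that are relevant experts."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for c in ranked[:n] if c in relevant) / n


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant expert appears.

    Averaging this value over all queries gives MAP.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant expert; averaging over queries gives MRR."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            return 1.0 / rank
    return 0.0


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant experts."""
    r = len(relevant)
    return precision_at_n(ranked, relevant, r) if r else 0.0


# Hypothetical toy ranking: four candidate experts, two of them relevant.
ranked_experts = ["a", "b", "c", "d"]
relevant_experts = {"a", "c"}

p_at_2 = precision_at_n(ranked_experts, relevant_experts, 2)
ap = average_precision(ranked_experts, relevant_experts)
rr = reciprocal_rank(ranked_experts, relevant_experts)
r_prec = r_precision(ranked_experts, relevant_experts)
```

In the toy ranking, relevant experts appear at ranks 1 and 3, so average precision is the mean of 1/1 and 2/3. A percentage gain such as those reported in the abstract would then be computed by comparing these per-model scores averaged over all queries.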


document length; individual expert; language model; scientific article; selection and ranking







