Question Classification Using Extreme Learning Machine on Semantic Features

H. Hardy, Yu-N Cheah


In statistical machine learning approaches to question classification, methods built on a lexical feature space demand substantial computing power and complex data structures, because the large number of unique words makes the feature space high-dimensional. Choosing semantic features instead can significantly reduce this dimensionality. This article describes the use of the Extreme Learning Machine (ELM) for question classification based on semantic features, with the aim of improving both training and testing speeds over the benchmark Support Vector Machine (SVM) classifier. Improvements have also been made to the head word extraction and word sense disambiguation processes. Together, these changes yield higher accuracy than the benchmark for the coarse classes (an increase of 0.2%). For the fine classes, accuracy decreases by 1.0%, but this is offset by a significant increase in speed (92.1% on average).
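To make the ELM idea concrete, here is a minimal sketch rather than the authors' implementation. It shows the defining property of an ELM: the hidden-layer weights are drawn at random and never trained, so fitting reduces to a single linear solve for the output weights. The toy feature vectors, the function names, the hidden-layer size, and the small ridge term are all assumptions made for illustration; the standard ELM formulation uses the Moore-Penrose pseudoinverse of the hidden-layer output matrix instead of a ridge-regularized solve.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(X, W, b):
    """Map every input row to its hidden-layer activations."""
    return [[sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + bj)
             for w, bj in zip(W, b)] for x in X]


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row_a[:] + row_b[:] for row_a, row_b in zip(A, B)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [v - f * pv for v, pv in zip(M[r], M[col])]
    return [row[n:] for row in M]


def train_elm(X, y, n_classes, n_hidden=20, ridge=1e-3, seed=0):
    """Fit output weights only; hidden weights stay random (assumed settings)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(n_features)]
         for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # One-hot target matrix T, one column per class.
    T = [[1.0 if c == label else 0.0 for c in range(n_classes)] for label in y]
    # Ridge-regularized normal equations: (H^T H + ridge * I) beta = H^T T.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] * t[k] for h, t in zip(H, T)) for k in range(n_classes)]
           for i in range(n_hidden)]
    beta = solve(HtH, HtT)
    return W, b, beta


def predict_elm(model, X):
    """Return the index of the highest-scoring class for every input row."""
    W, b, beta = model
    H = hidden_layer(X, W, b)
    scores = [[sum(hj * beta_j[k] for hj, beta_j in zip(h, beta))
               for k in range(len(beta[0]))] for h in H]
    return [max(range(len(s)), key=s.__getitem__) for s in scores]


# Hypothetical binary semantic indicators (invented, not the paper's features):
# [head word is a person, head word is a place, head word is a number,
#  question starts with "what"]
X_toy = [[1, 0, 0, 0], [1, 0, 0, 1],
         [0, 1, 0, 0], [0, 1, 0, 1],
         [0, 0, 1, 0], [0, 0, 1, 1]]
y_toy = [0, 0, 1, 1, 2, 2]  # 0 = HUMAN, 1 = LOCATION, 2 = NUMERIC (toy labels)

model = train_elm(X_toy, y_toy, n_classes=3)
predictions = predict_elm(model, X_toy)
print(predictions)
```

Because no iterative weight updates are involved, training cost is dominated by one linear solve whose size depends on the number of hidden units, which is consistent with the speed advantage reported above. In practice the same solve would more likely be written with a numerical library, for example `numpy.linalg.pinv` applied to the hidden-layer output matrix.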




