Context-Aware Sentiment Analysis using Tweet Expansion Method
DOI:
https://doi.org/10.5614/itbj.ict.res.appl.2022.16.2.3
Keywords:
embedding, neural networks, sentiment analysis, tweet enrichment, deep learning
Abstract
The vast volume of information produced by social media platforms, and by microblogging in particular, has spawned many new applications and driven the growth of sentiment analysis research. We propose a sentiment analysis technique that identifies the main parts of a tweet that describe its intent and enriches them with relevant words, phrases, or even inferred variables. We adopt a state-of-the-art hybrid deep learning model that combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network to classify tweets by polarity. To preserve the latent relationships between tweet terms and their expanded representation, sentence encoding and contextualized word embeddings are used. To investigate the performance of tweet embeddings on the sentiment analysis task, we tested several context-free models (Word2Vec, Sentence2Vec, GloVe, and FastText), a dynamic embedding model (BERT), deep contextualized word representations (ELMo), and an entity-based model (Wikipedia). The results indicate that text enrichment notably improves the accuracy of sentiment polarity classification.
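The enrichment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine`, `expand_tweet`), the `top_k` and `min_sim` parameters, and the three-dimensional toy vectors are all assumptions made for illustration. The sketch appends to each tweet the nearest neighbour of every known token in an embedding space, using cosine similarity.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. A real system
# would load pretrained embeddings such as Word2Vec, GloVe, or FastText.
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "awesome": [0.85, 0.15, 0.05],
    "terrible": [-0.9, 0.1, 0.0],
    "awful": [-0.85, 0.2, 0.05],
    "phone": [0.0, 0.9, 0.3],
    "battery": [0.05, 0.8, 0.5],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_tweet(tokens, vectors, top_k=1, min_sim=0.9):
    """Return the tokens followed by their closest vocabulary neighbours.

    Hypothetical helper: top_k and min_sim are illustrative knobs,
    not parameters taken from the paper.
    """
    expanded = list(tokens)
    for tok in tokens:
        if tok not in vectors:
            continue  # out-of-vocabulary tokens are kept but not expanded
        # Rank every other vocabulary word by similarity to this token.
        scored = sorted(
            (
                (cosine(vectors[tok], vec), cand)
                for cand, vec in vectors.items()
                if cand != tok
            ),
            reverse=True,
        )
        for sim, cand in scored[:top_k]:
            if sim >= min_sim and cand not in expanded:
                expanded.append(cand)
    return expanded


expanded = expand_tweet(["great", "phone"], TOY_VECTORS)
print(expanded)
```

In a pipeline like the one the abstract outlines, the expanded token list would then be embedded and passed to the CNN-LSTM classifier. The similarity threshold is one simple way to keep unrelated terms out of the expansion.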