Paraphrasing Method Based on Contextual Synonym Substitution

Ari Moesriami Barmawi, Ali Muhammad

Abstract


Generating paraphrases is an important component of natural language processing and generation. Several applications use paraphrasing, for example linguistic steganography, recommender systems, and machine translation. One way to paraphrase sentences is synonym substitution, as in the NGM-based paraphrasing method proposed by Gadag et al. The weakness of this method is that ambiguous meanings frequently occur because the paraphrasing process is based solely on n-grams, which reduces the naturalness of the paraphrased sentences. To overcome this problem, a contextual synonym substitution method is proposed that aims to increase the naturalness of the paraphrased sentences. In the proposed method, the paraphrasing process is based not only on n-grams but also on the context of the sentence. In the experiments, the sentences generated using the proposed method had higher naturalness than the sentences generated using the original method.
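The difference between n-gram-only and context-aware substitution can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the synonym table, the stopword list, and the scoring functions (bigram counts plus sentence-level co-occurrence counts) are all assumptions chosen only to show how sentence context can break a tie that n-grams alone cannot.

```python
from collections import Counter

# Toy data, assumed for illustration only (not from the paper).
CORPUS = [
    "the lender approved the loan",
    "the shore was quiet",
    "we sat on the river shore",
]
SYNONYMS = {"bank": ["shore", "lender"]}  # hypothetical synonym table
STOPWORDS = {"the", "a", "to", "for", "on", "we", "he", "was"}


def bigram_counts(sentences):
    """Count adjacent word pairs in the corpus."""
    counts = Counter()
    for s in sentences:
        tokens = s.split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def cooccurrence_counts(sentences):
    """Count how often two distinct words appear in the same sentence."""
    counts = Counter()
    for s in sentences:
        words = set(s.split())
        for a in words:
            for b in words:
                if a != b:
                    counts[(a, b)] += 1
    return counts


BIGRAMS = bigram_counts(CORPUS)
COOC = cooccurrence_counts(CORPUS)


def ngram_score(tokens, i, candidate):
    """Score a candidate by the bigrams it forms with its direct neighbours."""
    left = BIGRAMS[(tokens[i - 1], candidate)] if i > 0 else 0
    right = BIGRAMS[(candidate, tokens[i + 1])] if i + 1 < len(tokens) else 0
    return left + right


def context_score(tokens, i, candidate):
    """Score a candidate by co-occurrence with the other content words."""
    return sum(
        COOC[(candidate, w)]
        for j, w in enumerate(tokens)
        if j != i and w not in STOPWORDS
    )


def paraphrase(sentence, use_context=True):
    """Replace each word that has synonyms with its best-scoring candidate."""
    tokens = sentence.split()
    out = list(tokens)
    for i, word in enumerate(tokens):
        candidates = SYNONYMS.get(word)
        if not candidates:
            continue

        def score(c):
            s = ngram_score(tokens, i, c)
            if use_context:
                s += context_score(tokens, i, c)
            return s

        best = max(candidates, key=score)
        if score(best) > 0:  # keep the original word if nothing fits
            out[i] = best
    return " ".join(out)


if __name__ == "__main__":
    sentence = "he went to the bank for a loan"
    print(paraphrase(sentence, use_context=False))
    print(paraphrase(sentence, use_context=True))
```

In this toy setting both candidates tie on bigram counts, so the n-gram-only variant falls back to the first candidate in the table, while the context-aware variant prefers the candidate that co-occurs with "loan".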


Keywords


context; language; paraphrasing; synonym; substitution


References


Pantel, P. & Lin, D., Discovery of Inference Rules from Text, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 323-328, 2001.

Niu, C., Zhou, M., Liu, T., Zhao, S. and Li, S., Combining Multiple Resources to Improve SMT-based Paraphrasing of the Model, Proceedings of the 46th Annual Meeting of ACL, 2008.

Ioannis, A.V., Mittal, T.V., Riezler, S. & Liu, Y., Statistical Machine Translation for Query Expansion in Answer Retrieval, Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 464-471, 2007.

Snover, M., Madnani, N., Dorr, B.J. & Schwartz, R., TER-Plus: Paraphrase, Semantic, and Alignment Enhancements to Translation Edit Rate, Machine Translation, 23(2-3), pp. 117-127, 2010.

Durme, B. V., Callison-Burch, C. & Ganitkevitch, J., PPDB: The Paraphrase Database, Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 758-764, 2013.

Gadag, A. I. & Sagar, B.M., N-gram Based Paraphrase Generator from Large Text Document, 2016 International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS), pp. 91-94, 2016.

Arman, A.A., Putra, B.A., Purwarianti, A. & Kuspriyanto, Syntactic Phrase Chunking for Indonesian Language, Proceedings of ICEEI, pp. 635-640, 2013.

Wicaksono, A.F. & Purwarianti, A., HMM-based Part-of-speech Tagger for Bahasa Indonesia, Proceedings of 4th International Malindo Workshop, 2010.

Winstein, K., Tyrannosaurus lex. Open source. Available at http://web.mit.edu/keithw/tlex, 1999.

Chang, C. & Clark, S., Practical Linguistic Steganography using Contextual Synonym Substitution and a Novel Vertex Coding Method, Computational Linguistics, 40(2), pp. 403-448, 2014.

Twitter, @tempodotco, 129210 tweets, 01 January 2012-10 June 2016, 10:20 a.m., https://twitter.com/tempodotco.

Twitter, @kompasdotcom, 3410 tweets, 09 February 2012-18 June 2012, 12:32 p.m., https://twitter.com/kompasdotcom.

Twitter, @kompascom, 81759 tweets, 03 March 2015-01 June 2016, 06:54 a.m., https://twitter.com/kompascom.

Denkowski, M. & Lavie, A., Meteor Universal: Language Specific Translation Evaluation for Any Target Language, Proceedings of the EACL Workshop on Statistical Machine Translation, 2014.

Lavie, A., Sagae, K. & Jayaraman, S., The Significance of Recall in Automatic Metrics for MT Evaluation, Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA), pp. 134-143, 2004.

Banerjee, S. & Lavie, A., Meteor: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, Proceedings of the Workshop on Intrinsic and Extrinsic Evaluation Measures for MT and/or Summarization at the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-2005), Ann Arbor, Michigan, 2005.

Twitter, @Pikiran_rakyat, 76205 tweets, 14 July 2009-17 December 2019, 13:02 p.m., https://twitter.com/pikiran_rakyat.

Twitter, @Jawapos, 47187 tweets, 1 January 2019-17 December 2019, 13:02 p.m., https://twitter.com/jawapos.




DOI: http://dx.doi.org/10.5614/itbj.ict.res.appl.2019.13.3.6


