In this project, we study semantic indexing methods that link any passage a user is reading or writing to semantically similar texts and recommend those texts as examples. Our research topics also include paraphrasing, error correction, and template extraction for sentence generation, with the aim of developing a practical application that helps non-native speakers write papers in English.
Extracting and retrieving formulaic expressions for scientific writing assistance
Although many dictionaries of formulaic expressions have been published to help non-native English speakers write research papers, most of these resources are not machine-readable and are hard to use conveniently while writing. In this study, we develop a method for presenting appropriate formulaic expressions in response to a user’s ambiguous input. Specifically, we aim to improve the prediction accuracy of formulaic expressions by exploiting rich context information, namely domain-specific expressions and discourse structures extracted automatically from large domain corpora. (Iwatsuki et al.: to be presented at COLING-2018)
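As a rough illustration of retrieving formulaic expressions from a vague query, the following minimal sketch ranks a small inventory by bag-of-words cosine similarity. It is not the method of this study, which relies on domain-specific context and discourse structure. The inventory `EXPRESSIONS` and the function `suggest` are hypothetical names introduced only for this example.

```python
import math
import re
from collections import Counter

# Hypothetical mini-inventory of formulaic expressions (illustrative only).
EXPRESSIONS = [
    "in this paper we propose",
    "the results show that",
    "to the best of our knowledge",
    "this work was supported by",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest(query, expressions=EXPRESSIONS, top_k=2):
    """Return up to top_k expressions that share at least one token with the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(e))), e) for e in expressions]
    # Sort by descending score, breaking ties alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [e for score, e in scored[:top_k] if score > 0]


print(suggest("we propose method"))
```

A purely lexical score like this cannot handle paraphrased or incomplete input well, which is one reason richer context information is of interest.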
Distributed representation for similar sentence search
We propose a neural network that learns semantic representations of phrases based on a novel criterion called the Inclusion criterion, and we have used it to implement a multilingual similar sentence search function. Furthermore, we have applied the proposed method to a Japanese–English bilingual corpus extracted from papers and prepared CroVeWA, a demonstration tool for English composition support. (Hubert et al.: NAACL HLT 2015 demo [1], ICLR 2015 short [2])
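The general idea of cross-lingual similar sentence search over distributed representations can be sketched as follows. This is a toy example under strong assumptions: the word vectors in `WORD_VECS` are hand-made, sentence vectors are plain averages, and the input is pre-tokenized. It does not implement the Inclusion criterion or reproduce CroVeWA; all names are hypothetical.

```python
import math

# Hypothetical hand-made 3-dimensional word vectors in a shared
# Japanese-English space (a real system would learn these from a corpus).
WORD_VECS = {
    "neural": (1.0, 0.0, 0.0),
    "network": (0.9, 0.1, 0.0),
    "ニューラル": (1.0, 0.0, 0.1),
    "ネットワーク": (0.9, 0.0, 0.1),
    "corpus": (0.0, 1.0, 0.0),
    "コーパス": (0.0, 1.0, 0.1),
    "search": (0.0, 0.0, 1.0),
    "検索": (0.1, 0.0, 1.0),
}


def embed(tokens):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    vecs = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    if not vecs:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query_tokens, sentences, top_k=1):
    """Return the top_k tokenized sentences most similar to the query."""
    q = embed(query_tokens)
    ranked = sorted(sentences, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:top_k]


english_sentences = [["neural", "network"], ["corpus"], ["search"]]
print(search(["ニューラル", "ネットワーク"], english_sentences))
```

Because both languages share one vector space in this sketch, a Japanese query can retrieve English sentences directly, without translating the query first.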