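Finding useful information in a large collection of documents is not easy. This research focuses on natural language processing methods for conveying the content of documents quickly and clearly. Specifically, we work on topics such as sentence compression that takes topic and context into account, and automatic question generation that draws on knowledge bases.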
Text Compression and Summarization
Text compression and summarization systems aim to produce a shorter version of a source text while preserving its key content. However, producing a compression (summary) that is both informative and grammatical remains a challenge. In this project, we tackle this issue by considering two kinds of features: word-level (local) features, such as the part-of-speech tag of each word, and sentence-level (global) features, such as the readability of the whole sentence. Our experimental results demonstrate that these features, coupled with techniques such as deep learning and reinforcement learning, can lead to compressions (summaries) of better quality (Yang et al., NLDB-2017 [1]; ACL-2018 short, accepted).
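As a rough illustration of how local and global features might be combined, the sketch below treats compression as choosing which words to keep or delete. It is a toy and not the method of the cited papers, which rely on deep learning and reinforcement learning. The part-of-speech weights, the length budget, the verb check used as a crude stand-in for readability, and all function names are assumptions made for this example.

```python
from itertools import product

# Hypothetical weights: how much a word with each POS tag is worth keeping.
POS_WEIGHT = {"NOUN": 1.0, "VERB": 1.0, "ADJ": 0.3, "ADV": 0.2, "DET": 0.4, "ADP": 0.4}


def local_score(tagged, mask):
    """Word-level (local) score: sum of the POS weights of the kept words."""
    return sum(POS_WEIGHT.get(tag, 0.0) for (_, tag), keep in zip(tagged, mask) if keep)


def global_score(tagged, mask, budget):
    """Sentence-level (global) score: a crude proxy for readability.

    Penalizes every kept word beyond the length budget, and any output
    that contains no verb at all.
    """
    kept_tags = [tag for (_, tag), keep in zip(tagged, mask) if keep]
    over_budget = max(0, len(kept_tags) - budget)
    verb_penalty = 0.0 if "VERB" in kept_tags else -5.0
    return -2.0 * over_budget + verb_penalty


def compress(tagged, budget):
    """Try every keep/delete mask and return the words of the best-scoring one.

    Exhaustive search is only feasible for short sentences (2**n masks).
    """
    best_mask = max(
        product([0, 1], repeat=len(tagged)),
        key=lambda m: local_score(tagged, m) + global_score(tagged, m, budget),
    )
    return [word for (word, _), keep in zip(tagged, best_mask) if keep]


if __name__ == "__main__":
    sentence = [
        ("the", "DET"), ("committee", "NOUN"), ("quickly", "ADV"),
        ("approved", "VERB"), ("the", "DET"), ("new", "ADJ"), ("budget", "NOUN"),
    ]
    print(compress(sentence, budget=3))  # ['committee', 'approved', 'budget']
```

The local score alone would keep every word, and the global score alone has no preference among words. Only their sum yields a short output that still contains the main nouns and the verb. In a learned system, a sentence-level score of this kind could serve as the reward signal for reinforcement learning.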