Our laboratory’s research focuses on natural language technologies that assist human intellectual activities. Our main research subjects in text and media studies are listed below; they build on machine learning (including deep learning), statistical modeling and analysis, and annotation and corpus analysis.
- Information linkage: Entity identification and information recommendation
- Machine reading comprehension: Computer understanding of natural languages; Semantic analysis and knowledge extraction
- Human language activities: Measurement and modeling of human language activities through text
Alongside our study of the fundamental technologies for computer-based text and language processing, we are engaged in the following research projects.
Design of language understanding tasks to establish common grounds for understanding in humans and machines (Research topics)
For humans and computers to communicate via natural language text, they must share an understanding (interpretation) of the given texts. This study addresses the issues that arise in creating such a common ground for natural language understanding. In particular, for today’s language comprehension systems built on deep learning, system design centers on the design of the language comprehension tasks themselves, including data collection and evaluation criteria. Through the analysis and design of machine reading comprehension and natural language communication tasks, we study methods for measuring the skills that language understanding demands and for collecting the cases required for training.
Language interface for scientific writing assistance (Research topics)
In this project, we study semantic indexing methods that associate semantically similar texts with arbitrary positions in the text a user is reading or writing, and that recommend those texts as examples. Our subjects also include paraphrasing, error correction, and template extraction for sentence generation, with the aim of developing a practical application that assists non-native speakers in writing papers in English.
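The idea of indexing example texts and recommending those similar to a passage the user is working on can be illustrated with a minimal sketch. It is not the project's method: it swaps in plain TF-IDF with cosine similarity, which captures lexical overlap only, as a stand-in for a real semantic index. All names (`tokenize`, `build_index`, `recommend`) and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(tokens, idf):
    """Build an L2-normalized TF-IDF vector, ignoring terms absent from the index."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    """Compute smoothed IDF weights and one TF-IDF vector per example text."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [vectorize(toks, idf) for toks in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity of two normalized sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def recommend(passage, docs, idf, vectors, k=2):
    """Return up to k example texts with non-zero similarity to the passage."""
    query = vectorize(tokenize(passage), idf)
    scored = sorted(
        ((cosine(query, v), i) for i, v in enumerate(vectors)), reverse=True
    )
    return [docs[i] for score, i in scored[:k] if score > 0]


# Hypothetical example texts standing in for an indexed paper collection.
examples = [
    "We propose a neural model for sentence compression.",
    "The gaze data were recorded with an eye tracker.",
    "Our model compresses sentences using a neural encoder.",
]
idf, vectors = build_index(examples)
for text in recommend("a neural model that compresses a sentence", examples, idf, vectors):
    print(text)
```

A system aimed at semantic similarity would likely replace the TF-IDF vectors with learned sentence representations, while the index-then-rank structure could stay the same.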
DeepNLP technologies to support information access (Research topics)
Today’s search engines provide a means to search huge, heterogeneous document collections efficiently. However, it is not easy to find useful information among the large number of candidates that a search returns, and this problem cannot be solved merely by improving the search engine’s document ranking. In this research, we focus on natural language processing methods that support efficient interactive search. Research topics include NLP technologies such as context- and topic-aware sentence compression, automatic question generation with a knowledge base, and argumentative text detection for decision-making support.
Logical and semantic structure analysis of research papers for scientific knowledge acquisition (Research topics)
Advances in information technology have enabled us to process huge quantities of text by computer. However, essential information is not necessarily expressed in a text in a readily usable form. Knowledge embedded in a text becomes available only through its reader’s understanding. Consequently, comprehending the fundamental function of human intelligence, in which knowledge is transformed, combined, and used, is a great challenge for artificial intelligence. We conduct research on computer-based knowledge acquisition through language analysis and on the application of the acquired knowledge, with a more fundamental task in view: exploring the broad knowledge space generated by operations such as generalization, integration, and inference.
Retrieval and understanding of mathematical knowledge (Research topics)
A formula is a mathematical notation used in diverse scientific and educational settings, and it plays an important role in many scientific disciplines. However, because a formula is a nonverbal expression, it has received little attention as a research subject in natural language processing. We regard a formula not as a kind of image or symbol string but as a component of a document that carries its own structure and interpretation, and we analyze formulas together with their textual descriptions to study language processing approaches for handling formula semantics. Our research goal is to build an application platform for mathematical knowledge through mathematical knowledge search and through the development and evaluation of an understanding support system that uses these component technologies.
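To make concrete what treating a formula as a structured object rather than a symbol string can mean for search, here is a small sketch under stated assumptions. Formulas are written by hand as nested tuples of the form `(operator, argument, ...)`, and a query may use a hypothetical `"?"` wildcard that matches any subexpression. The representation and the function names are illustrative only and are not taken from the project.

```python
from typing import Iterator, Tuple, Union

# A formula is either a leaf symbol (str) or a tuple (operator, arg1, arg2, ...).
Expr = Union[str, Tuple["Expr", ...]]

WILDCARD = "?"  # hypothetical query symbol that matches any subexpression


def subtrees(expr: Expr) -> Iterator[Expr]:
    """Yield the expression itself and every subexpression below it."""
    yield expr
    if isinstance(expr, tuple):
        for child in expr[1:]:
            yield from subtrees(child)


def matches(pattern: Expr, expr: Expr) -> bool:
    """Check whether a pattern matches an expression at its root."""
    if pattern == WILDCARD:
        return True
    if isinstance(pattern, tuple) and isinstance(expr, tuple):
        return (
            len(pattern) == len(expr)
            and pattern[0] == expr[0]
            and all(matches(p, e) for p, e in zip(pattern[1:], expr[1:]))
        )
    return pattern == expr


def contains(pattern: Expr, formula: Expr) -> bool:
    """Check whether the pattern matches anywhere inside the formula."""
    return any(matches(pattern, sub) for sub in subtrees(formula))


# E = m * c^2
energy = ("=", "E", ("*", "m", ("^", "c", "2")))
# a^2 + b^2 = c^2
pythagoras = ("=", ("+", ("^", "a", "2"), ("^", "b", "2")), ("^", "c", "2"))

squared = ("^", WILDCARD, "2")         # "something squared"
sum_of_two = ("+", WILDCARD, WILDCARD)  # "a sum of two terms"

print(contains(squared, energy), contains(squared, pythagoras))
print(contains(sum_of_two, energy), contains(sum_of_two, pythagoras))
```

A string search for `^2` would find the same two formulas here, but the structural query also distinguishes what is being squared and where it sits in the expression, which is the kind of information a flat symbol string loses.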
“Science of reading”: gaze-based analysis and applications of the reading process (Research topics)
Language activities carried out on the screens of electronic devices are indispensable to everyday life. In this project, we investigate the human act of “reading” on a screen and conduct research on its measurement, modeling, and support. Specifically, we regard the act of “reading” a text on a screen as the interaction of three factors: (1) the semantic structure of the text; (2) visual features such as layout and character decoration; and (3) the reader’s visual and linguistic cognitive processes. Our goal is to develop methods for presenting text in a readable form, as well as methods for measuring and modeling reading.