Corpus
Dataset
- 2WikiMultiHopQA: A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps
- FECFeval: An evaluation dataset for formulaic expression extraction
- OneCommon: A Natural Language Corpus of Common Grounding under Continuous and Partially-Observable Context
- NTCIR-Math: IR Evaluation Task for Math Information Access
- NTCIR-math-annotation: Annotation of math formula descriptions
Tool
- PDFNLT 1.0: Tools for Natural Language Text aware PDF structure analysis for scientific papers
- Planetext: Converting XML documents into plain text based on tag classification
- FixFix: A web-based editor for fixations detected in gaze datasets of reading activities
- mapPdfToXml: A tool for extracting a PDF’s layout information and embedding it into XML
Demo
- TermLink: Technical term extraction, Wikification and related paper recommendation
- SideNoter: Scientific paper viewing system (by Takeshi Abekawa)
- i-linkage: Citation identification