Computational Linguistics Scientific Document Summarization Shared Task

CL-SciSumm 2017


** Call for Participation **
The 3rd CL-SciSumm 2017 Shared Task
at SIGIR 2017 on Friday, August 11, 2017
To be held as a part of the
2nd Joint Workshop of Bibliometric-enhanced IR and NLP for Digital Libraries (BIRNDL)
Sponsored by Microsoft Research Asia
Introduction
-------------------
We invite you to participate in our Shared Task on relationship mining and scientific summarization of computational linguistics research papers. Scientific summarization can play an important role in developing methods to index, represent, retrieve, browse and visualize information in large scholarly databases.
Objective
-------------------
The 3rd CL-SciSumm Shared Task provides resources to encourage research in entity mining, relationship extraction, question answering and other NLP tasks for scientific papers. It comprises annotated citation information connecting research papers with citing papers. Citations are embedded with meta-commentary, which offers a contextual, interpretative layer to the cited text and emphasizes certain kinds of information over others.
The Task
------------------
The task comprises a set of topics, each consisting of a reference paper (RP) in computational linguistics (CL) and ten or more papers that cite it (citing papers, CPs). The text spans (citances) that relate each citing paper to the reference paper have already been identified.
Task 1a: For each citance, identify the cited text span in the RP that most accurately reflects the citance.
Task 1b: For each cited text span, identify what facet of the paper it belongs to, from a predefined set of facets.
Evaluation: Task 1 will be scored by the overlap of text spans in the system output versus the gold standard created by human annotators.
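As an illustration of overlap-based scoring, here is a minimal sketch in Python. It is not the official scoring script. It assumes, purely for illustration, that a cited text span is represented as a set of RP sentence IDs and that overlap is summarized as precision, recall and F1; the function name span_overlap_scores is hypothetical.

```python
from typing import Iterable, Tuple


def span_overlap_scores(system: Iterable[int], gold: Iterable[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one citance.

    Assumption: both spans are given as RP sentence IDs. The official
    evaluation may represent and weight spans differently.
    """
    system_ids, gold_ids = set(system), set(gold)
    overlap = len(system_ids & gold_ids)
    precision = overlap / len(system_ids) if system_ids else 0.0
    recall = overlap / len(gold_ids) if gold_ids else 0.0
    # F1 is defined as 0 when there is no overlap at all.
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # The system picked sentences 3-5; the annotators picked sentences 4-7.
    print(span_overlap_scores([3, 4, 5], [4, 5, 6, 7]))
```

In this example two of the three system sentences are in the gold span, and two of the four gold sentences are recovered.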
Task 2 (optional bonus task): Generate a structured summary of the RP from the cited text spans of the RP. The length of the summary should not exceed 250 words.
Evaluation: Task 2 will be scored using the ROUGE evaluation metric to compare automatic summaries against paper abstracts, human-written summaries and community summaries constructed using the output of Task 1a.
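For intuition about the ROUGE comparison, the sketch below computes a simplified ROUGE-N recall in Python and checks the 250-word limit. It is a sketch under stated assumptions, not the official ROUGE toolkit: tokenization is lowercased whitespace splitting, there is no stemming or stopword removal, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list (empty if the list is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 2) -> float:
    """Fraction of reference n-grams that also appear in the candidate.

    Assumption: naive whitespace tokenization; the official toolkit
    applies its own preprocessing.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection clips each n-gram count at the smaller of the two.
    matched = sum((cand & ref).values())
    return matched / total


def within_word_limit(summary: str, limit: int = 250) -> bool:
    """Check the Task 2 length constraint by whitespace word count."""
    return len(summary.split()) <= limit


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    reference_summary = "the cat lay on the mat"
    print(rouge_n_recall(system_summary, reference_summary, n=2))
    print(within_word_limit(system_summary))
```

Checking the word limit before submission is a cheap safeguard, since a summary over 250 words would violate the task constraint regardless of its ROUGE score.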
How To Participate
-------------------
1. Register for the CL-SciSumm Shared Task at (https://easychair.org/conferences/?conf=birndl2017) by May 20.
2. Browse our git repository at (https://github.com/WING-NUS/scisumm-corpus) and download the training set.
3. Develop and train your system to solve Task 1a, 1b and/or Task 2 on the training set.
4. Meanwhile, submit a tentative system description by May 20.
5. Evaluate your system on the test set, to be released on () and upload your results to () to self-evaluate your performance.
6. Tell us about your approach in a paper.
7. Attend the BIRNDL workshop at SIGIR on August 11, and present your work.
Important Dates
-------------------
Registration opens: April 20, 2017
Training set posted: May 1, 2017
Short system description: May 20, 2017
Test Set posted and evaluation period begins: July 1, 2017
Evaluation period ends: July 15, 2017
System reports (papers) due: July 30, 2017
Camera ready contributions due: TBD
Presentation at 2nd BIRNDL 2017 workshop, SIGIR: Aug 11, 2017
Organizers
-------------------
Kokil Jaidka, University of Pennsylvania
Muthu Kumar Chandrasekaran, National University of Singapore
Min-Yen Kan, National University of Singapore