Coreferences are collections of expressions that all represent the same
person, entity, or thing. For example, in the text "Lauren loves dogs.
She would walk them all day.", there is a coreference consisting of
the token "Lauren" in the first sentence and the token "She" in the
second sentence. The output of this function contains one row for
each mention of an entity; mentions of the same entity can be linked
using the rid key.
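As a minimal sketch of how mentions can be linked through rid, the following base R snippet builds a small hand-made data frame that mimics the shape of this function's output for the example text. The values, including the second chain for "dogs" and "them", are illustrative assumptions and were not produced by an annotator.

```r
# Hypothetical stand-in for the function's output; values are illustrative
# assumptions, not actual annotator results.
coref <- data.frame(
  id = c(1L, 1L, 1L, 1L),
  rid = c(1L, 1L, 2L, 2L),
  mid = c(1L, 2L, 3L, 4L),
  mention = c("Lauren", "She", "dogs", "them"),
  stringsAsFactors = FALSE
)

# Rows that share a document id and an rid refer to the same entity, so
# splitting the mentions on both keys recovers each coreference chain.
chains <- split(coref$mention, list(coref$id, coref$rid), drop = TRUE)
chains
```

Splitting on both id and rid, rather than rid alone, keeps chains from different documents apart in case relation ids repeat across documents.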
cnlp_get_coreference(annotation)
annotation - an annotation object
Returns an object of class c("tbl_df", "tbl", "data.frame")
containing one row for every coreference mention in the corpus.
The returned data frame includes at least the following columns:
"id" - integer. Id of the source document.
"rid" - integer. Relation ID.
"mid" - integer. Mention ID; unique to each coreference within a document.
"mention" - character. The mention as raw words from the text.
"mention_type" - character. One of "LIST", "NOMINAL", "PRONOMINAL", or "PROPER".
"number" - character. One of "PLURAL", "SINGULAR", or "UNKNOWN".
"gender" - character. One of "FEMALE", "MALE", "NEUTRAL", or "UNKNOWN".
"animacy" - character. One of "ANIMATE", "INANIMATE", or "UNKNOWN".
"sid" - integer. Sentence id of the coreference.
"tid" - integer. Token id at the start of the coreference.
"tid_end" - integer. Token id at the start of the coreference.
"tid_head" - integer. Token id of the head of the coreference.
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.
Marta Recasens, Marie-Catherine de Marneffe, and Christopher Potts. The Life and Death of Discourse Entities: Identifying Singleton Mentions. In: Proceedings of NAACL 2013.
Heeyoung Lee, Angel Chang, Yves Peirsman, Nathanael Chambers, Mihai Surdeanu and Dan Jurafsky. Deterministic coreference resolution based on entity-centric, precision-ranked rules. Computational Linguistics 39(4), 2013.
Heeyoung Lee, Yves Peirsman, Angel Chang, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky. Stanford's Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task. In: Proceedings of the CoNLL-2011 Shared Task, 2011.
Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky, Christopher Manning. A Multi-Pass Sieve for Coreference Resolution. EMNLP-2010, Boston, USA. 2010.