Software

Software and Tools we have released

The software and tools on this page are available free of charge for educational, research, and in-house use. For information on commercial use of any of these tools, please contact Columbia Technology Ventures (email: techventures@columbia.edu; phone: +1 212-854-8444).

Our research artifacts are also available in the following online repositories:

  1. Columbia NLP Hugging Face
  2. Columbia NLP Lab GitHub

The following is a partial list of the software and tools Columbia NLP has developed.


Narrative Summarization Corpus

Developed by Jessica Ouyang, Serina Chang, and Kathleen McKeown

Described in "Crowd-Sourced Iterative Annotation for Narrative Summarization Corpora." Personal narratives with aligned extractive and abstractive summaries. Available under the MIT License.

Download


Gendered Corpus

Developed by Serina Chang and Kathleen McKeown

Described in "Automatically Inferring Gender Associations from Language." Online articles about celebrities and online reviews of professors written by students, labeled for gender.

Download


Opinionated Claims

Developed by Sara Rosenthal and Kathleen McKeown

Described in "Detecting Opinionated Claims in Online Discussions." Wikipedia and LiveJournal text with sentence-level annotations of opinionated claims and phrase-based sentiment.

Download

Wikipedia Talk Pages Agreement Corpus

Download

Create Debate Agreement Corpus

Download

Sentence Fusion Corpus

Download

Text-to-text generation

GitHub Repository

Quoted Speech Attribution Corpus

Licensing Agreement

MADA

More

LCseg

Licensing Agreement

LexChainer

Licensing Agreement

LinkIT

Licensing Agreement

Centrifuser

Licensing Agreement

Annotated Bibliography Corpus

Licensing Agreement

FUF

Download

CFUF

More


Surge


CREP


Segmenter


Verber