Software
Software and Tools we have released
The software and tools on this page are available free of charge for educational, research, and in-house use. For information on commercial use of any of these tools, please contact Columbia Technology Ventures (email: techventures@columbia.edu; phone: (+1) 212-854-8444).
We additionally have the following online repositories for our research artifacts:
- Columbia NLP Hugging Face
- Columbia NLP Lab GitHub
The following is a partial list of the software and tools Columbia NLP has developed.
- Narrative Summarization Corpus
- Gendered Corpus
- Opinionated Claims Corpus
- Wikipedia Talk Pages Agreement Corpus
- Create Debate Agreement Corpus
- Sentence Fusion Corpus
- Text-to-text generation
- Quoted Speech Attribution Corpus
- MADA
- LCseg
- LexChainer
- LinkIT
- Centrifuser
- Annotated Bibliography Corpus
- FUF
- CFUF
- Surge
- CREP
- Segmenter
- Verber
Narrative Summarization Corpus
Developed by Jessica Ouyang, Serina Chang, and Kathleen McKeown
Described in "Crowd-Sourced Iterative Annotation for Narrative Summarization Corpora." The corpus contains personal narratives with aligned extractive and abstractive summaries. Available under the MIT License.
Gendered Corpus
Developed by Serina Chang and Kathleen McKeown
Described in "Automatically Inferring Gender Associations from Language." The corpus contains online articles written about celebrities and online reviews written by students about professors, labeled for gender.
Opinionated Claims Corpus
Developed by Sara Rosenthal and Kathleen McKeown
Described in "Detecting Opinionated Claims in Online Discussions." The data is drawn from Wikipedia and LiveJournal, with sentence-level annotations of opinionated claims and phrase-based sentiment.