BioNLP is an initiative by the Center for Computational
at the University of Colorado Denver
to create and distribute code, software, and data for applying natural
language processing techniques to biomedical texts.
The following projects are associated with BioNLP:
- ASM: an Approximate Subgraph Matching algorithm for dependency graphs
- ESM: an Exact Subgraph Matching algorithm for dependency graphs
- a lemmatizer specific to the biomedical literature.
- a repository of biologically and linguistically annotated corpora and
  biomedical datasets. This project includes:
  - Colorado Richly Annotated Full-Text Corpus (CRAFT)
  - Annotation Projects
  - MEDLINE Mining projects
  - Anaphora Corpus
  - TestSuite Corpora
  - Biomedical Concept Annotation: the raw evaluation data for our paper "Large-scale biomedical concept recognition: an evaluation of current automatic annotators and their parameters"
- Unstructured Information Management Architecture (UIMA) components
  geared towards the use and evaluation of tools for biomedical natural
  language processing, including tools for use with our own OpenDMAP and MutationFinder.
- common: a library of utility code for common tasks
- Knowtator: a Protege plug-in for text annotation.
- a code library containing an XML parser for the 2012 Medline XML
- an information extraction system for extracting descriptions of point
  mutations from free text.
- an analysis tool to detect OBO ontology terms that use different
  linguistic conventions for expressing similar semantics.
- OpenDMAP: an ontology-driven, rule-based concept analysis and
  information extraction system
- Classifier: a classifier for the content of parenthesized text
- Simple Semantic Classifier: a text classifier for OBO domains
- uima-shims: a library of simple interfaces designed to facilitate the development of type-system-independent UIMA components