Twitter Annotated Corpus

In the interest of competitive research, we have made a subset of our hand-annotated corpus publicly available. Two types of annotations are currently available: binary and span. The binary annotations indicate the presence or absence of adverse drug reactions (ADRs) in tweets. The span annotations (containing spans and normalized UMLS IDs) specify the exact mentions of ADRs in tweets. This corpus is freely available to download.
  • Binary Annotation Corpus
  • Full Annotation Corpus
      The full annotation corpus consists of 1784 tweets. This corpus was annotated manually for ADR mentions in each tweet. The annotations include the locations (spans) of the mentions and their UMLS IDs. See the link below for the download and the publication associated with this data set.
    • Data set link
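
To make the relationship between the two annotation types concrete, here is a minimal sketch of how a span annotation could be represented in memory and reduced to a binary label. The file layout of the released corpus is not described here, so the field names (`tweet_id`, `start`, `end`, `mention`, `umls_id`), the offset convention, the example tweet, and the placeholder ID are all assumptions made for illustration, not the corpus's actual format.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SpanAnnotation:
    """Hypothetical in-memory form of one span annotation (assumed fields)."""
    tweet_id: str
    start: int      # character offset where the mention begins (inclusive)
    end: int        # character offset where the mention ends (exclusive)
    mention: str    # the annotated ADR text
    umls_id: str    # normalized UMLS concept ID


def span_matches(tweet_text: str, ann: SpanAnnotation) -> bool:
    """Check that the offsets really point at the annotated mention."""
    return tweet_text[ann.start:ann.end] == ann.mention


def binary_label(tweet_id: str, annotations: Iterable[SpanAnnotation]) -> bool:
    """A tweet is ADR-positive if at least one span annotation refers to it."""
    return any(a.tweet_id == tweet_id for a in annotations)


# Invented example tweet and placeholder concept ID, for illustration only.
text = "this med gives me headaches every day"
start = text.find("headaches")
example = SpanAnnotation(
    tweet_id="t1",
    start=start,
    end=start + len("headaches"),
    mention="headaches",
    umls_id="PLACEHOLDER_ID",  # not a real UMLS ID
)

print(span_matches(text, example))      # offsets agree with the mention text
print(binary_label("t1", [example]))    # tweet t1 has an ADR span
print(binary_label("t2", [example]))    # tweet t2 has none
```

Checking offsets against the tweet text before use is a cheap safeguard, since any change to the text (for example, different whitespace or encoding handling) would shift the spans.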