PSB 2016 Social Media Mining Shared Task Workshop



Task 1

Task brief:
The first proposed sub-task focuses on the automatic classification of user posts that assert adverse drug reactions (ADRs). This task uses the binary annotations in the data. Participants will be provided with a training/development set containing the annotations. Evaluation will be performed on a blind set not released prior to the evaluation deadline. Systems will be evaluated on their ability to automatically classify ADR-containing posts.
The training data consists of 7,574 instances (~70% of the original corpus) with binary annotations. The evaluation set consists of 3,284 instances, with an ADR to non-ADR ratio similar to that of the training set. For each tweet, the publicly available data set contains: (i) the user ID, (ii) the tweet ID, and (iii) the binary annotation indicating the presence or absence of ADRs, as shown below. The evaluation data will contain the same information, but without the classes. Participating teams should submit their results in the same format as the training set (shown below).
User ID Tweet ID Class
349294537367236611 149749939 0
354256195432882177 54516759 0
352456944537178112 1267743056 1
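
The three-column format above can be read and written with a few lines of Python. The following is a minimal sketch, not an official script from the organisers. It assumes whitespace-separated columns, that class 1 marks the presence of an ADR, and that a single space is an acceptable separator on output. The names Instance, parse_annotations, class_counts and format_predictions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Sequence


class Instance(NamedTuple):
    first_id: str   # first column, labelled "User ID" in the header row
    second_id: str  # second column, labelled "Tweet ID" in the header row
    label: int      # assumed: 1 = ADR present, 0 = ADR absent


# The sample rows shown in the task description.
SAMPLE = """User ID Tweet ID Class
349294537367236611 149749939 0
354256195432882177 54516759 0
352456944537178112 1267743056 1
"""


def parse_annotations(text: str) -> List[Instance]:
    """Parse 'id id class' rows, skipping the header and blank lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Keep only rows with exactly three fields and a 0/1 class.
        if len(parts) != 3 or parts[2] not in {"0", "1"}:
            continue
        rows.append(Instance(parts[0], parts[1], int(parts[2])))
    return rows


def class_counts(rows: Sequence[Instance]) -> Dict[int, int]:
    """Count instances per class, e.g. to compare ADR to non-ADR ratios."""
    return dict(Counter(row.label for row in rows))


def format_predictions(rows: Sequence[Instance],
                       predictions: Sequence[int],
                       sep: str = " ") -> str:
    """Write predictions in the same three-column layout as the input."""
    if len(rows) != len(predictions):
        raise ValueError("one prediction is required per instance")
    return "\n".join(
        sep.join([row.first_id, row.second_id, str(pred)])
        for row, pred in zip(rows, predictions)
    )
```

Calling parse_annotations(SAMPLE) yields three instances, and format_predictions can then emit a system's predicted classes for the evaluation IDs in the same layout. Keeping the IDs as strings avoids any accidental reformatting of the long numeric identifiers.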

Training Data:
download link
The download script for the training data: python download script
Downloading instructions and details about the training data can be found here.

Information about binary classification using this data can be found here.

Test Data:
Test data will be made available here.
Files for this task should be of the format: teamname_teamnumber_1





© DIEGO LAB 2015 Competition Organisers.