Subtask 1

IMPORTANT: please note that there is a closed and an open track for this subtask!

In subtask 1 the goal is to predict labels for each text in a dataset where the labels are derived from the original labels assigned by several human annotators.

Following the annotation guidelines, the human annotators rated the strength of misogyny/sexism present in the given text using the following labels:

While the annotation guidelines define what kind of sexism/misogyny should be annotated, no attempt has been made to give rules for deciding on the strength. For this reason, if an annotator decided that sexism/misogyny is present in a text, the strength assigned is a matter of personal judgement.

The labels to predict in subtask 1 reflect different strategies for how multiple labels from annotators can be used to derive a final target label:

Data

For the trial phase of subtask 1, we provide a small dataset, containing

For the development phase of subtask 1, we provide all participants with the following data:

For the competition phase of subtask 1, we provide

All of the five files are in JSONL format (one JSON-serialized object per line) where each object is a dictionary with the following fields:

You can download the data for each phase as soon as the corresponding phase starts.
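Since every data file is in JSONL format, loading one comes down to parsing each line as a separate JSON object. The following is a minimal sketch using only the Python standard library; the function name `read_jsonl` and the file name in the usage note are illustrative assumptions, not part of the official tooling.

```python
import json


def read_jsonl(path):
    """Return a list with one dictionary per non-empty line of a JSONL file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            record = json.loads(line)
            if not isinstance(record, dict):
                raise ValueError(f"line {line_number}: expected a JSON object")
            records.append(record)
    return records
```

A call such as `records = read_jsonl("train.jsonl")` (hypothetical file name) would then give a list of dictionaries, one per text.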

Submission

Your submission must be a file in TSV (tab separated values) format which contains the following columns in any order:

Note that how you derive those labels is up to you (as long as the rules for the closed or open track are followed):

To submit your predictions to the competition:

Submission errors and warnings

Phases

Evaluation

System performance on each of the five predicted labels (bin_maj, bin_one, bin_all, multi_maj, disagree_bin) is evaluated using the macro-averaged F1 score over all classes.

The final score which is used for ranking the submissions is calculated as the unweighted average over all 5 scores.
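To make the scoring concrete, the sketch below computes a macro-averaged F1 score per label and then the unweighted average over the five labels. It is not the official scorer: the helper names `f1_macro` and `final_score` are hypothetical, the input is assumed to be a dictionary mapping each label name to a list of values, and the set of classes is assumed to be the union of gold and predicted values. The official evaluation may treat special cases differently.

```python
LABELS = ["bin_maj", "bin_one", "bin_all", "multi_maj", "disagree_bin"]


def f1_macro(gold, pred):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    # Assumption: every class seen in gold or pred counts towards the average.
    classes = set(gold) | set(pred)
    scores = []
    for cls in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
        fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
        fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


def final_score(gold_by_label, pred_by_label):
    """Return (average over the five labels, per-label macro F1 scores)."""
    per_label = {
        name: f1_macro(gold_by_label[name], pred_by_label[name])
        for name in LABELS
    }
    return sum(per_label.values()) / len(LABELS), per_label
```

Because the average is unweighted, each of the five labels contributes equally to the ranking score, regardless of how many classes it has.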

Following a successful submission, you need to refresh the web page in order to see your score and your result on the leaderboard.