Table of contents
🏷️ Dataset: data-v2
data-v2 contains labeled training and test data with and without “ambiguous” tags
🏷️ Dataset: data-v1
data-v1 contains labeled training data ONLY, with and without “ambiguous” tags
- To label the dataset, I used these heuristics
- Train: 72 / 626 (12%) ambiguous labels
- GitHub
Overview and Structure
Original dataset on Kaggle