
Summary
- Evaluate PASE, a multi-task self-supervised learning framework originally proposed for speech, on music.
- Pre-train the model on the AudioSet music subset and evaluate on OpenMIC (multi-instrument recognition), Ballroom (dance genre classification), and FMA (genre classification).
- Compare different training recipes: loss-weighting mechanisms across workers, the impact of including or excluding individual workers (pretext tasks), and data efficiency.
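
To make the weighting and worker-ablation ideas concrete, here is a minimal sketch of combining per-worker losses into one training objective. It is an illustration only, not the paper's actual weighting scheme: the function name `combine_worker_losses`, the worker names, and the loss values are all hypothetical.

```python
from typing import Dict, Optional


def combine_worker_losses(
    losses: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return a weighted average of per-worker losses.

    With weights=None every worker counts equally. A worker with weight 0
    is effectively dropped, which mimics an ablation over workers.
    """
    if not losses:
        raise ValueError("need at least one worker loss")
    if weights is None:
        weights = {name: 1.0 for name in losses}
    # Workers missing from `weights` are treated as having weight 0.
    total_weight = sum(weights.get(name, 0.0) for name in losses)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(
        weights.get(name, 0.0) * value for name, value in losses.items()
    )
    return weighted_sum / total_weight


# Hypothetical worker names and loss values, for illustration only.
example_losses = {"waveform": 0.8, "mfcc": 0.4, "tempogram": 0.9}

# Equal weighting across all workers.
equal = combine_worker_losses(example_losses)

# Ablation: drop the (hypothetical) tempogram worker by giving it weight 0.
ablated = combine_worker_losses(
    example_losses, {"waveform": 1.0, "mfcc": 1.0, "tempogram": 0.0}
)
```

In a real training loop the loss values would be tensors rather than floats, and the weights could be fixed, scheduled, or learned. This sketch only shows the bookkeeping.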
Thoughts
- It would be interesting to see how PASE-style multi-task representation learners compare with contrastive methods for general-purpose audio representation, as shown in ‣.