This is a collection of notes from the conference, biased toward my research interests and focused on self-supervised learning, disentanglement, and discrete representations for music and speech.
I have not read the papers in much detail, so there could be (many) errors. Please feel free to correct me!
Similarity Analysis Of Self-Supervised Speech Representations
A Comparison Of Discrete Latent Variable Models For Speech Representation Learning
Multi-Task Self-Supervised Pre-Training For Music Classification
Contrastive Learning Of General-Purpose Audio Representations
Self-Supervised VQ-VAE For One-Shot Music Style Transfer
Pitch-Timbre Disentanglement Of Musical Instrument Sounds Based On VAE-Based Metric Learning
An Iterative Framework For Self-Supervised Deep Speaker Representation Learning
Contrastive Separative Coding For Self-Supervised Representation Learning