Why should we care about this paper?

Transfer learning is a promising approach in NLP and language understanding. The idea is to pretrain a model on huge amounts of data so that it learns general-purpose knowledge; this pretraining can be done in either a supervised or an unsupervised way. The model can then apply what it learned from the general task to specific downstream tasks. An example of this is GPT-2, which was pretrained on a large portion of the internet and then evaluated on specific tasks it was never explicitly trained for.
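To make the pretrain-then-transfer idea concrete, here is a minimal sketch (assuming the Hugging Face transformers library is installed; the checkpoint name and prompt are illustrative, not from the paper) of prompting a pretrained GPT-2 on a task it was never explicitly fine-tuned for:

```python
# Sketch only: assumes the Hugging Face `transformers` package is available.
from transformers import pipeline

# GPT-2 was pretrained with a plain language-modeling objective on web text;
# no task-specific fine-tuning happens here.
generator = pipeline("text-generation", model="gpt2")

# Prompt the pretrained model with a downstream-style task
# (a toy summarization-style prompt, purely for illustration).
prompt = "Article: The new library opens next week downtown.\nSummary:"
outputs = generator(prompt, max_new_tokens=20, num_return_sequences=1)
print(outputs[0]["generated_text"])
```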

This paper is both a survey of transfer learning for NLP and a proposal for a standardized way of applying it.

Big ideas:

Contributions:

a unified way to approach transfer learning (the Text-to-Text Transfer Transformer, aka T5; see the sketch below), a dataset (the Colossal Clean Crawled Corpus, C4), and pre-trained models
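A minimal sketch of the text-to-text idea, assuming the Hugging Face transformers library and its released T5 checkpoints (the checkpoint name "t5-small" and the example prompts are assumptions for illustration): every task is phrased as input text with a task prefix, and the answer comes back as output text.

```python
# Sketch only: assumes Hugging Face `transformers` with a released T5 checkpoint.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Different tasks, same model, same text-in / text-out interface.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: The quick brown fox jumped over the lazy dog repeatedly.",
    "cola sentence: The books is on the table.",  # grammatical acceptability
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```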

side comments:

Is it normal to add a technical contribution to a survey paper?

link to paper results:

https://paperswithcode.com/paper/exploring-the-limits-of-transfer-learning