Andre Ye
1 min readSep 22, 2020

Good point! I haven't seen much few-shot learning done with BERT. That said, the GPT-3 paper was titled 'Language Models are Few-Shot Learners', and while GPT-3 differs from BERT in several ways (decoder-only, autoregressive, far larger), both are technically transformers.
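For context, the "few-shot" setup the GPT-3 paper describes means putting a handful of solved examples directly into the prompt and asking the model to complete the next one, with no gradient updates. A minimal sketch of how such a prompt is assembled (the task and examples here are illustrative, not from the paper):

```python
def build_few_shot_prompt(examples, query):
    """Format (input, label) demonstration pairs plus a new input
    as a single prompt string for an autoregressive language model."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    # The final entry leaves the label blank for the model to complete.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [
    ("A wonderful, moving film.", "positive"),
    ("Dull and far too long.", "negative"),
]
prompt = build_few_shot_prompt(demos, "I loved every minute.")
print(prompt)
```

BERT, being a bidirectional masked-language model, has no natural left-to-right completion to exploit here, which is part of why this style of prompting took off with GPT-style models.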
