1 year ago
#190241
danielkim9
Why are models such as BERT or GPT-3 considered unsupervised learning during pre-training when there is an output (label)?
I am not very experienced with unsupervised learning, but my general understanding is that in unsupervised learning the model learns without labeled outputs. However, during pre-training of models such as BERT or GPT-3, there does seem to be an output. For example, in BERT some of the tokens in the input sequence are masked, and the model then tries to predict those tokens. Since we already know what the masked tokens originally were, we can compare them with the predictions to compute a loss. Isn't this basically supervised learning?
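To make my confusion concrete, here is a rough sketch of how I understand the "labels" being created. This is just toy Python with a whitespace tokenizer and a single `[MASK]` token, not the real BERT/WordPiece masking procedure (which masks roughly 15% of tokens with some extra rules); the point is only that the targets come from the input text itself:

```python
import random

MASK = "[MASK]"

def make_mlm_example(tokens, mask_prob=0.15):
    """Derive (input, label) pairs from raw text alone.

    The labels are simply the original tokens at the masked positions,
    so no human annotation is needed.
    """
    inputs, labels = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            inputs.append(MASK)   # the model sees the mask...
            labels.append(tok)    # ...and must reconstruct the original token
        else:
            inputs.append(tok)
            labels.append(None)   # unmasked positions contribute no loss
    return inputs, labels

inputs, labels = make_mlm_example("the cat sat on the mat".split())
print(inputs)   # e.g. ['the', '[MASK]', 'sat', 'on', 'the', 'mat']
print(labels)   # e.g. [None, 'cat', None, None, None, None]
```

So there is clearly a target and a loss here, which is what makes me wonder why this is not just called supervised learning.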
machine-learning
bert-language-model
unsupervised-learning
0 Answers