The Era of Large AI Models Is Over

<p>Just kidding: we all know size matters. It's certainly true for AI models, especially those trained on text data, i.e., language models (LMs). If one trend has dominated AI over the last five or six years above all others, it is the steady growth in parameter count of the best models, a pattern I've seen called <a href="https://huggingface.co/blog/large-language-models" rel="noopener ugc nofollow" target="_blank">Moore's law for large LMs</a>. The GPT family is the clearest, though not the only, embodiment of this trend: GPT-2 had 1.5 billion parameters, GPT-3 had 175 billion (roughly 100x its predecessor), and rumor has it that GPT-4, whose size remains officially undisclosed, has reached the <a href="https://www.semafor.com/article/03/24/2023/the-secret-history-of-elon-musk-sam-altman-and-openai" rel="noopener ugc nofollow" target="_blank">1 trillion mark</a>. Not quite an exponential curve, but definitely a growing one.</p>

<p>OpenAI was faithfully following the guidance of the scaling laws it <a href="https://arxiv.org/abs/2001.08361" rel="noopener ugc nofollow" target="_blank">discovered</a> in 2020 (and that DeepMind later <a href="https://arxiv.org/abs/2203.15556" rel="noopener ugc nofollow" target="_blank">refined</a> in 2022). The main takeaway is that size matters a lot. DeepMind's refinement showed that other variables, such as the amount of training data and its quality, also shape performance. But one truth we can't deny is that we love nothing more than <em>a bigger thing</em>: model size has been the gold standard for heuristically judging how good an AI system will be.</p>

<p><a href="https://medium.com/@albertoromgar/the-era-of-large-ai-models-is-over-a5c9d7d804d4"><strong>Read More</strong></a></p>
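<p>The teaser above doesn't spell out what those scaling laws say quantitatively, so here is a minimal back-of-the-envelope sketch of the trade-off the DeepMind (Chinchilla) paper describes. It leans on two commonly used approximations that do not appear in the article itself: training compute is roughly C &asymp; 6ND FLOPs for N parameters and D training tokens, and the compute-optimal recipe lands near 20 training tokens per parameter. The function name and the example budget are illustrative only.</p>

<pre>
# Rough sketch of Chinchilla-style compute-optimal sizing (illustrative, not from the article).
# Assumptions: training compute C ~= 6 * N * D FLOPs, and roughly 20 training tokens per
# parameter at the compute-optimal point, as reported by Hoffmann et al. (2022).

def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Return an approximate compute-optimal (parameter count, token count) pair."""
    tokens_per_param = 20.0  # heuristic ratio from the Chinchilla paper
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Example: a GPT-3-scale budget of ~3e23 FLOPs suggests ~50B parameters on ~1T tokens,
# i.e., a smaller model trained on far more data than GPT-3's 175B parameters / ~300B tokens.
params, tokens = chinchilla_optimal(3e23)
print(f"~{params:.1e} parameters, ~{tokens:.1e} tokens")
</pre>

<p>The point of the sketch: under these assumptions, a larger compute budget is best spent growing parameters and data together (each scales with the square root of compute) rather than parameters alone, which is exactly the shift away from pure size that the refined scaling laws triggered.</p>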