GPT-4’s Secret Has Been Revealed

<p>GPT-4 was the most anticipated AI model in history.</p>
<p>Yet when OpenAI released it in March 2023, they didn&rsquo;t tell us anything about its size, training data, internal structure, or how they trained and built it. A true black box.</p>
<p>As it turns out, they didn&rsquo;t conceal those critical details because the model was too innovative or the architecture too moat-y to share. The opposite seems to be true, if we&rsquo;re to believe the latest rumors:</p>
<p>GPT-4 is, technically and scientifically speaking, hardly a breakthrough.</p>
<p>That&rsquo;s not necessarily bad &mdash; GPT-4 is, after all, the best language model in existence &mdash; just&hellip; somewhat underwhelming. Not what people were expecting after a 3-year wait.</p>
<p>This news, yet to be officially confirmed, reveals key insights about GPT-4 and OpenAI and raises questions about AI&rsquo;s true state of the art &mdash; and its future.</p>
<h1>GPT-4: A mixture of smaller models</h1>
<p>On June 20th, <a href="https://twitter.com/swyx/status/1671272883379908608" rel="noopener ugc nofollow" target="_blank">George Hotz</a>, founder of self-driving startup Comma.ai, leaked that GPT-4 isn&rsquo;t a single monolithic dense model (like GPT-3 and GPT-3.5) but a mixture of eight 220-billion-parameter models. Later that day, <a href="https://twitter.com/soumithchintala/status/1671267150101721090" rel="noopener ugc nofollow" target="_blank">Soumith Chintala</a>, co-founder of PyTorch at Meta, corroborated the leak. Just the day before, <a href="https://twitter.com/MParakhin/status/1670666605427298304" rel="noopener ugc nofollow" target="_blank">Mikhail Parakhin</a>, Microsoft Bing AI lead, had also hinted at this.</p>
<p><a href="https://albertoromgar.medium.com/gpt-4s-secret-has-been-revealed-439db1568180"><strong>Read More</strong></a></p>
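<p>The leak describes a mixture-of-experts setup: several expert models sit behind a small gating function that decides which of them handle a given input, so only a fraction of the total parameters run per query. The sketch below is a minimal, hypothetical illustration of that idea with toy linear "experts" and top-2 gating &mdash; the details of GPT-4&rsquo;s actual routing, expert design, and training are not public.</p>

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x, experts, gate_w, top_k=2):
    """Mixture-of-experts forward pass (toy illustration, not GPT-4's
    actual mechanism): score every expert, keep the top_k, and combine
    their outputs weighted by renormalized gate scores."""
    scores = softmax(gate_w @ x)                # one gate score per expert
    top = np.argsort(scores)[-top_k:]           # indices of the k highest-scoring experts
    weights = scores[top] / scores[top].sum()   # renormalize over the chosen experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: 8 "experts", each just a small random linear map
rng = np.random.default_rng(0)
d = 4
experts = [lambda x, W=rng.standard_normal((d, d)): W @ x for _ in range(8)]
gate_w = rng.standard_normal((8, d))            # gate: maps input to 8 expert scores

y = moe_forward(rng.standard_normal(d), experts, gate_w)
```

<p>The appeal of this design is exactly what the rumor implies: total capacity can be very large (here 8 experts) while each input only pays the compute cost of the few experts the gate selects.</p>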
Tags: AI GPT-4