Understanding Large Language Models: The Physics of (Chat)GPT and BERT

<p>ChatGPT, or more broadly Large Language Models (LLMs), have become ubiquitous in our lives. Yet, the mathematics and internal structure of LLMs remain obscure to the general public.</p>

<p>So, how can we move beyond perceiving LLMs like ChatGPT as magical black boxes? Physics may provide an answer.</p>

<p>Everyone is somewhat familiar with our physical world. Objects such as cars, tables, and planets are composed of trillions of atoms, governed by a simple set of physical laws. Similarly, complex "organisms" like ChatGPT have emerged from simple building blocks, capable of generating highly sophisticated concepts like art and science.</p>

<p>It turns out that the equations behind the building blocks of LLMs are analogous to our physical laws. By understanding how complexity arises from simple physical laws, we might glean some insight into how and why LLMs work.</p>

<p><a href="https://towardsdatascience.com/understanding-large-language-models-the-physics-of-chat-gpt-and-bert-ea512bcc6a64"><strong>Read More</strong></a></p>
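<p>To make the "building block" concrete, below is a minimal sketch of scaled dot-product self-attention, the core equation of both GPT and BERT (Vaswani et al., 2017, "Attention Is All You Need"). Reading the softmax weights as pairwise interaction strengths between tokens is one way to see the physics analogy; the variable names and dimensions here are illustrative, not taken from the linked article.</p>

<pre><code>
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention (Vaswani et al., 2017).

    Each token's output is a weighted sum over all tokens; the softmax
    weights play a role loosely analogous to pairwise interaction
    strengths between particles.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv   # queries, keys, values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # pairwise "interaction" scores
    # Softmax over tokens (numerically stabilized).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                 # each token aggregates the others

# Toy example: 4 tokens with 8-dimensional embeddings (illustrative sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
</code></pre>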
Tags: ChatGPT