Understanding Transformers: A Step-by-Step Math Example — Part 1
<p>The transformer architecture can look intimidating, and you may already have come across various explanations of it on YouTube or in blogs. In this post, I try to make it clearer by working through a complete numerical example, step by step.</p>
<p>Shoutout to <a href="https://www.youtube.com/@HeduMathematicsofIntelligence" rel="noopener ugc nofollow" target="_blank">HeduAI</a> for the clear explanations that helped solidify my own understanding!</p>
<p><strong>Let’s get started!</strong></p>
<h1>Inputs and Positional Encoding</h1>
<p>Let’s start with the first part, where we define our inputs and calculate the positional encoding for them.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:700/1*eBg0WY6510NaFwP94G9Zog.png" style="height:445px; width:700px" /></p>
<h2>Step 1 (Defining the data)</h2>
<p>The initial step is to define our <strong>dataset (corpus)</strong>.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:700/1*SlziEl8zomWrZbraMkWA3A.png" style="height:138px; width:700px" /></p>
<p>Our dataset contains <strong>3 sentences (dialogues)</strong> taken from the <strong>Game of Thrones</strong> TV show. The dataset is tiny, but that works in our favor: it keeps the calculations in the upcoming equations easy to follow.</p>
<h2>Step 2 (Finding the Vocab Size)</h2>
<p>To determine the vocabulary size, we need to identify the total number of unique words in our dataset. This is crucial for encoding (i.e., converting the data into numbers).</p>
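<p>Counting unique words can be sketched in a few lines of Python. Note that the sentences below are hypothetical stand-ins (the actual dataset is shown only in the image above), and the names <code>build_vocab</code> and <code>word_to_id</code>, as well as the choice to lowercase and split on whitespace, are my own assumptions for illustration.</p>

```python
# Hypothetical placeholder corpus: NOT the article's dataset,
# which appears only in an image. Swap in the real sentences.
corpus = [
    "the cat sat",
    "the dog ran",
    "a cat ran",
]


def build_vocab(sentences):
    """Return the sorted list of unique lowercase words in the sentences."""
    words = set()
    for sentence in sentences:
        # Assumption: words are separated by whitespace, case is ignored.
        words.update(sentence.lower().split())
    return sorted(words)


vocab = build_vocab(corpus)
vocab_size = len(vocab)  # number of unique words

# One simple way to "convert the data into numbers":
# give each unique word an integer id.
word_to_id = {word: idx for idx, word in enumerate(vocab)}

print("vocab size:", vocab_size)
print("word ids:", word_to_id)
```

<p>With this placeholder corpus there are 9 words in total but only 6 unique ones, so the vocabulary size is 6.</p>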