Tag: Transformers

Understanding Transformers: A Step-by-Step Math Example — Part 1

I understand that the transformer architecture may seem scary, and you might have encountered various explanations on YouTube or in blogs. However, in my blog, I will make an effort to clarify it by providing a comprehensive numerical example. By doing so, I hope to simplify the understanding of the...

Vision Transformers vs. Convolutional Neural Networks

This blog post is inspired by the paper titled AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE from Google’s research team. The paper proposes using a pure Transformer applied directly to image patches for image classification tasks. The Vision Transformer ...
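
As a quick illustration of the patching idea (a minimal PyTorch sketch assuming 224x224 inputs, not the paper's actual code), an image can be cut into non-overlapping 16x16 patches and flattened into a sequence of "visual words" for the Transformer:

```python
import torch

# Split a batch of 224x224 RGB images into non-overlapping 16x16 patches,
# then flatten each patch into a vector, as in ViT-style tokenization.
images = torch.randn(1, 3, 224, 224)  # (batch, channels, height, width)
patch = 16

# unfold height and width into a 14x14 grid of 16x16 patches
patches = images.unfold(2, patch, patch).unfold(3, patch, patch)
# shape is now (1, 3, 14, 14, 16, 16); reorder and flatten to
# (batch, num_patches, patch_dim)
patches = patches.permute(0, 2, 3, 4, 5, 1).reshape(1, 14 * 14, patch * patch * 3)

print(patches.shape)  # torch.Size([1, 196, 768]): 196 tokens of dimension 768
```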

T5: Text-to-Text Transformers (Part One)

The transfer learning paradigm comprises two main stages. First, we pre-train a deep neural network over a bunch of data. Then, we fine-tune this model (i.e., train it some more) over a more specific, downstream dataset. The exact implementation of these stages may take many different forms. I...
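
To make the two stages concrete, here is a toy sketch (plain PyTorch on synthetic data standing in for real corpora; this is not T5's objective or code): the same weights are trained first on plentiful generic data, then trained further on a smaller task-specific set, typically with a lower learning rate:

```python
import torch
import torch.nn as nn

# Toy two-stage recipe: pre-train, then fine-tune the SAME weights.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
loss_fn = nn.MSELoss()

def train(model, xs, ys, steps, lr):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(xs), ys)
        loss.backward()
        opt.step()

# Stage 1: "pre-train" on a large generic dataset.
x_pre, y_pre = torch.randn(1024, 16), torch.randn(1024, 1)
train(model, x_pre, y_pre, steps=200, lr=1e-3)

# Stage 2: fine-tune on a small downstream dataset at a lower learning rate.
x_ft, y_ft = torch.randn(64, 16), torch.randn(64, 1)
train(model, x_ft, y_ft, steps=50, lr=1e-4)
```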

Using Transformers to Cut Waste and Put Smiles on Our Customers’ Faces!

Transformer models have revolutionized natural language processing with their human-like text generation capabilities. At Picnic, we’re using these models for demand forecasting to minimize waste, increase customer satisfaction, and support sustainability. We even experimented with using trans...

Transformers Well Explained: Masking

Masking is simply the act of hiding a word and asking the model to predict it, as in “Being strong is <mask> what matters”. It is a different kind of task, one that in theory forces the model to produce embeddings with richer contextual semantics. When we ask a model to pred...
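
The excerpt's example can be reproduced directly with the Hugging Face transformers library (a short sketch, assuming the library is installed; roberta-base happens to use the same <mask> token as the example above):

```python
from transformers import pipeline

# Load a pre-trained masked language model and ask it to fill in the blank.
unmasker = pipeline("fill-mask", model="roberta-base")

# Each prediction is a dict with the candidate token and its probability.
for prediction in unmasker("Being strong is <mask> what matters."):
    print(f"{prediction['token_str']!r}  score={prediction['score']:.3f}")
```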