<h1>A fAIry tale of the Inductive Bias</h1>
<p>As we have seen in recent years, deep learning has grown exponentially, both in adoption and in the number of models. What paved the way for this success is perhaps <a href="https://en.wikipedia.org/wiki/Transfer_learning" rel="noopener ugc nofollow" target="_blank">transfer learning</a> itself: the idea that a model can be trained on a large amount of data and then reused for a myriad of specific tasks.</p>
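<p>As a minimal sketch of this idea (assuming PyTorch and torchvision; the 10-class downstream task is hypothetical), the snippet below loads an ImageNet-pretrained ResNet-50, freezes its weights, and swaps in a new classification head to fine-tune on a specific task.</p>
<pre><code>import torch
import torchvision.models as models

# Load a ResNet-50 pre-trained on ImageNet: the "trained on a large
# amount of data" half of the transfer-learning idea.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Freeze the pre-trained weights so that only the new head is trained.
for param in backbone.parameters():
    param.requires_grad = False

# Swap the classification head for a hypothetical 10-class downstream task.
backbone.fc = torch.nn.Linear(backbone.fc.in_features, 10)

# Only the new head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
</code></pre>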
<p>In recent years, a paradigm has emerged: a <a href="https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)" rel="noopener ugc nofollow" target="_blank">transformer</a> (or a model based on it) is used for NLP applications, while for images, <a href="https://en.wikipedia.org/wiki/Vision_transformer" rel="noopener ugc nofollow" target="_blank">vision transformers</a> or <a href="https://en.wikipedia.org/wiki/Convolutional_neural_network" rel="noopener ugc nofollow" target="_blank">convolutional networks</a> are used instead.</p>
<p>See also: <a href="https://towardsdatascience.com/the-infinite-babel-library-of-llms-90e203b2f6b0?source=post_page-----d418fc61726c--------------------------------" rel="noopener follow" target="_blank">The Infinite Babel Library of LLMs</a> (open-source, data, and attention: how the future of LLMs will change).</p>
<p>See also: <a href="https://towardsdatascience.com/metas-hiera-reduce-complexity-to-increase-accuracy-30f7a147ad0b?source=post_page-----d418fc61726c--------------------------------" rel="noopener follow" target="_blank">META’s Hiera: reduce complexity to increase accuracy</a> (simplicity allows AI to reach incredible performance and surprising speed).</p>
<p>On the other hand, while we have plenty of work showing in practice that these models work well, the theoretical understanding of why has lagged behind. This is because these models are very large and general, which makes it difficult to experiment with them. The fact that <a href="https://en.wikipedia.org/wiki/Vision_transformer" rel="noopener ugc nofollow" target="_blank">Vision Transformers</a> outperform convolutional neural networks <a href="https://towardsdatascience.com/metas-hiera-reduce-complexity-to-increase-accuracy-30f7a147ad0b" rel="noopener" target="_blank">despite having, in theory, less inductive bias for vision</a> shows that there is a theoretical gap to be filled.</p>
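<p>To make the contrast concrete, here is a minimal sketch (assuming PyTorch; all shapes and hyperparameters are illustrative) of the two priors: a convolution that only mixes a local 3x3 neighborhood with weights shared across positions, versus a ViT-style patch embedding followed by self-attention, where every patch can attend to every other patch with no built-in notion of locality.</p>
<pre><code>import torch
import torch.nn as nn

x = torch.randn(1, 3, 224, 224)  # a dummy image batch

# A convolution encodes a strong inductive bias: each output value
# depends only on a local 3x3 neighborhood, and the same weights are
# shared at every spatial position (translation equivariance).
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)
local_features = conv(x)  # (1, 64, 224, 224)

# A Vision-Transformer-style block drops that prior: the image is cut
# into 16x16 patches, flattened into a sequence of tokens, and every
# patch can attend to every other patch regardless of distance.
patches = x.unfold(2, 16, 16).unfold(3, 16, 16)   # (1, 3, 14, 14, 16, 16)
patches = patches.reshape(1, 3, 14 * 14, 16 * 16) # (1, 3, 196, 256)
patches = patches.permute(0, 2, 1, 3).reshape(1, 196, 3 * 16 * 16)

embed = nn.Linear(3 * 16 * 16, 64)
tokens = embed(patches)  # (1, 196, 64): one token per patch

attention = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
global_features, _ = attention(tokens, tokens, tokens)  # every patch sees all others
</code></pre>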
<p><a href="https://towardsdatascience.com/a-fairy-tale-of-the-inductive-bias-d418fc61726c"><strong>Learn More</strong></a></p>