Intuitions on L1 and L2 Regularisation
<p><strong>Overfitting</strong> is a phenomenon that occurs when a machine learning or statistical model is tailored to a particular dataset and is unable to generalise to other datasets. This usually happens in complex models, like deep neural networks.</p>
<p><strong>Regularisation</strong> is the process of introducing additional information in order to prevent overfitting. The focus of this article is L1 and L2 regularisation.</p>
<p>There are many explanations out there, but honestly, they are a little too abstract, and I’d probably forget them and end up visiting these pages, only to forget again. In this article, I will be sharing with you some intuitions on why L1 and L2 work, explained through <strong>gradient descent</strong>. Gradient descent is simply a method to find the ‘right’ coefficients through iterative updates using the value of the gradient. (This <a href="https://towardsdatascience.com/step-by-step-tutorial-on-linear-regression-with-stochastic-gradient-descent-1d35b088a843" rel="noopener" target="_blank">article</a> shows how gradient descent can be used in a simple linear regression.)</p>
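<p>To make the gradient-descent view concrete, here is a minimal sketch (not code from the original article) of fitting a linear regression with optional L1 and L2 penalties. The function name <code>gradient_descent</code> and the penalty strengths <code>l1</code> and <code>l2</code> are illustrative choices, not anything defined by the source; the point is only how each penalty changes the weight update.</p>
<pre><code class="language-python">import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000, l1=0.0, l2=0.0):
    """Fit y ≈ X @ w by gradient descent, with optional L1/L2 penalties on w."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        residual = X @ w - y
        grad = X.T @ residual / n      # gradient of the squared-error loss
        grad += l1 * np.sign(w)        # L1: constant-magnitude push towards zero
        grad += 2 * l2 * w             # L2: push proportional to the weight itself
        w -= lr * grad
    return w

# Toy data: y depends only on the first of five features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=100)

print("no penalty:", gradient_descent(X, y).round(3))
print("L1 (lasso):", gradient_descent(X, y, l1=0.1).round(3))
print("L2 (ridge):", gradient_descent(X, y, l2=0.1).round(3))
</code></pre>
<p>Running a sketch like this, the L1 run tends to drive the irrelevant coefficients all the way to zero, while the L2 run merely shrinks all coefficients towards zero, which is the contrast the rest of the article builds intuition for.</p>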
<p><a href="https://towardsdatascience.com/intuitions-on-l1-and-l2-regularisation-235f2db4c261"><strong>Read the full article on Towards Data Science.</strong></a></p>