Automated Feature Selection for Machine Learning in Python
<p>Feature selection is the process of identifying the most important and informative features within a dataset. It is one of the most important steps of the machine learning modeling pipeline, since it has a significant impact on a model's performance and predictive power.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:694/1*9SWf1yNp5a8aQJ8iz8hstg.png" style="height:396px; width:694px" /></p>
<p>Simple Visualization of Feature Selection</p>
<p><strong>Benefits of performing feature selection:</strong></p>
<ul>
<li>Improved model performance and reduced complexity (mitigating the <em>curse of dimensionality</em>)</li>
<li>Reduced training time</li>
<li>Diminished risk of overfitting caused by uninformative and redundant features</li>
<li>Simplified deployment processes and live data pipelines, an often underestimated advantage</li>
</ul>
<p>Despite its numerous benefits, feature selection is often overlooked during machine learning model development because of tight deadlines in real-life projects and an underestimation of its effect on performance. In addition, there are a large number of feature selection methods, and not knowing or being unable to decide which one(s) to use may lead to this crucial step being omitted.</p>
<p>A potential solution that overcomes these time-related challenges and leverages the positive impact of feature selection on machine learning models is to implement reusable, partially or fully automated code.</p>
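<p>To make the idea of reusable, automated selection code concrete, here is a minimal sketch. It assumes scikit-learn and NumPy are available and chains two common filters: a variance filter followed by a univariate top-k filter. The helper names <code>build_selector</code> and <code>selected_indices</code>, the choice of <code>f_classif</code> as the scoring function, and the synthetic dataset are illustrative assumptions, not the method of any particular project.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.pipeline import Pipeline


def build_selector(k=5, variance_threshold=0.0):
    """Return a reusable two-step selector (hypothetical helper for illustration)."""
    return Pipeline(
        steps=[
            # Step 1: drop features whose variance does not exceed the threshold
            # (with the default of 0.0, this removes constant columns).
            ("drop_low_variance", VarianceThreshold(threshold=variance_threshold)),
            # Step 2: keep the k features with the highest ANOVA F-score.
            ("top_k", SelectKBest(score_func=f_classif, k=k)),
        ]
    )


def selected_indices(selector, n_features):
    """Map the surviving columns back to their positions in the original matrix."""
    idx = np.arange(n_features)
    for _, step in selector.steps:
        # Each fitted step exposes a boolean mask over the columns it received.
        idx = idx[step.get_support()]
    return idx


# Synthetic data, for illustration only: 20 features, 5 of them informative.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2, random_state=0
)
# Append a constant column so the variance filter has something to remove.
X = np.hstack([X, np.ones((X.shape[0], 1))])

selector = build_selector(k=5)
X_reduced = selector.fit_transform(X, y)
kept = selected_indices(selector, X.shape[1])

print("Reduced shape:", X_reduced.shape)
print("Kept column indices:", kept)
```

<p>Because the selector is an ordinary scikit-learn <code>Pipeline</code>, the same fitted object can be reused on new data with <code>selector.transform</code>, or placed in front of a model as one more pipeline step. The filters and the value of <code>k</code> shown here are placeholders and should be chosen to suit the dataset at hand.</p>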
<p><a href="https://python.plainenglish.io/automated-feature-selection-for-machine-learning-in-python-2ad4bcfac19a">Website</a></p>