Automated Feature Selection for Machine Learning in Python

<p>Feature selection is the process of identifying the most important and informative features within a dataset. It is one of the most important steps in a machine learning modeling pipeline, since it has a significant impact on model performance and predictive power.</p> <p><img alt="" src="https://miro.medium.com/v2/resize:fit:694/1*9SWf1yNp5a8aQJ8iz8hstg.png" style="height:396px; width:694px" /></p> <p>Simple Visualization of Feature Selection</p> <p><strong>Benefits of performing feature selection:</strong></p> <ul> <li>Improved model performance &amp; reduced complexity (mitigating the <em>Curse of Dimensionality</em>)</li> <li>Reduced training time</li> <li>Lower risk of overfitting caused by uninformative &amp; redundant features</li> <li>Simplified deployment processes and live data pipelines, an often underestimated advantage</li> </ul> <p>Despite its numerous benefits, feature selection is often overlooked during machine learning model development, both because of tight deadlines in real-life projects and because its effect on performance is underestimated. There are also a large number of feature selection methods, and not knowing which one(s) to use may lead to this crucial step being skipped altogether.</p> <p>One way to overcome these time-related challenges and still benefit from feature selection is to implement reusable, partially or fully automated code.</p> <p><a href="https://python.plainenglish.io/automated-feature-selection-for-machine-learning-in-python-2ad4bcfac19a">Website</a></p>
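<p>As a rough illustration of what such reusable code could look like, here is a minimal sketch that chains two simple filter methods with scikit-learn. It is not the article's own implementation: the helper names <code>build_selector</code> and <code>selected_indices</code>, the choice of a variance filter followed by an ANOVA F-test, and the synthetic dataset are all assumptions made for this example.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.pipeline import Pipeline


def build_selector(k=5, variance_threshold=0.0):
    """Hypothetical helper: chain two simple filter-based selection steps."""
    return Pipeline([
        # Step 1: drop features whose variance does not exceed the threshold
        # (with the default of 0.0 this removes constant columns).
        ("low_variance", VarianceThreshold(threshold=variance_threshold)),
        # Step 2: keep the k features with the highest ANOVA F-score.
        ("top_k", SelectKBest(score_func=f_classif, k=k)),
    ])


def selected_indices(selector, n_features):
    """Map the fitted pipeline's output columns back to original column indices."""
    idx = np.arange(n_features)
    for _, step in selector.steps:
        idx = idx[step.get_support()]
    return idx


# Synthetic data for illustration only: 20 features, 5 of them informative.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2, random_state=0
)
# Append a constant column, which the variance filter should remove.
X = np.hstack([X, np.ones((X.shape[0], 1))])

selector = build_selector(k=5)
X_selected = selector.fit_transform(X, y)
kept = selected_indices(selector, X.shape[1])

print("Shape after selection:", X_selected.shape)
print("Original column indices kept:", kept)
```

<p>Because the selector is an ordinary scikit-learn <code>Pipeline</code>, the same object can be reused across datasets or placed in front of a model, so that selection is fitted only on training data. The F-test scoring used here assumes a classification target; other tasks would need a different scoring function.</p>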