Simplify Your Data Preparation With These 4 Lesser-Known Scikit-Learn Classes
<p>Data preparation is famously the least-loved aspect of Data Science. If done right, however, it needn’t be such a headache.</p>
<p>While scikit-learn has fallen out of vogue as a <em>modelling</em> library in recent years given the meteoric rise of PyTorch, LightGBM, and XGBoost, it’s still easily one of the best <em>data preparation</em> libraries out there.</p>
<p>And I’m not just talking about that old chestnut: <code>train_test_split</code>. If you’re prepared to dig a little deeper, you’ll find a treasure trove of helpful tools for more advanced data preparation techniques, all of which work perfectly well alongside other libraries like <code>lightgbm</code>, <code>xgboost</code> and <code>catboost</code> for subsequent modelling.</p>
<p>In this article, I’ll walk through four scikit-learn classes that significantly speed up the data preparation workflows in my day-to-day job as a Data Scientist.</p>
<p><a href="https://towardsdatascience.com/simplify-your-data-preparation-with-these-4-lesser-known-scikit-learn-classes-70270c94569f">Read More</a></p>