Visualizing 3 Sklearn Cross-Validation Methods: K-Fold, Shuffle & Split, and Time Series Split
<p>Cross-validation is a statistical method for evaluating learning algorithms. The data is divided into a fixed number of folds (groups of data). In each round, the folds are assigned to one of two sets, a training set and a testing (validation) set, and the assignment rotates from round to round so that each data point gets validated.</p>
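<p>The rotation described above can be sketched in plain Python. This is only an illustration under simple assumptions (consecutive, unshuffled folds); the helper name <code>k_fold_indices</code> is hypothetical and is not part of any library.</p>

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for consecutive, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))  # the fold held out this round
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Hypothetical toy setting: 10 samples, 5 rounds.
splits = list(k_fold_indices(n_samples=10, n_splits=5))
for round_no, (train, test) in enumerate(splits, start=1):
    print(f"round {round_no}: train={train} test={test}")
```

<p>Across the five rounds, every index lands in the test set exactly once, which is what lets each data point be validated.</p>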
<p>The main purpose is to test the model's ability to predict independent data that was not used to build it. It also helps detect problems such as <a href="https://en.wikipedia.org/wiki/Overfitting" rel="noopener ugc nofollow" target="_blank">overfitting</a> or <a href="https://en.wikipedia.org/wiki/Selection_bias" rel="noopener ugc nofollow" target="_blank">selection bias</a>.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:525/1*pRe5vFBSKvTcaH0bk9Fp8A.png" style="height:391px; width:700px" /></p>
<p>An example of results from cross-validations in this article. Image by Author.</p>
<p>In this article, we will use Python to visualize how 3 cross-validation types from the <a href="https://scikit-learn.org/stable/modules/cross_validation.html" rel="noopener ugc nofollow" target="_blank">scikit-learn</a> library work:</p>
<ul>
<li>K-Fold cross-validation</li>
<li>Shuffle & Split cross-validation</li>
<li>Time Series Split cross-validation</li>
</ul>
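<p>As a rough sketch of how the three splitters differ, the snippet below prints the train and test indices each one produces. The toy data (12 samples), the number of splits, <code>test_size</code>, and <code>random_state</code> are assumptions chosen for illustration, not values taken from the article.</p>

```python
import numpy as np
from sklearn.model_selection import KFold, ShuffleSplit, TimeSeriesSplit

# Assumed toy data: 12 samples, treated as already ordered in time.
X = np.arange(12).reshape(-1, 1)

splitters = {
    "KFold": KFold(n_splits=3),
    "ShuffleSplit": ShuffleSplit(n_splits=3, test_size=0.25, random_state=0),
    "TimeSeriesSplit": TimeSeriesSplit(n_splits=3),
}

results = {}
for name, cv in splitters.items():
    # cv.split yields one (train_indices, test_indices) pair per round.
    results[name] = [(train.tolist(), test.tolist()) for train, test in cv.split(X)]
    print(name)
    for train, test in results[name]:
        print("  train:", train, "test:", test)
```

<p>With these settings, K-Fold holds out each consecutive block once, Shuffle & Split draws a random test set in every round, and Time Series Split only ever tests on samples that come after the training samples.</p>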
<p>The validation results from each round can also be plotted, which makes them easier to compare and interpret.</p>
<p><a href="https://towardsdatascience.com/visualizing-sklearn-cross-validation-k-fold-shuffle-split-and-time-series-split-a13221eb5a56">Read More</a></p>