Building a Random Forest by Hand in Python

From <a href="https://www.sciencedirect.com/science/article/abs/pii/S0957417416306819" rel="noopener ugc nofollow" target="_blank">drug discovery</a> to <a href="https://www.mdpi.com/2072-4292/4/9/2661" rel="noopener ugc nofollow" target="_blank">species classification</a>, <a href="https://journals.sagepub.com/doi/abs/10.1177/2278533718765531" rel="noopener ugc nofollow" target="_blank">credit scoring</a> to <a href="https://www.sciencedirect.com/science/article/pii/S1877050916311127" rel="noopener ugc nofollow" target="_blank">cybersecurity</a> and more, the random forest is a popular and powerful algorithm for modeling our complex world. Its versatility and predictive power might seem to demand cutting-edge complexity, but if we dig into what a random forest actually is, we find a surprisingly simple set of repeating steps. I find that the best way to learn something is to play with it. So to build an intuition for how random forests work, let’s build one by hand in Python, starting with a single decision tree and expanding to the full forest. We’ll see first-hand how flexible and interpretable this algorithm is for both classification and regression. And while this project may sound complicated, there are really only two core concepts we’ll need to learn: 1) how to iteratively partition data, and 2) how to quantify how well data is partitioned.
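To make those two concepts concrete, here is a minimal sketch of a single partitioning step. It rests on a few assumptions: it scores partitions with Gini impurity, which is one common choice for classification and not necessarily the measure used later, and the helper names (`gini_impurity`, `split`, `weighted_impurity`, `best_threshold`) are illustrative only.

```python
from collections import Counter


def gini_impurity(labels):
    """Chance that two labels drawn at random (with replacement) differ."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def split(values, labels, threshold):
    """Partition labels by whether the paired feature value is <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    return left, right


def weighted_impurity(left, right):
    """Impurity of a partition: each side's impurity weighted by its share of rows."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Try every candidate threshold and keep the one with the purest partition."""
    # Skip the largest value: splitting there would leave the right side empty.
    candidates = sorted(set(values))[:-1]
    return min(
        candidates,
        key=lambda t: weighted_impurity(*split(values, labels, t)),
        default=None,  # no valid split when every value is identical
    )


# Toy data (made up for illustration): one feature, two classes.
values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]

threshold = best_threshold(values, labels)
left, right = split(values, labels, threshold)
print(threshold, gini_impurity(labels), weighted_impurity(left, right))
```

On this toy data the unsplit labels are evenly mixed, and the chosen threshold separates the two classes cleanly. A decision tree repeats this same step on each side of the split until the partitions are pure enough or some stopping rule is hit.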