Understanding Entropy made me a better data scientist
<p>I remember, several years ago, when I was moving my career from finance into data science, being fascinated by how the book <a href="https://www.amazon.es/Data-Science-Business-Data-Analytic-Thinking-ebook/dp/B00E6EQ3X4/ref=sr_1_1?__mk_es_ES=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=3T07GVD9ETC2W&keywords=data+science+for+business&qid=1688363602&sprefix=data+science+for+business%2Caps%2C93&sr=8-1" rel="noopener ugc nofollow" target="_blank"><em>Data Science for Business (Provost & Fawcett)</em></a> introduced the concept of <strong><em>Entropy </em></strong>in its classification examples: so elegant, so powerful, yet so simple. What the authors explained was nothing new to me, since I had learned about machine learning and data science well before reading that book, yet that specific approach changed my whole interpretation of the subject. I always thought it was something truly beautiful to write about, hence this small article! Let’s do it!</p>
<p>Entropy is a crucial concept in data science.</p>
<ul>
<li>It <strong>quantifies data uncertainty</strong>, helping you understand how much additional information is required for more accurate predictions.</li>
<li>It <strong>guides feature selection</strong> in machine learning by determining the <em>information gain</em> of each feature, and thus its importance.</li>
<li>It is fundamental in decision tree algorithms, shaping the tree structure by <strong>prioritizing the most informative features</strong>.</li>
<li>It <strong>measures dataset impurity</strong>, reflecting how mixed the classes in the data are.</li>
<li>By supporting effective feature selection, it <strong>can help prevent overfitting</strong>, which tends to lead to more robust models.</li>
</ul>
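<p>To make the points above a bit more concrete, here is a minimal Python sketch of Shannon entropy and information gain on a tiny made-up dataset. The function names <code>entropy</code> and <code>information_gain</code>, and the toy labels and features, are my own assumptions for illustration; they do not come from the book or from any particular library.</p>

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    # Sum p * log2(1 / p) over the proportion p of each class present.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(labels).values()
    )


def information_gain(labels, feature_values):
    """Entropy of the labels minus the weighted entropy after
    grouping the rows by the value of one feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Hypothetical toy data: four rows, two classes, two candidate features.
labels = ["yes", "yes", "no", "no"]
feature_a = ["x", "x", "y", "y"]  # separates the classes perfectly
feature_b = ["x", "y", "x", "y"]  # tells us nothing about the class

print(entropy(labels))                      # 1.0 bit: a 50/50 mix
print(information_gain(labels, feature_a))  # 1.0: all uncertainty removed
print(information_gain(labels, feature_b))  # 0.0: no uncertainty removed
```

<p>In this sketch, a perfectly mixed set of labels has the maximum impurity for two classes (1 bit), and the feature with the higher information gain is the one a decision tree would prefer to split on first.</p>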
<p>If you’re more of a visual person and would prefer to watch some YouTube videos, I’d be happy to refer you to two of my favorites about Entropy, each approaching it from a very different angle:</p>
<p><a href="https://medium.com/@gabrielpierobon/understanding-entropy-made-me-a-better-data-scientist-3196a3ff6ab4"><strong>Website</strong></a></p>