Boosting Model Accuracy: Techniques I Learned During My Machine Learning Thesis at Spotify (+Code Snippets)
<p><em>This article is one part of a two-part series documenting what I learned during my Machine Learning Thesis at Spotify. Be sure to also check out </em><a href="https://towardsdatascience.com/feature-importance-analysis-with-shap-i-learned-at-spotify-aacd769831b4" rel="noopener" target="_blank"><em>the second article on how I implemented Feature Importance in this research</em></a><em>.</em></p>
<p>In 2021, I spent 8 months building a predictive model to measure <em>user satisfaction</em> as part of my Thesis at Spotify.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:700/1*SUiVM45BO_U51aRFrOJ1pw.jpeg" style="height:549px; width:700px" /></p>
<p>Image by Author</p>
<p>My goal was to understand what made users satisfied with their music experience. To do so, I built a LightGBM classifier whose output was a binary response:<br />
<em>y = 1 → the user is seemingly satisfied<br />
y = 0 → not so much</em></p>
<p>Predicting human satisfaction is a challenge because humans are by definition unsatisfied. Even a machine is poorly equipped to decipher the mysteries of the human psyche. So naturally my model was as confused as one can be.</p>
<h2>From Human Predictor to Fortune Teller</h2>
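<p>My accuracy score was around 0.5, which is effectively the worst outcome you can get from a binary classifier on balanced classes. It means the model is right only about half the time, which is no better than flipping a coin. (A score well below 0.5 would actually carry more signal, since you could simply invert the predictions.)</p>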
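<p>To see why 0.5 is the floor rather than 0, here is a minimal sketch in plain Python. It is not the thesis code: the labels are hypothetical, the classes are assumed to be perfectly balanced, and the helper names (<code>accuracy</code>, <code>coin_flip_baseline</code>, <code>flip</code>) are made up for illustration.</p>

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def coin_flip_baseline(n_samples, seed=0):
    """Accuracy of uniformly random guesses on balanced binary labels."""
    rng = random.Random(seed)
    # Hypothetical balanced labels: half satisfied (1), half not (0).
    y_true = [i % 2 for i in range(n_samples)]
    # A "model" that guesses 0 or 1 with equal probability.
    y_pred = [rng.randint(0, 1) for _ in range(n_samples)]
    return accuracy(y_true, y_pred)


def flip(y_pred):
    """Invert every binary prediction (0 becomes 1, 1 becomes 0)."""
    return [1 - p for p in y_pred]


# Random guessing lands close to 0.5 on balanced classes.
baseline = coin_flip_baseline(10_000)
print(f"coin-flip accuracy: {baseline:.3f}")

# A classifier that is wrong most of the time still carries signal:
# inverting its predictions turns accuracy a into 1 - a.
y_true = [1, 0, 1, 1, 0]
bad_pred = [0, 1, 0, 0, 0]
print(f"bad model: {accuracy(y_true, bad_pred):.1f}")
print(f"bad model, flipped: {accuracy(y_true, flip(bad_pred)):.1f}")
```

<p>The second half shows why a score near 0.5 is worse than a score near 0: a model that is consistently wrong can be inverted into a good one, while a coin flip cannot. On imbalanced classes the comparison point changes, because always predicting the majority class already scores above 0.5.</p>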