Boosting Model Accuracy: Techniques I Learned During My Machine Learning Thesis at Spotify (+Code Snippets)

In 2021, I spent 8 months building a predictive model to measure user satisfaction as part of my thesis at Spotify.

<img alt="" src="https://miro.medium.com/v2/resize:fit:700/1*SUiVM45BO_U51aRFrOJ1pw.jpeg" style="height:549px; width:700px" />
Image by Author

My goal was to understand what made users satisfied with their music experience. To do so, I built a LightGBM classifier whose output was a binary response:

y = 1 → the user is seemingly satisfied
y = 0 → not so much

Predicting human satisfaction is a challenge because humans are by definition unsatisfied. Even a machine is poorly equipped to decipher the mysteries of the human psyche. So naturally my model was as confused as one can be.

<h2>From Human Predictor to Fortune Teller</h2>

My accuracy score was around 0.5, which is about the worst result a binary classifier can give you. It means the model is right only about half the time, which (assuming roughly balanced classes) is no better than flipping a coin.
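
To make the "0.5 is as good as random" point concrete, here is a minimal sketch in plain Python. It is not the thesis code: the labels are made up, the classes are assumed to be perfectly balanced, and the helper names `accuracy` and `majority_baseline_accuracy` are hypothetical. It compares a random guesser with the trivial strategy of always predicting the most frequent class.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy of always predicting the most frequent class."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Made-up, perfectly balanced labels (an assumption for illustration):
# 1 = seemingly satisfied, 0 = not so much.
rng = random.Random(42)
y_true = [1] * 5000 + [0] * 5000
rng.shuffle(y_true)

# A "model" that knows nothing and guesses at random.
y_coin_flip = [rng.randint(0, 1) for _ in y_true]

print(f"coin-flip accuracy: {accuracy(y_true, y_coin_flip):.3f}")
print(f"majority baseline:  {majority_baseline_accuracy(y_true):.3f}")
```

Both numbers land at or near 0.5 on balanced data, so a classifier scoring around 0.5 has learned essentially nothing. If the classes were imbalanced, the majority baseline would be higher than 0.5, which is why it is worth checking that baseline before reading too much into a raw accuracy score.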