Boosting Model Accuracy: Techniques I Learned During My Machine Learning Thesis at Spotify (+Code)

In 2021, I spent 8 months building a predictive model to measure user satisfaction as part of my thesis at Spotify.


My goal was to understand what made users satisfied with their music experience. To do so, I built a LightGBM classifier whose output was a binary response:
y = 1 → the user is seemingly satisfied
y = 0 → not so much

Predicting human satisfaction is a challenge because humans are by definition unsatisfied. Even a machine isn’t well equipped to decipher the mysteries of the human psyche. So, naturally, my model was as confused as one can be.

From Human Predictor to Fortune Teller

My ROC score was around 0.5, which is about as bad as a binary classifier gets. It means the model ranks a satisfied user above an unsatisfied one only half the time, which is no better than flipping a coin.

So I spent 2 months trying and combining different techniques to improve my model's predictions. In the end, I was able to raise my ROC AUC score from 0.5 to 0.73, which was a big success!
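
To make the 0.5 and 0.73 figures more concrete, here is a minimal sketch of what a ROC score measures: the probability that a randomly chosen satisfied user gets a higher predicted score than a randomly chosen unsatisfied one. This is not the code from my thesis. The `roc_auc` helper, the labels, and the scores below are all made up for illustration, and in practice you would more likely call a library function than write the loop yourself.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels: 1 = seemingly satisfied, 0 = not so much.
labels = [1, 0, 1, 1, 0, 0, 1, 0]

# A model that gives every user the same score cannot rank anyone.
uninformative = [0.5] * len(labels)

# A model that scores every satisfied user above every unsatisfied one.
perfect = [0.9, 0.1, 0.8, 0.7, 0.2, 0.3, 0.6, 0.4]

# A model that gets most, but not all, pairs in the right order.
partial = [0.9, 0.6, 0.4, 0.7, 0.2, 0.5, 0.8, 0.3]

print(roc_auc(labels, uninformative))  # 0.5
print(roc_auc(labels, perfect))        # 1.0
print(roc_auc(labels, partial))        # 0.875
```

Read this way, a score of 0.5 means the model orders satisfied and unsatisfied users no better than chance, while 0.73 means it gets roughly 73% of those pairs in the right order.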
