<h1>Your Features Are Important? It Doesn’t Mean They Are Good</h1>
<p>The concept of “feature importance” is widely used in machine learning as the most basic type of model explainability. For example, Recursive Feature Elimination (RFE) uses it to iteratively drop the least important feature of the model.</p>
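<p>As a minimal sketch of how RFE works, here is an example with scikit-learn (the dataset and estimator are illustrative choices, not taken from the article): at each round, the model is refit and the feature with the lowest importance is dropped, until the requested number of features remains.</p>

```python
# Sketch of Recursive Feature Elimination: iteratively refit the model
# and drop the least important feature until 3 features remain.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic dataset: 8 features, only 3 of which are informative
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, random_state=0
)

rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
rfe.fit(X, y)

print(rfe.support_)   # boolean mask of the 3 surviving features
print(rfe.ranking_)   # rank 1 = kept; higher = eliminated earlier
```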
<p>However, there is a misconception about it.</p>
<blockquote>
<p><strong>The fact that a feature is important doesn’t imply that it is beneficial for the model!</strong></p>
</blockquote>
<p>Indeed, when we say that a feature is important, this simply means that the feature makes a high contribution to the predictions of the model. But we should keep in mind that <strong>such a contribution may be wrong</strong>.</p>
<p>Take a simple example: a data scientist accidentally leaves the Customer ID among the model’s features. The model uses Customer ID as a highly predictive feature. As a consequence, this feature will have a high feature importance even though it actually worsens the model, because it cannot generalize to unseen data.</p>
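<p>The pitfall can be reproduced in a few lines (a hypothetical toy setup, not the article’s code): a decision tree memorizes a unique-per-row <code>customer_id</code> column, so the ID dominates feature importance while accuracy on unseen data stays at chance level.</p>

```python
# Toy illustration of the Customer ID pitfall: the labels are pure
# noise, yet the tree memorizes the unique "customer_id" column.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400
customer_id = np.arange(n)        # unique identifier: pure leakage
noise = rng.integers(0, 2, n)     # uninformative binary feature
X = np.column_stack([customer_id, noise])
y = rng.integers(0, 2, n)         # labels unrelated to both features

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

print(tree.feature_importances_)  # customer_id dominates
print(tree.score(X_tr, y_tr))     # ~1.0: the IDs are memorized
print(tree.score(X_te, y_te))     # ~0.5: chance level on unseen data
```

<p>The ID gets a high importance score precisely because the tree leans on it, yet it is the one feature that is guaranteed to hurt on new customers.</p>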
<p>To make things clearer, we will need to make a distinction between two concepts:</p>
<ul>
<li><strong>Prediction Contribution</strong>: what part of the predictions is due to the feature; this is equivalent to feature importance.</li>
<li><strong>Error Contribution</strong>: what part of the prediction errors is due to the presence of the feature in the model.</li>
</ul>
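<p>To make the two definitions concrete, here is a sketch on a toy linear model, where each feature’s contribution to a prediction is exactly its coefficient times the (centered) feature value, so the prediction decomposes additively into a baseline plus per-feature terms, much like SHAP values. The formulas below are one plausible formalization, not necessarily the article’s exact definitions: Prediction Contribution as the mean absolute contribution, and Error Contribution as the change in mean absolute error caused by the feature’s term.</p>

```python
# Toy additive model: one informative feature, one pure-noise feature.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x_signal = rng.normal(size=n)     # informative feature
x_noise = rng.normal(size=n)      # useless feature
y = 3.0 * x_signal + rng.normal(scale=0.5, size=n)

# Fit a linear model; its prediction splits into baseline + per-feature terms
X = np.column_stack([np.ones(n), x_signal, x_noise])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
contrib = {
    "signal": coef[1] * (x_signal - x_signal.mean()),
    "noise": coef[2] * (x_noise - x_noise.mean()),
}
y_pred = X @ coef

results = {}
for name, c in contrib.items():
    # Prediction Contribution: average size of the feature's term
    pred_contribution = np.abs(c).mean()
    # Error Contribution: how much the feature's term changes the MAE.
    # Negative = the feature reduces error; positive = it adds error.
    err_contribution = (np.abs(y - y_pred).mean()
                        - np.abs(y - (y_pred - c)).mean())
    results[name] = (pred_contribution, err_contribution)
    print(name, results[name])
```

<p>On this data the informative feature has both a large Prediction Contribution and a strongly negative Error Contribution (removing its term would inflate the error), while the noise feature scores near zero on both: the distinction the article builds on.</p>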
<p>In this article, we will see how to calculate these quantities and how to use them to get valuable insights about a predictive model (and to improve it).</p>
<p><a href="https://towardsdatascience.com/your-features-are-important-it-doesnt-mean-they-are-good-ff468ae2e3d4">Website</a></p>