Removing Outliers: Understanding the How and What Behind the Magic

An outlier is any data point that lies at an abnormal distance from the other points in the dataset. For us humans, spotting outliers by eye is easy when there are only a few values.

Take a look at this. Can you guess which values are outliers?

[25, 26, 38, 34, 3, 33, 23, 85, 70, 28, 27]

Well, my friend, here 3, 70, and 85 are the outliers.

But consider this: as data scientists, we might have to analyze hundreds of columns containing thousands or even millions of values. You will quickly come to the conclusion that guessing by eye is just not feasible.
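
To give a feel for what replacing the eyeball test might look like, here is a minimal sketch that applies one common rule of thumb, the 1.5 x IQR fence, to the list above. This is an assumption for illustration and not necessarily the method this article goes on to use; the function name `iqr_outliers` and the multiplier `k` are hypothetical names chosen for this example.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    # Quartiles of the data; "inclusive" treats the data as the full population.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep the original order of the input values.
    return [v for v in values if v < lower or v > upper]

data = [25, 26, 38, 34, 3, 33, 23, 85, 70, 28, 27]
print(iqr_outliers(data))  # prints [3, 85, 70]
```

For this particular list the rule happens to flag the same three values we picked out by eye, and the same function could be run over every column of a large dataset without anyone having to look at the numbers.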
