A Data Scientist’s Essential Guide to Exploratory Data Analysis

<p>Exploratory Data Analysis (EDA) is the single most important task to conduct at the beginning of every data science project.</p> <p>In essence, it involves thoroughly examining and characterizing your data in order to find its underlying&nbsp;<strong>characteristics</strong>, possible&nbsp;<strong>anomalies</strong>, and hidden&nbsp;<strong>patterns</strong>&nbsp;and&nbsp;<strong>relationships</strong>.</p> <p>This understanding of your data is what will ultimately&nbsp;<strong>guide you through the following steps</strong>&nbsp;of your machine learning pipeline, from data preprocessing to model building and analysis of results.</p> <h2>The process of EDA fundamentally comprises three main tasks:</h2> <ul> <li><strong>Step 1:</strong>&nbsp;<em>Dataset Overview and Descriptive Statistics</em></li> <li><strong>Step 2:</strong>&nbsp;<em>Feature Assessment and Visualization</em>, and</li> <li><strong>Step 3:</strong>&nbsp;<em>Data Quality Evaluation</em></li> </ul> <p>As you may have guessed, each of these tasks can entail a considerable amount of analysis, which will easily have you&nbsp;<em>slicing, printing, and plotting your pandas dataframes like a madman.</em></p> <p><a href="https://towardsdatascience.com/a-data-scientists-essential-guide-to-exploratory-data-analysis-25637eee0cf6">Read More</a></p>
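<p>To make the three steps above concrete, here is a minimal pandas sketch. The toy dataframe and its column names (<code>age</code>, <code>income</code>, <code>segment</code>) are hypothetical and chosen only for illustration. A real project would go much further in each step, and the visualization part of Step 2 is only hinted at in a comment because plotting requires matplotlib.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset, invented purely for illustration.
df = pd.DataFrame({
    "age": [25, 32, 47, np.nan, 51, 32],
    "income": [40000, 52000, 88000, 61000, np.nan, 52000],
    "segment": ["a", "b", "b", "a", "c", "b"],
})

# Step 1: dataset overview and descriptive statistics.
overview = {
    "n_rows": df.shape[0],
    "n_cols": df.shape[1],
    "dtypes": df.dtypes.astype(str).to_dict(),
}
summary = df.describe(include="all")  # numeric and categorical summaries

# Step 2: feature assessment (visualization omitted here).
numeric = df.select_dtypes(include="number")
correlations = numeric.corr()                 # pairwise Pearson correlations
segment_counts = df["segment"].value_counts()  # category frequencies
# With matplotlib installed, numeric.hist() would plot the distributions.

# Step 3: data quality evaluation.
missing_per_column = df.isna().sum()           # missing values per column
n_duplicate_rows = int(df.duplicated().sum())  # fully duplicated rows

print(overview)
print(summary)
print(correlations)
print(segment_counts)
print(missing_per_column)
print("duplicate rows:", n_duplicate_rows)
```

<p>Each intermediate result is kept in its own variable so that it can be inspected or reused later, for example to decide which columns need imputation before modeling.</p>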