15 Essential Python Pandas Code Snippets for Data Scientists

Pandas is a core Python library for data manipulation and analysis, and a staple of most data science work. In this article, we’ll walk through 15 essential Pandas code snippets that every data scientist should have in their toolkit. Each one covers a common task, from filtering and grouping to reshaping and encoding, so you can spend less time on boilerplate and more on the analysis itself.

1. Filtering Data

import pandas as pd

# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40]}

df = pd.DataFrame(data)

# Filter rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
print(filtered_df)
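
Real filters often combine several conditions. The sketch below rebuilds the same sample DataFrame and shows three common patterns; the variable names (between, named, via_query) are illustrative and not part of the original snippet.

```python
import pandas as pd

# Same sample data as the snippet above
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40]})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
between = df[(df['Age'] > 25) & (df['Age'] < 40)]

# Keep rows whose value appears in a list
named = df[df['Name'].isin(['Alice', 'David'])]

# The same range filter written as a query string
via_query = df.query('Age > 25 and Age < 40')

print(between)
print(named)
```

Boolean masks and query() select the same rows here; which one reads better is largely a matter of taste.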

2. Grouping and Aggregating Data

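# Group by a column and calculate the mean age of each group
# (select the numeric column first; taking the mean of the text column 'Name' raises an error)
grouped = df.groupby('Name')['Age'].mean()
print(grouped)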
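
Grouping is easier to see when groups contain more than one row. Here is a minimal sketch on hypothetical data: the 'Team' and 'Score' columns and the output names mean_score and max_score are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not part of the DataFrame used above
scores = pd.DataFrame({
    'Team': ['Red', 'Red', 'Blue', 'Blue'],
    'Score': [10, 20, 30, 50],
})

# Named aggregation: each keyword becomes an output column,
# defined by a (source column, aggregation function) pair
summary = scores.groupby('Team').agg(
    mean_score=('Score', 'mean'),
    max_score=('Score', 'max'),
)
print(summary)
```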

3. Handling Missing Data

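# Count missing values in each column
missing_values = df.isnull().sum()
print(missing_values)

# Fill missing values in a column with a specific value
# (assign the result back; fillna(inplace=True) on a selected column may not update df)
df['Age'] = df['Age'].fillna(0)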
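
The sample DataFrame has no gaps, so here is a small sketch with one missing age to show the effect. The people DataFrame and the choice of filling with the column mean are assumptions for illustration; the right strategy depends on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age
people = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                       'Age': [25, np.nan, 35]})

# Count missing values per column
missing_counts = people.isnull().sum()

# Option 1: fill the gap with the mean of the known ages
filled = people.assign(Age=people['Age'].fillna(people['Age'].mean()))

# Option 2: drop rows where Age is missing
dropped = people.dropna(subset=['Age'])

print(missing_counts)
print(filled)
print(dropped)
```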

4. Applying Functions to Columns

# Applying a custom function to a column
df['Age'] = df['Age'].apply(lambda x: x * 2)
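
For simple arithmetic like this, a vectorized expression gives the same result as apply and is usually faster on large columns. A quick sketch, using a hypothetical ages DataFrame so the original df is left alone:

```python
import pandas as pd

# Hypothetical stand-in for the 'Age' column
ages = pd.DataFrame({'Age': [25, 30, 35, 40]})

# Element-by-element with apply
doubled_apply = ages['Age'].apply(lambda x: x * 2)

# Vectorized: the operation runs on the whole column at once
doubled_vectorized = ages['Age'] * 2

print(doubled_apply.equals(doubled_vectorized))
```

Reserve apply for logic that cannot be expressed with built-in column operations.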

5. Concatenating DataFrames

# Concatenate two DataFrames
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

result = pd.concat([df1, df2], ignore_index=True)
print(result)

6. Merging DataFrames

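# Merge two DataFrames on a shared key
# (an inner join keeps only keys found in both; the overlapping 'value' columns become value_x and value_y)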
left = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'value': [4, 5, 6]})

merged = pd.merge(left, right, on='key', how='inner')
print(merged)
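
An inner join silently drops keys that appear on only one side. As a sketch of an alternative, an outer join keeps every key, and the indicator option records where each row came from. The suffix names '_left' and '_right' are arbitrary choices for this example.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'value': [4, 5, 6]})

# Outer join: keep all keys from both sides, filling gaps with NaN.
# suffixes renames the overlapping 'value' columns;
# indicator=True adds a '_merge' column: left_only, right_only, or both.
outer = pd.merge(left, right, on='key', how='outer',
                 suffixes=('_left', '_right'), indicator=True)
print(outer)
```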

7. Pivot Tables

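# Creating a pivot table
# (assumes df also has a numeric 'Value' column, which the sample DataFrame above does not include)
pivot_table = df.pivot_table(index='Name', columns='Age', values='Value')
print(pivot_table)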
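
The snippet above needs a 'Value' column to run, so here is a self-contained sketch on hypothetical sales data. The 'Region', 'Quarter', and 'Revenue' columns are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format sales data
sales = pd.DataFrame({
    'Region': ['East', 'East', 'West', 'West'],
    'Quarter': ['Q1', 'Q2', 'Q1', 'Q2'],
    'Revenue': [100, 150, 200, 250],
})

# One row per region, one column per quarter, cells hold summed revenue
pivot = sales.pivot_table(index='Region', columns='Quarter',
                          values='Revenue', aggfunc='sum')
print(pivot)
```

pivot_table defaults to the mean when aggfunc is omitted; passing it explicitly makes the intent clear when several rows share the same index and column values.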

8. Handling DateTime Data

# Converting a column to DateTime
# (assumes df has a 'Date' column holding date strings)
df['Date'] = pd.to_datetime(df['Date'])
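
Once a column is a true datetime, its parts are available through the .dt accessor. The sketch below uses a hypothetical events DataFrame and includes one bad value to show how errors='coerce' turns unparseable entries into NaT instead of raising.

```python
import pandas as pd

# Hypothetical data: two valid dates and one invalid entry
events = pd.DataFrame({'Date': ['2024-01-15', '2024-02-20', 'not a date']})

# Parse with an explicit format; invalid entries become NaT
events['Date'] = pd.to_datetime(events['Date'], format='%Y-%m-%d',
                                errors='coerce')

# Extract components with the .dt accessor
events['Year'] = events['Date'].dt.year
events['Month'] = events['Date'].dt.month

print(events)
```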

9. Reshaping Data

# Melting a DataFrame from wide to long format
# (assumes df has value columns 'A' and 'B' alongside 'Name')
melted_df = pd.melt(df, id_vars=['Name'], value_vars=['A', 'B'])
print(melted_df)
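
To see what melt actually does, here is a runnable sketch on a small hypothetical wide table. The output column names 'Metric' and 'Score' are illustrative; without var_name and value_name, Pandas uses 'variable' and 'value'.

```python
import pandas as pd

# Hypothetical wide table: one row per person, one column per measurement
wide = pd.DataFrame({'Name': ['Alice', 'Bob'],
                     'A': [1, 2],
                     'B': [3, 4]})

# Wide to long: each (Name, column) pair becomes its own row
long_df = pd.melt(wide, id_vars=['Name'], value_vars=['A', 'B'],
                  var_name='Metric', value_name='Score')
print(long_df)
```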

10. Working with Categorical Data

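# Encoding categorical variables as integer codes
# (assumes df has a text column 'Category'; the second line overwrites the labels with their codes)
df['Category'] = df['Category'].astype('category')
df['Category'] = df['Category'].cat.codes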
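
Overwriting the column discards the original labels. One possible variation, sketched on a hypothetical items DataFrame, writes the codes to a separate 'Code' column and keeps a lookup from code back to label.

```python
import pandas as pd

# Hypothetical data with a text category column
items = pd.DataFrame({'Category': ['low', 'high', 'medium', 'low']})

# Convert to the category dtype; categories are sorted alphabetically by default
items['Category'] = items['Category'].astype('category')

# Store the integer codes in a new column so the labels stay available
items['Code'] = items['Category'].cat.codes

# Lookup table from code back to label
code_to_label = dict(enumerate(items['Category'].cat.categories))

print(items)
print(code_to_label)
```

Note that the codes follow alphabetical order here (high, low, medium), not the logical order of the labels. If order matters, define an ordered categorical explicitly.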
