Filtering, Grouping & Aggregation
Unlock the power of 'Group By'. Learn to filter complex datasets and perform powerful aggregations like SQL.
Unlock the power of 'Group By'. Learn to filter complex datasets and perform powerful aggregations like SQL. This hands-on tutorial focuses on practical implementation of filtering, grouping & aggregation concepts.
Module 6: Filtering, Grouping & Aggregation
This is where Pandas shines. You can effortlessly slice millions of rows based on complex conditions and summarize them with groupby—just like SQL but easier.
Lesson 13: Conditional Filtering
Filtering rows based on values.
Boolean Selection
# Select rows where Age > 25
adults = df[df['Age'] > 25]
# Multiple conditions (Use parentheses & bitwise operators!)
# AND: &, OR: |
target = df[(df['Age'] > 25) & (df['City'] == 'Paris')]
The .isin() Method
Great for checking against a list.
df[df['Status'].isin(['Active', 'Pending'])]
Lesson 14: GroupBy Deep Dive
groupby() involves three steps: Split, Apply, Combine.
- Split data into groups based on criteria.
- Apply a function (sum, mean, count) to each group.
- Combine the results.
# Total sales per region
sales_by_region = df.groupby('Region')['Sales'].sum()
Multiple Aggregations: .agg()
You can calculate multiple stats at once!
df.groupby('Region')['Sales'].agg(['sum', 'mean', 'max'])
Mini Project: Sales Analyst
Challenge:
- Create a DataFrame having
Category('Electronics', 'Clothing', 'Electronics'), andRevenue(1000, 500, 1200). - Filter to show only 'Electronics'.
- Use
groupbyto calculate the total Revenue for each Category.
Quiz
Question 1 of 5Which syntax is correct for filtering with multiple conditions?
Key Takeaways
✅ Boolean masking (&, |) is the standard way to filter.
✅ groupby() is powerful: think "Split-Apply-Combine".
✅ Use .agg() to get multiple summary statistics at once.