Merging, Joining & Reshaping
Combine multiple datasets like a SQL Pro. Learn Merge, Join, Concat, Pivot, and Melt.
Combine multiple datasets like a SQL Pro. Learn Merge, Join, Concat, Pivot, and Melt. This hands-on tutorial focuses on practical implementation of merging, joining & reshaping concepts.
Module 7: Merging, Joining & Reshaping
Data often lives in multiple tables. Pandas gives you SQL-like powers to join them together and reshape the structure for analysis.
Lesson 15: Combining DataFrames
1. pd.concat()
Stacks DataFrames on top of each other (like appending rows).
df_combined = pd.concat([df1, df2])
2. pd.merge()
The most powerful join function. It links rows based on a common key (column).
how='inner': Only matches (Default).how='left': Keep all rows from left table.how='outer': Keep all rows from both tables.
Lesson 16: Reshaping Data
Sometimes you need to rotate your data from clear to wide format (Pivot) or wide to long (Melt).
1. pivot_table()
Excel users know this well. It aggregates data.
df.pivot_table(index='Date', columns='City', values='Temp', aggfunc='mean')
2. melt()
un-pivots data. Turns "wide" tables into "long" tables (better for analysis).
Practice: Data Architect
Challenge:
- Create a
customersDataFrame withIDandName. - Create a
purchasesDataFrame withCustomerIDandProduct. - Perform a Left Join to show ALL customers and their purchases (if any). Customers with no purchases should show
NaN.
Quiz
Question 1 of 5Which parameter controls the type of join (inner, left, outer)?
Key Takeaways
✅ pd.merge is your SQL JOIN equivalent.
✅ pivot_table is essential for summarizing data.
✅ concat is for sticking tables together vertically.