Data Manipulation
Master the art of transforming data. Sort, rank, and apply custom functions to your datasets. This hands-on tutorial focuses on practical implementation of data manipulation concepts.
Module 5: Data Manipulation
Cleaning is just the start. "Manipulation" means reshaping, transforming, and extracting new value from your data using custom logic.
Lesson 10: Sorting & Ranking
Sorting
- df.sort_values(by="col"): Sorts by a column. Pass ascending=False for descending order.
- df.sort_index(): Sorts by the index labels.
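A minimal sketch of both sorting calls. The DataFrame, its column names, and its values are hypothetical and exist only to make the example runnable.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "score": [82, 95, 71]},
    index=[2, 0, 1],
)

# Sort by a column, highest score first.
by_score = df.sort_values(by="score", ascending=False)

# Sort by the index labels (ascending by default).
by_index = df.sort_index()

print(by_score)
print(by_index)
```

Both calls return a new DataFrame by default, so df itself keeps its original row order.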
Ranking
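df.rank(): Assigns a rank (1, 2, 3...) to values. By default the smallest value gets rank 1 and tied values share the average of their ranks; pass ascending=False so the largest value ranks first. Useful for "Top N" analysis.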
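The sketch below shows how ranking might support a "Top N" filter. The sales figures and column names are made up for illustration, and method="min" is just one reasonable way to handle ties.

```python
import pandas as pd

# Hypothetical sales figures, for illustration only.
df = pd.DataFrame({"rep": ["A", "B", "C", "D"], "sales": [300, 500, 500, 100]})

# Default behaviour: ascending ranks, tied values share the average rank.
df["rank_default"] = df["sales"].rank()

# For "Top N": the highest value gets rank 1, and method="min" gives
# tied values the lowest rank in their group.
df["rank_top"] = df["sales"].rank(ascending=False, method="min")

# Keep the rows ranked in the top 2 (ties included).
top2 = df[df["rank_top"] <= 2]
print(top2)
```

Here the two reps tied at 500 both receive rank 1 in rank_top, so both appear in the top 2.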
Lesson 11: Applying Functions
Applying your own functions is the most flexible way to express custom transformations.
1. apply()
Applies a custom function to every row or column of a DataFrame.
- df.apply(func, axis=0): Calls func once per column (default).
- df.apply(func, axis=1): Calls func once per row.
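To make the two axis settings concrete, here is a small sketch. The value_range helper and the grade data are hypothetical, chosen only to show what each call receives.

```python
import pandas as pd

# Hypothetical grades, for illustration only.
df = pd.DataFrame({"math": [80, 65, 90], "art": [70, 85, 75]})

def value_range(values):
    """Return max minus min of the Series passed in."""
    return values.max() - values.min()

# axis=0 (default): value_range receives each column as a Series.
col_ranges = df.apply(value_range, axis=0)

# axis=1: value_range receives each row as a Series.
row_ranges = df.apply(value_range, axis=1)

print(col_ranges)
print(row_ranges)
```

The result of axis=0 is indexed by column name, while the result of axis=1 is indexed by the original row labels.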
2. map()
Used on a Series to map each value to a new one (like a VLOOKUP). Values with no match in the mapping become NaN.
df['GenderNum'] = df['Gender'].map({'Male': 0, 'Female': 1})
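A runnable version of the line above, as a sketch. The "Other" entry is a hypothetical value added to show what happens when a value has no key in the dictionary.

```python
import pandas as pd

# Hypothetical data; "Other" is included on purpose because it is not in the mapping.
df = pd.DataFrame({"Gender": ["Male", "Female", "Other"]})

# Map each value through the dictionary.
df["GenderNum"] = df["Gender"].map({"Male": 0, "Female": 1})

# Values without a matching key come back as NaN.
unmapped = df["GenderNum"].isna()
print(df)
```

Because of the NaN, the new column is stored as floats, so it can be worth checking for unmapped values before using the result.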
Lesson 12: String Manipulation
Pandas has a .str accessor that gives you most of the standard string methods (upper, lower, replace, split) on an entire column at once!
df['Name'].str.upper()
df['Phone'].str.replace("-", "")
df['Email'].str.contains("@gmail.com", regex=False)
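The following sketch runs these three string operations on a small, made-up contact table. It also illustrates one detail worth knowing: str.contains treats its pattern as a regular expression by default, so the "." in "@gmail.com" matches any character unless regex=False is passed.

```python
import pandas as pd

# Hypothetical contact data, for illustration only.
df = pd.DataFrame({
    "Name": ["alice", "Bob"],
    "Phone": ["555-123-4567", "555-987-6543"],
    "Email": ["alice@gmail.com", "bob@example.org"],
})

upper = df["Name"].str.upper()                           # upper-case every name
digits = df["Phone"].str.replace("-", "", regex=False)   # strip the dashes
is_gmail = df["Email"].str.contains("@gmail.com", regex=False)  # literal match

# With the default regex matching, "." matches any character:
odd = pd.Series(["x@gmailxcom"])
regex_match = odd.str.contains("@gmail.com")
literal_match = odd.str.contains("@gmail.com", regex=False)
```

Each call returns a new Series of the same length as the column, so the results can be assigned straight back to the DataFrame.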
Practice: Data Transformer
Challenge:
- Create a DataFrame with names: ["Alice", "BOB", "CharLie"].
- Use .str to convert them all to Title Case (e.g., "Alice", "Bob", "Charlie").
- Create a new column "Length" that contains the length of each name (using apply or .str.len()).
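One possible solution, offered as a sketch rather than the only answer. The column name "Name" is an assumption, since the challenge does not specify one.

```python
import pandas as pd

# The column name "Name" is an assumption; the challenge only gives the values.
df = pd.DataFrame({"Name": ["Alice", "BOB", "CharLie"]})

# Convert every name to Title Case with the .str accessor.
df["Name"] = df["Name"].str.title()

# Length of each name via the vectorized string method.
df["Length"] = df["Name"].str.len()

# The same result using apply with Python's built-in len.
length_via_apply = df["Name"].apply(len)

print(df)
```

Both approaches give the same lengths here; .str.len() keeps the operation vectorized, while apply is the more general tool when no built-in string method fits.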
Quiz
Question 1 of 5: Which parameter should you set to sort values from highest to lowest?
Key Takeaways
✅ sort_values organizes your data.
✅ apply lets you run Python logic on DataFrame rows or columns.
✅ Always use .str for text columns; do not loop over strings!