Exploring & Understanding Data
Don't fly blind. Learn how to inspect your data using head, info, describe, and powerful indexing techniques.
This hands-on tutorial focuses on applying these techniques in practice.
Module 3: Exploring & Understanding Data
Once you load data, the first step is always to look at it. Pandas provides powerful tools to summarize and inspect your dataset before you start processing it.
Lesson 5: Data Inspection
Rapid Overview
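- df.head(n): View the first n rows (default 5).
- df.tail(n): View the last n rows (default 5).
- df.sample(n): Pick n random rows (default 1). Great for checking whether the first and last rows are representative of the whole dataset.
- df.shape: An attribute (no parentheses) that returns a (rows, columns) tuple.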
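The quick-look calls above can be sketched as follows. The DataFrame and its column names (ID, Score) are made up for illustration, and random_state is passed to sample only so that the random pick is repeatable.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "ID": range(1, 11),
    "Score": [88, 92, 79, 65, 95, 70, 84, 91, 77, 60],
})

first_rows = df.head()                      # first 5 rows (the default)
last_rows = df.tail(3)                      # last 3 rows
random_rows = df.sample(4, random_state=0)  # 4 random rows, repeatable
n_rows, n_cols = df.shape                   # attribute, so no parentheses

print(first_rows)
print(last_rows)
print(random_rows)
print(n_rows, n_cols)
```

Note that head, tail, and sample all return new DataFrames, so they can be chained or stored without changing df.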
Structural Analysis
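- df.info(): Shows column names, non-null counts, and data types (dtypes). Critical for finding missing data: a non-null count lower than the total number of rows means the column has missing values.
- df.describe(): Shows summary statistics (count, mean, std, min, quartiles, max) for numerical columns.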
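As a minimal sketch of both calls, the example below uses a small made-up DataFrame with one missing Score value. The names and numbers are assumptions chosen only to make the output easy to read.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in "Score".
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara", "Dev"],
    "Score": [90.0, np.nan, 70.0, 80.0],
})

# info() prints its report directly and returns None.
# Here "Score" shows 3 non-null values out of 4 rows.
df.info()

# describe() returns a DataFrame of statistics.
# By default it covers numerical columns only, so "Name" is left out.
summary = df.describe()
print(summary)
print(summary.loc["mean", "Score"])
```

Missing values are skipped when the statistics are computed, which is why the count row in the summary can be smaller than the number of rows in the DataFrame.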
Lesson 6: Indexing & Selection
Accessing specific data in Pandas is done via [], .loc[], and .iloc[].
1. Column Selection
- df['ColumnName'] returns a Series.
- df[['Col1', 'Col2']] returns a subset DataFrame.
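The difference between single and double brackets can be checked directly. This sketch assumes a small illustrative DataFrame; the column names are not from any real dataset.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "ID": [101, 102, 103],
    "Name": ["Ana", "Ben", "Cara"],
    "Score": [88, 92, 79],
})

scores = df["Score"]          # single brackets: one column as a Series
subset = df[["ID", "Score"]]  # double brackets: a list of columns, so a DataFrame

print(type(scores).__name__)
print(type(subset).__name__)
print(subset)
```

Passing a one-item list such as df[["Score"]] also returns a DataFrame, which is handy when later code expects two-dimensional data.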
2. .iloc[] (Integer Location)
Selects by position (like NumPy arrays).
- df.iloc[0] (First row)
- df.iloc[:5, 0] (First 5 rows, first column)
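Here is a short sketch of the two calls above. The data is made up, and the rows are deliberately given string labels to show that .iloc looks only at position, never at the labels.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"ID": [101, 102, 103, 104, 105, 106],
     "Score": [88, 92, 79, 65, 95, 70]},
    index=["a", "b", "c", "d", "e", "f"],
)

first_row = df.iloc[0]           # Series holding the first row
first_five_ids = df.iloc[:5, 0]  # first 5 rows of the first column ("ID")

print(first_row)
print(first_five_ids)
```

As with NumPy and plain Python slicing, the end of an .iloc slice is excluded, so :5 covers positions 0 through 4.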
3. .loc[] (Label Location)
Selects by name/label.
- df.loc[0, 'Score'] (Row with index label 0, column 'Score')
- df.loc[:, ['ID', 'Score']] (All rows, specific columns)
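The label-based calls can be sketched as below, assuming an illustrative DataFrame with the default integer index (so the row labels happen to be 0, 1, 2). The last two lines show one behavior worth knowing: a .loc slice includes its end label, while an .iloc slice does not.

```python
import pandas as pd

# Hypothetical example data with the default index labels 0, 1, 2.
df = pd.DataFrame({
    "ID": [101, 102, 103],
    "Score": [88, 92, 79],
    "Grade": ["B", "A", "C"],
})

value = df.loc[0, "Score"]           # row labeled 0, column "Score"
subset = df.loc[:, ["ID", "Score"]]  # all rows, two named columns

by_label = df.loc[0:1]      # labels 0 and 1: the end label is included
by_position = df.iloc[0:1]  # position 0 only: the end position is excluded

print(value)
print(subset)
print(len(by_label), len(by_position))
```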
Practice: Data Inspector
Challenge:
- Create a DataFrame with 10 rows of random numbers.
- Print the shape of the dataframe.
- Use .iloc to print the value in the last row and first column.
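One possible way to approach the challenge is sketched below; try it yourself first. The three columns named A, B, and C and the fixed seed are assumptions made for this example, since the challenge only asks for 10 rows of random numbers.

```python
import numpy as np
import pandas as pd

# A seeded generator so the "random" numbers are the same on every run.
rng = np.random.default_rng(0)

# 10 rows and 3 columns of random floats in [0, 1); the column names are arbitrary.
df = pd.DataFrame(rng.random((10, 3)), columns=["A", "B", "C"])

print(df.shape)  # (rows, columns)

# Negative positions count from the end, so -1 is the last row.
last_row_first_col = df.iloc[-1, 0]
print(last_row_first_col)
```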
Quiz
Question 1: Which function gives you a statistical summary (mean, max, min) of your data?
Key Takeaways
✅ df.info() is your first diagnostic tool.
✅ df.describe() gives you instant statistical insight.
✅ Use .iloc for position-based access and .loc for label-based access.