
Exploring & Understanding Data

Don't fly blind. Learn how to inspect your data using head, info, describe, and powerful indexing techniques.

By TechCoder Team
Last updated: 2026-06-02
In a Nutshell

This hands-on tutorial focuses on the practical side of exploring and understanding data with Pandas.

Module 3: Exploring & Understanding Data

Once you load data, the first step is always to look at it. Pandas provides powerful tools to summarize and inspect your dataset before you start processing it.


Lesson 5: Data Inspection

Rapid Overview

  • df.head(n): View the first n rows (default 5).
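  • df.tail(n): View the last n rows (default 5).
  • df.sample(n): Pick n random rows (default 1). Useful for spotting problems that the first and last rows don't show.
  • df.shape: An attribute, not a method (no parentheses). Holds the tuple (rows, columns).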
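The four inspection tools above can be tried on a small table. This is a minimal sketch: the DataFrame and its column names (ID, Score) are made up for illustration, and random_state is passed to sample only so that the random draw is repeatable.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "ID": range(1, 11),
    "Score": [88, 92, 75, 64, 95, 70, 81, 59, 99, 77],
})

first_rows = df.head(3)                     # first 3 rows
last_rows = df.tail(2)                      # last 2 rows
random_rows = df.sample(4, random_state=0)  # 4 random rows, repeatable draw

print(first_rows)
print(last_rows)
print(random_rows)
print(df.shape)  # shape is an attribute, so no parentheses
```

Calling head() and tail() with no argument returns five rows each.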

Structural Analysis

  • df.info(): Shows column names, non-null counts, and data types (Dtypes). Critical for finding missing data.
  • df.describe(): Shows summary statistics (count, mean, std, min, quartiles, max) for numerical columns.
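A short sketch of both calls, assuming a made-up table with one deliberately missing Score so that the non-null count in info() differs from the row count:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value on purpose.
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cy", "Dee"],
    "Score": [88.0, np.nan, 75.0, 64.0],
})

# info() prints its report directly: column names, non-null counts, dtypes.
df.info()

# describe() returns a DataFrame; by default it covers numeric columns only.
summary = df.describe()
print(summary)
```

Here the Score column reports 3 non-null values out of 4 rows, which is how info() points you to missing data. The Name column does not appear in the describe() output because it is not numeric.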

Lesson 6: Indexing & Selection

Accessing specific data in Pandas is done via [], .loc[], and .iloc[].

1. Column Selection

  • df['ColumnName'] returns a Series.
  • df[['Col1', 'Col2']] returns a subset DataFrame.
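The difference between single and double brackets is easiest to see by checking the returned types. A minimal sketch with hypothetical column names:

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Ana", "Ben", "Cy"],
    "Score": [88, 92, 75],
})

scores = df["Score"]            # one column name -> Series
subset = df[["ID", "Score"]]    # list of column names -> DataFrame
one_col_frame = df[["Score"]]   # a one-item list still gives a DataFrame

print(type(scores).__name__)
print(type(subset).__name__)
print(one_col_frame.shape)
```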

2. .iloc[] (Integer Location)

Selects by integer position, counting from zero as with NumPy arrays. Slices exclude the end position.

  • df.iloc[0] (First row)
  • df.iloc[:5, 0] (First 5 rows, first column)
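The two calls above, plus a negative position, might look like this on a small table. The data is hypothetical and only there to make the positions visible:

```python
import pandas as pd

# Hypothetical example data; "ID" is the first column (position 0).
df = pd.DataFrame({
    "ID": [101, 102, 103, 104, 105, 106],
    "Score": [88, 92, 75, 64, 95, 70],
})

first_row = df.iloc[0]            # Series holding the first row
first_five_ids = df.iloc[:5, 0]   # rows at positions 0-4, first column
last_row = df.iloc[-1]            # negative positions count from the end

print(first_row)
print(first_five_ids)
print(last_row)
```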

3. .loc[] (Label Location)

Selects by row and column labels. Unlike .iloc[], a label slice includes its end label.

  • df.loc[0, 'Score'] (Row labeled 0, column 'Score')
  • df.loc[:, ['ID', 'Score']] (All rows, specific columns)
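A sketch of label-based selection, using made-up data. It also contrasts a .loc slice with the equivalent-looking .iloc slice, and shows that .loc follows whatever labels the index actually holds (the letter labels here are an illustrative assumption):

```python
import pandas as pd

# Hypothetical example data with the default integer index 0..3.
df = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "Score": [88, 92, 75, 64],
})

single_value = df.loc[0, "Score"]          # row labeled 0, column "Score"
two_columns = df.loc[:, ["ID", "Score"]]   # all rows, two columns by name
label_slice = df.loc[0:2, "Score"]         # labels 0, 1 and 2 (end included)
position_slice = df.iloc[0:2, 1]           # positions 0 and 1 (end excluded)

# Give the rows string labels: .loc now expects those labels, not numbers.
by_name = df.copy()
by_name.index = ["a", "b", "c", "d"]
print(by_name.loc["c", "Score"])
```

With the default index, labels and positions happen to coincide, which is why df.loc[0] and df.iloc[0] return the same row. Once the index is changed, only .iloc keeps working with integer positions.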

Practice: Data Inspector

Challenge:

  1. Create a DataFrame with 10 rows of random numbers.
  2. Print the shape of the dataframe.
  3. Use .iloc to print the value in the last row and first column.
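One possible solution, offered as a sketch rather than the expected answer. The three-column layout, the column names, and the seeded NumPy generator are all assumptions; the challenge itself only asks for 10 rows of random numbers.

```python
import numpy as np
import pandas as pd

# Seeded generator so the random numbers are repeatable between runs.
rng = np.random.default_rng(seed=42)

# 10 rows, 3 columns of random floats in [0, 1); column names are arbitrary.
df = pd.DataFrame(rng.random((10, 3)), columns=["A", "B", "C"])

print(df.shape)

# Last row is position -1, first column is position 0.
corner = df.iloc[-1, 0]
print(corner)
```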

Quiz

Question 1 of 5

Which function gives you a statistical summary (mean, max, min) of your data?

df.info()
df.head()
df.describe()
df.stats()

Key Takeaways

df.info() is your first diagnostic tool.
df.describe() gives you instant statistical insight.
✅ Use .iloc for position-based access and .loc for label-based access.