A Comprehensive Guide to Applying Filters on Pandas DataFrame

Filtering a Pandas DataFrame is a core step in most data manipulation and transformation work.

In the vast landscape of data analysis, the ability to filter and extract specific information from a dataset is paramount. Pandas, the go-to data manipulation library in Python, provides powerful tools for filtering data within a DataFrame. In this guide, we’ll explore various methods to apply filters on a Pandas DataFrame, empowering you to wield this skill with precision in your data analysis endeavors.

Pandas DataFrame - Understanding the Basics

A Pandas DataFrame is a two-dimensional tabular data structure, akin to a spreadsheet. Filtering, in this context, refers to the process of selecting a subset of rows or columns based on certain conditions. This allows you to focus on the data that is relevant to your analysis.
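
To make the idea of selecting rows and columns together concrete, here is a minimal sketch. It assumes a small student table (the same sample data used in the use cases of this guide) and uses .loc with a row condition and a list of column names; the variable name young_grades is only illustrative.

```python
import pandas as pd

# Sample student data (same values as the examples in this guide).
df = pd.DataFrame({
    'Name': ['Bob', 'Harry', 'Ellie', 'David'],
    'Age': [25, 30, 22, 28],
    'Grade': [85, 92, 78, 95],
})

# Rows: students younger than 26. Columns: only Name and Grade.
young_grades = df.loc[df['Age'] < 26, ['Name', 'Grade']]
print(young_grades)
```

The first argument to .loc picks the rows and the second picks the columns, so both kinds of filtering happen in one step.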

Applying Filters on Pandas DataFrame - Filtering Rows Based on Conditions

Let’s go through a few use cases that show how to filter Pandas DataFrame rows based on conditions.

Use Case 1 - Pandas DataFrame Filtering Rows Based on Conditions

In this example, we have a DataFrame of student names with their ages and grades.

import pandas as pd

data = {'Name': ['Bob', 'Harry', 'Ellie', 'David'],
        'Age': [25, 30, 22, 28],
        'Grade': [85, 92, 78, 95]}

df = pd.DataFrame(data)

Use Case 2 - Pandas DataFrame Filtering Rows Based on Multiple Conditions

Now, if we want the rows where the grade is more than 90, the code below performs that operation.

high_scorers = df[df['Grade'] > 90]
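
It can help to see what the condition produces before it is used as a filter. The sketch below rebuilds the same sample DataFrame and stores the comparison in a variable called mask (a name chosen here for illustration) to show the intermediate boolean values.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Bob', 'Harry', 'Ellie', 'David'],
    'Age': [25, 30, 22, 28],
    'Grade': [85, 92, 78, 95],
})

# The comparison returns a boolean Series with one value per row.
mask = df['Grade'] > 90
print(mask.tolist())  # [False, True, False, True]

# Indexing the DataFrame with the mask keeps only the rows marked True.
high_scorers = df[mask]
print(high_scorers['Name'].tolist())  # ['Harry', 'David']
```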

Now, if we want the rows where the grade is more than 90 and the age is more than 25, combine both conditions with the & operator and wrap each condition in parentheses.

filtered_students = df[(df['Grade'] > 90) & (df['Age'] > 25)]
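
Conditions can also be combined with OR, and the same AND filter can be written as a string expression. The following is a small sketch on the same sample data; the variable names either and both are illustrative, and DataFrame.query is shown as an alternative spelling rather than a recommendation.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Bob', 'Harry', 'Ellie', 'David'],
    'Age': [25, 30, 22, 28],
    'Grade': [85, 92, 78, 95],
})

# | is the element-wise OR: keep rows where at least one condition holds.
either = df[(df['Grade'] > 90) | (df['Age'] < 25)]
print(either['Name'].tolist())  # ['Harry', 'Ellie', 'David']

# query() accepts the AND condition as a string using column names.
both = df.query('Grade > 90 and Age > 25')
print(both['Name'].tolist())  # ['Harry', 'David']
```

The parentheses around each comparison matter in the first form, because & and | bind more tightly than > and < in Python.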

Use Case 3 - Pandas DataFrame Filtering Rows Using the isin Method

If you want to select rows that match any of several values, use the isin() method:

selected_students = df[df['Name'].isin(['Bob', 'David'])]

For a larger list of values, assign the list to a variable first and then pass that variable to isin().

names = ['Bob', 'David']
selected_students = df[df['Name'].isin(names)]
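
Because isin() returns a boolean Series, it can be combined with other conditions like any comparison. The sketch below uses the same sample data; 'Zoe' is a made-up name that does not appear in the data, added only to show that values without a match are ignored.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Bob', 'Harry', 'Ellie', 'David'],
    'Age': [25, 30, 22, 28],
    'Grade': [85, 92, 78, 95],
})

# 'Zoe' is hypothetical and not in the data, so it matches no rows.
names = ['Bob', 'David', 'Zoe']
selected_students = df[df['Name'].isin(names)]
print(selected_students['Name'].tolist())  # ['Bob', 'David']

# Combine the isin() result with another condition using &.
selected_high = df[df['Name'].isin(names) & (df['Grade'] > 90)]
print(selected_high['Name'].tolist())  # ['David']
```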

Use Case 4 - Pandas DataFrame Filtering Rows Using Negation

If you want to keep everything except a few values, the negation operator (~) comes in handy. It inverts a boolean condition, so it works with almost any filter when you need the opposite result. Below is one use case:

selected_students = df[~df['Name'].isin(['Bob', 'David'])]

As before, for a larger list of values, assign the list to a variable first and then pass that variable to isin().

names = ['Bob', 'David']
selected_students = df[~df['Name'].isin(names)]
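
The ~ operator is not limited to isin(). As a minimal sketch on the same sample data, the example below negates a comparison; the names not_high and same are illustrative. Note the extra parentheses, which make ~ apply to the whole comparison rather than to the column alone.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Bob', 'Harry', 'Ellie', 'David'],
    'Age': [25, 30, 22, 28],
    'Grade': [85, 92, 78, 95],
})

# Negate the whole comparison: keep rows where Grade is NOT above 90.
not_high = df[~(df['Grade'] > 90)]
print(not_high['Name'].tolist())  # ['Bob', 'Ellie']

# With no missing values in Grade, this matches the direct comparison.
same = df[df['Grade'] <= 90]
print(not_high.equals(same))  # True
```

If the column contained missing values, the two forms would differ, since a missing grade fails both comparisons and the negated version would therefore keep that row.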