Friday, January 17, 2025
How do I select rows from a DataFrame based on column values?

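In pandas, you can select rows from a DataFrame based on a condition on one of its columns using boolean indexing. The condition produces a True/False value for each row, and indexing the DataFrame with that result keeps only the rows marked True. Here’s how you can do that: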

Syntax:

df[df['column_name'] condition]
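To see what the inner expression does, it can help to look at the mask on its own before using it as an index. The following is a minimal sketch; the one-column DataFrame and the `mask` variable name are illustrative assumptions.

```python
import pandas as pd

# Hypothetical one-column DataFrame, used only to illustrate the mask
df = pd.DataFrame({'Age': [23, 35, 45, 30]})

# The comparison returns a boolean Series aligned with df's index
mask = df['Age'] > 30
print(mask.tolist())            # [False, True, True, False]

# Indexing with the mask keeps only the rows where it is True
print(df[mask].index.tolist())  # [1, 2]
```

Note that the surviving rows keep their original index labels (1 and 2 here) rather than being renumbered from 0.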

Example 1: Select rows where a column’s value is greater than a specific value

import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [23, 35, 45, 30],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)

# Select rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]

print(filtered_df)

Output:

      Name  Age         City
1      Bob   35  Los Angeles
2  Charlie   45      Chicago

Example 2: Select rows where a column’s value is equal to a specific string

# Select rows where the City is 'Chicago'
filtered_df = df[df['City'] == 'Chicago']

print(filtered_df)

Output:

      Name  Age     City
2  Charlie   45  Chicago

Example 3: Select rows based on multiple conditions

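You can also combine multiple conditions using the element-wise logical operators (& for AND, | for OR). Wrap each condition in parentheses, because & and | bind more tightly than comparison operators such as > and ==. Python’s plain and/or keywords do not work on a Series and raise a ValueError.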

# Select rows where Age is greater than 30 and City is 'Chicago'
filtered_df = df[(df['Age'] > 30) & (df['City'] == 'Chicago')]

print(filtered_df)

Output:

      Name  Age     City
2  Charlie   45  Chicago
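The same pattern extends to OR and to negation with the ~ operator. The sketch below rebuilds the sample DataFrame so it runs on its own; the variable names `either` and `not_chicago` are illustrative.

```python
import pandas as pd

# Same sample data as in Example 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [23, 35, 45, 30],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# OR: Age is under 25, or City is 'Houston'
either = df[(df['Age'] < 25) | (df['City'] == 'Houston')]
print(either['Name'].tolist())       # ['Alice', 'David']

# NOT: every row whose City is not 'Chicago'
not_chicago = df[~(df['City'] == 'Chicago')]
print(not_chicago['Name'].tolist())  # ['Alice', 'Bob', 'David']
```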

Example 4: Using isin for multiple values

You can use the isin() method to filter rows based on whether a column’s value is in a list of values.

# Select rows where the City is either 'New York' or 'Houston'
filtered_df = df[df['City'].isin(['New York', 'Houston'])]

print(filtered_df)

Output:

    Name  Age      City
0  Alice   23  New York
3  David   30   Houston
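To exclude a set of values instead, the isin() mask can be inverted with ~. A minimal sketch on the same sample data (the `excluded` name is illustrative):

```python
import pandas as pd

# Same sample data as in Example 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [23, 35, 45, 30],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Keep rows whose City is NOT in the list
excluded = df[~df['City'].isin(['New York', 'Houston'])]
print(excluded['Name'].tolist())  # ['Bob', 'Charlie']
```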

Example 5: Select rows based on string matching

If you want to filter rows based on string matching (e.g., partial string), you can use str.contains():

# Select rows where the 'City' contains the substring 'New'
filtered_df = df[df['City'].str.contains('New')]

print(filtered_df)

Output:

    Name  Age      City
0  Alice   23  New York
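Two details of str.contains() are worth knowing: the match is case-sensitive by default, and the pattern is treated as a regular expression unless you pass regex=False. Missing values in the column can also leave gaps in the mask, which na=False fills with False. The sketch below uses a small made-up DataFrame (its names and cities are assumptions chosen to show these cases).

```python
import pandas as pd

# Hypothetical data with mixed case and a missing city
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Eve'],
    'City': ['New York', 'newark', None],
})

# case=False ignores letter case; na=False treats missing values as non-matches
matches = df[df['City'].str.contains('new', case=False, na=False)]
print(matches['Name'].tolist())  # ['Alice', 'Bob']

# regex=False matches the text literally; the default case-sensitive match skips 'newark'
literal = df[df['City'].str.contains('New', regex=False, na=False)]
print(literal['Name'].tolist())  # ['Alice']
```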

Conclusion:

  • You can use boolean indexing to filter rows in a pandas DataFrame based on conditions in one or more columns.
  • Logical operators (&, |) allow combining multiple conditions.
  • Functions like isin() and str.contains() make filtering even more flexible for specific use cases like checking for membership or substring matches.