In pandas, you can build a DataFrame one row at a time with the DataFrame.append method (deprecated in pandas 1.4 and removed in pandas 2.0) or, more efficiently, by collecting the rows in a list of dictionaries and constructing the DataFrame once. Here’s how you can do it:
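Method 1: Using append (Not Recommended for Large Data)
You can start with an empty DataFrame and add rows one by one using the append method. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this code runs only on older versions.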
import pandas as pd
# Create an empty DataFrame with specific columns
df = pd.DataFrame(columns=["Name", "Age", "City"])
# Append rows one by one (DataFrame.append exists only in pandas < 2.0)
df = df.append({"Name": "Alice", "Age": 25, "City": "New York"}, ignore_index=True)
df = df.append({"Name": "Bob", "Age": 30, "City": "San Francisco"}, ignore_index=True)
print(df)
Output:
    Name  Age           City
0  Alice   25       New York
1    Bob   30  San Francisco
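On pandas 2.0 and later, where DataFrame.append no longer exists, the same row-by-row pattern can be written with pd.concat. This is a minimal sketch rather than a drop-in replacement; wrapping each new row in a one-row DataFrame is just one way to do it.

```python
import pandas as pd

# Start from the first row rather than an empty frame; some newer pandas
# versions warn when an empty DataFrame is passed to concat.
df = pd.DataFrame([{"Name": "Alice", "Age": 25, "City": "New York"}])

# Wrap the next row in a one-row DataFrame and concatenate it.
new_row = pd.DataFrame([{"Name": "Bob", "Age": 30, "City": "San Francisco"}])
df = pd.concat([df, new_row], ignore_index=True)

print(df)
```

Like append, each pd.concat call builds a new DataFrame, so it carries the same cost when used inside a loop.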
Method 2: Using a List of Dictionaries (More Efficient)
Calling append repeatedly is inefficient for large datasets because every call creates a new DataFrame and copies the existing rows into it. Instead, you can collect the rows in a list and create the DataFrame in one go:
import pandas as pd
# Create a list to hold row data
rows = []
# Add rows to the list
rows.append({"Name": "Alice", "Age": 25, "City": "New York"})
rows.append({"Name": "Bob", "Age": 30, "City": "San Francisco"})
# Create a DataFrame from the list
df = pd.DataFrame(rows)
print(df)
Output:
    Name  Age           City
0  Alice   25       New York
1    Bob   30  San Francisco
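In practice the rows usually come from a loop. The sketch below shows that pattern; the records list (including the third row) is made-up illustration data, and passing columns= is an optional way to pin the column order.

```python
import pandas as pd

# Hypothetical input: any iterable of records would work here.
records = [
    ("Alice", 25, "New York"),
    ("Bob", 30, "San Francisco"),
    ("Carol", 35, "Chicago"),
]

rows = []
for name, age, city in records:
    # Appending to a Python list is cheap; no DataFrame is copied here.
    rows.append({"Name": name, "Age": age, "City": city})

# Build the DataFrame once; columns= fixes the column order explicitly.
df = pd.DataFrame(rows, columns=["Name", "Age", "City"])

print(df.dtypes)
```

Because the DataFrame is built from all rows at once, pandas can infer a numeric dtype for Age, whereas filling an initially empty DataFrame row by row tends to leave columns as object.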
Method 3: Using loc to Add Rows by Index
If you know the index label for each new row, you can use loc to add rows directly:
import pandas as pd
# Create an empty DataFrame
df = pd.DataFrame(columns=["Name", "Age", "City"])
# Add rows using loc
df.loc[0] = ["Alice", 25, "New York"] # Add the first row
df.loc[1] = ["Bob", 30, "San Francisco"] # Add the second row
print(df)
Output:
    Name  Age           City
0  Alice   25       New York
1    Bob   30  San Francisco
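When the labels are simply 0, 1, 2, and so on, a common variant is to use len(df) as the next label instead of hard-coding it. A minimal sketch, assuming the index has no gaps or custom labels (otherwise len(df) may point at an existing row and overwrite it); new_rows is a hypothetical name for the incoming data:

```python
import pandas as pd

df = pd.DataFrame(columns=["Name", "Age", "City"])

# Hypothetical rows to add.
new_rows = [
    ["Alice", 25, "New York"],
    ["Bob", 30, "San Francisco"],
]

for row in new_rows:
    # len(df) is the next unused label only while labels are 0, 1, 2, ...
    df.loc[len(df)] = row

print(df)
```

Each loc assignment to a new label enlarges the DataFrame, so for many rows this is still likely to be slower than building a list first.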
Which Method to Use?
- Method 2 (List of Dictionaries): Recommended for better performance when adding multiple rows.
- Method 3 (loc): Useful when the index is already known or needs to be specified.
- Method 1 (append): Works only on pandas versions before 2.0, where the method was removed. Even there, avoid it in loops or large-scale operations, because each call creates a new DataFrame.