How to create a new column in a data frame based on multiple conditions/other variables in Python?
To create a new column in a data frame based on multiple conditions or other variables in Python, you can use the following steps:
1. Import the pandas library and load your data into a data frame:
import pandas as pd
data = pd.read_csv('data.csv')
2. Define the conditions that you want to use to create the new column. For example, let's say you want to create a new column called "Category" based on the values in the "Score" column:
conditions = [
    (data['Score'] >= 90),
    (data['Score'] >= 80) & (data['Score'] < 90),
    (data['Score'] >= 70) & (data['Score'] < 80),
    (data['Score'] >= 60) & (data['Score'] < 70),
    (data['Score'] < 60)
]
3. Define the values that you want to assign to the new column based on the conditions. For example, you can assign the values "A", "B", "C", "D", and "F" based on the conditions:
values = ['A', 'B', 'C', 'D', 'F']
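4. Use the numpy select() function to create the new column based on the conditions and values. The conditions are checked in order and the first one that matches wins. Pass a string default so that rows matching no condition (for example, a missing score) get a label of the same type as the other values; the built-in default is the integer 0, which recent NumPy versions can reject when the values are strings:
import numpy as np
data['Category'] = np.select(conditions, values, default='Unknown')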
5. Finally, you can view the new column in the data frame:
print(data)
The data frame now contains a "Category" column whose letter grade is determined by each row's "Score" value and the conditions and values defined above.
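The steps above can be tried end to end without a CSV file. The sketch below is only an illustration: the sample scores, the "Attendance" column, the "Result" column, and the 0.75 attendance threshold are all made up for this example and are not part of the original question. It also shows one possible way to base a column on more than one variable, using numpy's where() function for a simple two-way choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for data.csv.
# The "Attendance" column is an assumption added for illustration.
data = pd.DataFrame({
    "Score": [95, 85, 72, 65, 40, np.nan],
    "Attendance": [0.90, 0.60, 0.80, 0.95, 0.90, 0.85],
})

# Conditions are checked in order; the first one that is True wins.
conditions = [
    data["Score"] >= 90,
    (data["Score"] >= 80) & (data["Score"] < 90),
    (data["Score"] >= 70) & (data["Score"] < 80),
    (data["Score"] >= 60) & (data["Score"] < 70),
    data["Score"] < 60,
]
values = ["A", "B", "C", "D", "F"]

# A string default labels rows that match no condition (here, the NaN score).
data["Category"] = np.select(conditions, values, default="Unknown")

# A condition can also combine several columns (hypothetical pass rule).
passed = (data["Score"] >= 60) & (data["Attendance"] >= 0.75)
data["Result"] = np.where(passed, "Pass", "Fail")

print(data)
```

Comparisons against a missing value evaluate to False, so the row with no score falls through every condition and receives the default label.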