
python - Multiple if-else conditions in a pandas DataFrame to derive multiple columns

Reposted. Author: 太空狗. Updated: 2023-10-30 00:23:54

I have a DataFrame like the one below.

import pandas as pd
import numpy as np
raw_data = {'student': ['A', 'B', 'C', 'D', 'E'],
            'score': [100, 96, 80, 105, 156],
            'height': [7, 4, 9, 5, 3],
            'trigger1': [84, 95, 15, 78, 16],
            'trigger2': [99, 110, 30, 93, 31],
            'trigger3': [114, 125, 45, 108, 46]}

df2 = pd.DataFrame(raw_data, columns=['student', 'score', 'height', 'trigger1', 'trigger2', 'trigger3'])

print(df2)

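I need to derive a Flag column based on multiple conditions.

I need to compare the score and height columns against the trigger1 to trigger3 columns.

Flag column:

  1. If score is greater than or equal to trigger1 and height is less than 8, then Red

  2. If score is greater than or equal to trigger2 and height is less than 8, then Yellow

  3. If score is greater than or equal to trigger3 and height is less than 8, then Orange

  4. If height is greater than 8, leave it blank

How do I write if-else conditions on a pandas DataFrame and derive the column?

Expected output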

  student  score  height  trigger1  trigger2  trigger3    Flag
0       A    100       7        84        99       114  Yellow
1       B     96       4        95       110       125     Red
2       C     80       9        15        30        45     NaN
3       D    105       5        78        93       108  Yellow
4       E    156       3        16        31        46  Orange

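For the other column in my original question, Text1, I tried the following, but the integer columns are not converted to strings when concatenating, whether I use astype(str) or any other method.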

def text_df(df):

    if (df['trigger1'] <= df['score'] < df['trigger2']) and (df['height'] < 8):
        return df['student'] + " score " + df['score'].astype(str) + " greater than " + df['trigger1'].astype(str) + " and less than height 5"
    elif (df['trigger2'] <= df['score'] < df['trigger3']) and (df['height'] < 8):
        return df['student'] + " score " + df['score'].astype(str) + " greater than " + df['trigger2'].astype(str) + " and less than height 5"
    elif (df['trigger3'] <= df['score']) and (df['height'] < 8):
        return df['student'] + " score " + df['score'].astype(str) + " greater than " + df['trigger3'].astype(str) + " and less than height 5"
    elif (df['height'] > 8):
        return np.nan
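
A possible reason the attempt above fails: when a function is applied with axis=1, each field of the row is a single scalar rather than a Series, and for a mixed-type frame like this one it is typically a plain Python int, which has no astype method. The sketch below builds the text with an f-string instead. The row parameter name, the limit variable, and the message wording ("height less than 8") are assumptions made for illustration, not part of the original post.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'student': ['A', 'B', 'C', 'D', 'E'],
    'score': [100, 96, 80, 105, 156],
    'height': [7, 4, 9, 5, 3],
    'trigger1': [84, 95, 15, 78, 16],
    'trigger2': [99, 110, 30, 93, 31],
    'trigger3': [114, 125, 45, 108, 46],
})

def text_df(row):
    # Rows with height of 8 or more get no text (assumed handling for height == 8).
    if not row['height'] < 8:
        return np.nan
    # Pick the highest trigger that the score has reached.
    if row['trigger1'] <= row['score'] < row['trigger2']:
        limit = row['trigger1']
    elif row['trigger2'] <= row['score'] < row['trigger3']:
        limit = row['trigger2']
    elif row['trigger3'] <= row['score']:
        limit = row['trigger3']
    else:
        return np.nan
    # f-strings convert scalars to text, so no astype(str) is needed.
    return f"{row['student']} score {row['score']} greater than {limit} and height less than 8"

df2['Text1'] = df2.apply(text_df, axis=1)
print(df2[['student', 'Text1']])
```

Calling str() on each scalar would work equally well; the f-string just keeps the message readable.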

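Best answer

The conditions as stated overlap (a score at or above trigger2 is also at or above trigger1), so you need chained comparisons with both a lower and an upper bound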

def flag_df(df):

    if (df['trigger1'] <= df['score'] < df['trigger2']) and (df['height'] < 8):
        return 'Red'
    elif (df['trigger2'] <= df['score'] < df['trigger3']) and (df['height'] < 8):
        return 'Yellow'
    elif (df['trigger3'] <= df['score']) and (df['height'] < 8):
        return 'Orange'
    elif (df['height'] > 8):
        return np.nan

df2['Flag'] = df2.apply(flag_df, axis = 1)

  student  score  height  trigger1  trigger2  trigger3    Flag
0       A    100       7        84        99       114  Yellow
1       B     96       4        95       110       125     Red
2       C     80       9        15        30        45     NaN
3       D    105       5        78        93       108  Yellow
4       E    156       3        16        31        46  Orange

Note: you could do this with a deeply nested np.where, but I prefer applying a function when there are multiple if-else branches
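
For comparison, here is a minimal vectorised sketch that avoids both nesting and a row-wise function. It uses np.select, a swapped-in alternative to the nested np.where mentioned above, not the answer's own method. It assumes trigger1 <= trigger2 <= trigger3 holds in every row (true for the sample data), so testing the highest trigger first reproduces the banded logic. The names short, conditions, and choices are illustrative.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'student': ['A', 'B', 'C', 'D', 'E'],
    'score': [100, 96, 80, 105, 156],
    'height': [7, 4, 9, 5, 3],
    'trigger1': [84, 95, 15, 78, 16],
    'trigger2': [99, 110, 30, 93, 31],
    'trigger3': [114, 125, 45, 108, 46],
})

short = df2['height'] < 8

# np.select takes the first matching condition per row,
# so the strictest test (trigger3) must come first.
conditions = [
    short & (df2['score'] >= df2['trigger3']),
    short & (df2['score'] >= df2['trigger2']),
    short & (df2['score'] >= df2['trigger1']),
]
choices = ['Orange', 'Yellow', 'Red']

# Rows that match nothing (for example height of 8 or more) get None.
df2['Flag'] = np.select(conditions, choices, default=None)
print(df2)
```

Unmatched rows hold None rather than NaN here; pd.isnull treats both as missing.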

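Edit: answering @Cecilia's questions

  1. What if the returned object is not a string but a calculation? For example, for the first condition we want to return df['height']*2

Not sure what you tried, but you can return a derived value instead of a string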

def flag_df(df):

    if (df['trigger1'] <= df['score'] < df['trigger2']) and (df['height'] < 8):
        return df['height']*2
    elif (df['trigger2'] <= df['score'] < df['trigger3']) and (df['height'] < 8):
        return df['height']*3
    elif (df['trigger3'] <= df['score']) and (df['height'] < 8):
        return df['height']*4
    elif (df['height'] > 8):
        return np.nan

  2. What if some columns contain "NaN" values and I want to use df['xxx'] is None as a condition? The code does not seem to work

Again, not sure what code you tried, but using pandas isnull should do the trick

def flag_df(df):

    if pd.isnull(df['height']):
        return df['height']
    elif (df['trigger1'] <= df['score'] < df['trigger2']) and (df['height'] < 8):
        return df['height']*2
    elif (df['trigger2'] <= df['score'] < df['trigger3']) and (df['height'] < 8):
        return df['height']*3
    elif (df['trigger3'] <= df['score']) and (df['height'] < 8):
        return df['height']*4
    elif (df['height'] > 8):
        return np.nan
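
To illustrate why an `is None` test misses NaN while pd.isnull catches it, here is a small sketch. The heights frame, the describe function, and the 'missing'/'present' labels are made up for the example and do not come from the original post.

```python
import numpy as np
import pandas as pd

missing = np.nan

# NaN is a float value, not the None object, and it never equals itself.
is_none = missing is None
equals_nan = missing == np.nan

# pd.isnull treats both NaN and None as missing.
nan_detected = pd.isnull(missing)
none_detected = pd.isnull(None)
print(is_none, equals_nan, nan_detected, none_detected)

heights = pd.DataFrame({'height': [7.0, np.nan, 9.0]})

def describe(row):
    # Check for missing data first, before any numeric comparison.
    if pd.isnull(row['height']):
        return 'missing'
    return 'present'

heights['status'] = heights.apply(describe, axis=1)
print(heights)
```

Putting the pd.isnull check in the first branch matters, since every comparison against NaN evaluates to False and would otherwise fall through all the later branches.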

Regarding "python - Multiple if-else conditions in a pandas DataFrame to derive multiple columns", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48569166/
