
python - Replace iteration with df.where?

Reposted. Author: 行者123. Updated: 2023-11-30 22:18:14

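Hello, I am currently using an iteration to multiply the values of one column by a multiplier whenever they match a specific value in another column. I already have a working iteration for this: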

for index, row in street_cal.iterrows():
    street_cal.loc[street_cal['street_typ'] == 'motorway', 'v_length'] = street_cal['cal_length'] * 130
    street_cal.loc[street_cal['street_typ'] == 'motorway_link', 'v_length'] = street_cal['cal_length'] * 130
    street_cal.loc[street_cal['street_typ'] == 'trunk', 'v_length'] = street_cal['cal_length'] * 80
    street_cal.loc[street_cal['street_typ'] == 'trunk_link', 'v_length'] = street_cal['cal_length'] * 80
    street_cal.loc[street_cal['street_typ'] == 'primary', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'primary_link', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'secondary', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'secondary_link', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'tertiary', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'tertiary_link', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'road', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'unclassified', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'residential', 'v_length'] = street_cal['cal_length'] * 30
    street_cal.loc[street_cal['street_typ'] == 'living_street', 'v_length'] = street_cal['cal_length'] * 15

Unfortunately, this iteration takes quite a long time, so I tried to come up with another way to do it, and that is how I found df.where.
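A side note on the loop: none of the statements in its body use `index` or `row`. Each `.loc` assignment already updates every matching row at once, so the loop seems only to repeat the same whole-column work once per row. A minimal sketch with made-up sample data (the three rows below are assumptions, not the asker's data), grouping types that share a multiplier with `isin`:

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
street_cal = pd.DataFrame({
    "street_typ": ["motorway", "trunk_link", "residential"],
    "cal_length": [1.0, 2.0, 3.0],
})

# Start with an empty (NaN) result column.
street_cal["v_length"] = float("nan")

# Each masked assignment handles all matching rows in one step, no loop needed.
mask = street_cal["street_typ"].isin(["motorway", "motorway_link"])
street_cal.loc[mask, "v_length"] = street_cal["cal_length"] * 130

mask = street_cal["street_typ"].isin(["trunk", "trunk_link"])
street_cal.loc[mask, "v_length"] = street_cal["cal_length"] * 80

mask = street_cal["street_typ"] == "residential"
street_cal.loc[mask, "v_length"] = street_cal["cal_length"] * 30
```

The right-hand side is a full-length Series; pandas aligns it on the index, so only the masked rows receive a value.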

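Quoting from https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.where.html :

"Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other. [...]

other : scalar, NDFrame, or callable

Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the NDFrame and should return scalar or NDFrame. The callable must not change input NDFrame (though pandas doesn't check it).

New in version 0.18.1: A callable can be used as other."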
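To make the quoted semantics concrete, here is a minimal sketch that calls `where` on a single column (a Series) rather than on the whole DataFrame. The sample rows and the values given to `v_mot` and `v_res` are assumptions for illustration only.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
street_cal = pd.DataFrame({
    "street_typ": ["motorway", "residential", "track"],
    "cal_length": [2.0, 3.0, 4.0],
})

# Assumed multiplier values, for illustration.
v_mot, v_res = 130, 30

# Start from an all-NaN Series and refine it step by step.
v_length = pd.Series(float("nan"), index=street_cal.index)

# where() keeps the current value where cond is True and takes `other`
# where cond is False, so rows that ARE 'motorway' get cal_length * v_mot.
v_length = v_length.where(street_cal["street_typ"] != "motorway",
                          street_cal["cal_length"] * v_mot)
v_length = v_length.where(street_cal["street_typ"] != "residential",
                          street_cal["cal_length"] * v_res)

street_cal["v_length"] = v_length
```

Because each call works on the running `v_length` Series, earlier replacements are preserved and rows that match none of the conditions stay NaN.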

Based on this, I thought I could use df.where to do the same thing as above, like this:

street_cal['v_length'] = None

street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'motorway',
                                          (street_cal['cal_length'] * v_mot), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'motorway_link',
                                          (street_cal['cal_length'] * v_mot), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'trunk',
                                          (street_cal['cal_length'] * v_tru), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'trunk_link',
                                          (street_cal['cal_length'] * v_tru), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'primary',
                                          (street_cal['cal_length'] * v_pri), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'primary_link',
                                          (street_cal['cal_length'] * v_pri), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'secondary',
                                          (street_cal['cal_length'] * v_sec), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'secondary_link',
                                          (street_cal['cal_length'] * v_sec), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'tertiary',
                                          (street_cal['cal_length'] * v_ter), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'tertiary_link',
                                          (street_cal['cal_length'] * v_ter), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'road',
                                          (street_cal['cal_length'] * v_roa), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'unclassified',
                                          (street_cal['cal_length'] * v_unc), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'residential',
                                          (street_cal['cal_length'] * v_res), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'living_street',
                                          (street_cal['cal_length'] * v_liv), axis='index')

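However, when I run this code, only the rows with 'living_street' come out right; all other rows end up with values in the 'v_length' column that are far too high. My guess is that for the other rows the value gets multiplied more than once, which is why they are so high, but I don't understand why. In this case df.where checks that the 'street_typ' column does not contain, for example, 'motorway', so the rows that do have 'motorway' in 'street_typ' should get the other value, here (street_cal['cal_length'] * v_mot), right? I think I'm a bit confused about how df.where works.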

Best answer

Here is another suggestion: build a scaling map and apply it with pd.Series.map/replace.

scaler = { 'motorway' : 130, 'motorway_link' : 130, ... }
street_cal['v_length'] = (
    street_cal['cal_length'] * street_cal['street_typ'].map(scaler).fillna(1)
)
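For reference, a self-contained sketch of this approach. The multipliers are the ones from the loop in the question; the sample rows, including the unmapped type 'footway', are made up for illustration.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
street_cal = pd.DataFrame({
    "street_typ": ["motorway", "trunk", "living_street", "footway"],
    "cal_length": [1.0, 2.0, 4.0, 5.0],
})

# Multipliers taken from the loop in the question.
scaler = {
    "motorway": 130, "motorway_link": 130,
    "trunk": 80, "trunk_link": 80,
    "primary": 50, "primary_link": 50,
    "secondary": 50, "secondary_link": 50,
    "tertiary": 50, "tertiary_link": 50,
    "road": 50, "unclassified": 50,
    "residential": 30, "living_street": 15,
}

# map() looks up each street type in the dict; unmapped types become NaN,
# and fillna(1) turns those into a neutral multiplier.
street_cal["v_length"] = (
    street_cal["cal_length"] * street_cal["street_typ"].map(scaler).fillna(1)
)
```

Note one difference from the original loop: with `fillna(1)`, rows whose type is not in the dict keep their plain `cal_length`, whereas the loop leaves them unset. Dropping `.fillna(1)` would leave those rows as NaN instead.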

Regarding "python - Replace iteration with df.where?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49433222/
