
python - Create a column based on another column's values, by assigning a value to each set of string values in the input column

Reposted. Author: 行者123. Updated: 2023-12-01 08:21:43

My problem seems like it must have a simple solution, but I can't work it out. I have tried .loc, np.where, and df.apply.

#input          
datetime dty dtx status
2018-09-16 04:38:17 0.0 0.099854 F-On
2018-09-16 04:38:18 0.0 0.100098 F-On
2018-09-16 04:38:19 0.0 0.000000 S-On
2018-09-16 04:38:20 0.0 0.100098 F-On
2018-09-16 04:38:21 0.0 0.100098 circ
2018-09-16 04:38:22 0.0 0.100098 circInS
2018-09-16 04:38:21 0.0 0.100098 TH
2018-09-16 04:38:21 0.0 0.100098 R
2018-09-16 04:38:21 0.0 0.100098 S

There is a "mapping" in the domain:

(F-On,S-On) becomes 'On'
(circ,TH,circInS) becomes 'fooON'
(R) stays 'R'
(S) stays 'S'

#expected output
datetime dty dtx status grouped_status
2018-09-16 04:38:17 0.0 0.099854 F-On On
2018-09-16 04:38:18 0.0 0.100098 F-On On
2018-09-16 04:38:19 0.0 0.000000 S-On On
2018-09-16 04:38:20 0.0 0.100098 F-On On
2018-09-16 04:38:21 0.0 0.100098 circ fooON
2018-09-16 04:38:22 0.0 0.100098 circInS fooON
2018-09-16 04:38:21 0.0 0.100098 TH fooON
2018-09-16 04:38:21 0.0 0.100098 R R
2018-09-16 04:38:21 0.0 0.100098 S S

The truth value of a Series is ambiguous. Use a.empty, a.bool(),
a.item(), a.any() or a.all().

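I understand that the code below compares an array to a single value; this is ambiguous, so it fails. To compare row by row, I tried df.apply, but it did not give the desired output.

If possible, how can I make all three approaches below work, and which one is the best way to operate row by row?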

#using np.where
df['grouped_status'] = np.where(df['status'] in ('circ','TH','circInS'), 'fooON', df['status'])

#using df.loc
df.loc[df['status'] in ('circ','TH','circInS'),['status']] = 'fooON'
df['grouped_status'] = df['status']

#function for df.apply
def group_status_fn (row):

    val = ""

    if row['grouped_status'] in ('F-On','B-On','S-On'):
        row['grouped_status'] = 'On'
    elif row['grouped_status'] in (circ,TH,circInS):
        row['grouped_status'] = fooON

    elif row['grouped_status'] == 'R':
        val = 'R'
    elif row['grouped_status'] == 'S':
        val = 'S'

    return val

#using df.apply
df["grouped_status2"]=df.apply(group_status_fn, axis = 1)

#out - output column half empty
datetime dHD status grouped_status grouped_status2

2018-09-16 04:38:35 0.000000 F-On F-On
2018-09-16 04:38:36 0.000000 F-On F-On
2018-09-16 04:38:37 0.000000 S-On S-On
2018-09-16 04:38:38 0.000000 S-On S-On
2018-09-16 04:38:39 0.000000 R R R
2018-09-16 04:38:40 0.099854 R R R
2018-09-16 04:38:41 0.100098 R R R
2018-09-16 04:38:42 0.000000 R R R
2018-09-16 04:38:43 0.000000 R R R

Best answer

Use map:

lookup = {'F-On' : 'On', 'S-On' : 'On', 'circ':'fooON', 'TH':'fooON', 'circInS':'fooON', 'R':'R', 'S':'S'}
df['grouped_status'] = df.status.map(lookup)

Output

            datetime  dty       dtx   status grouped_status
2018-09-16 04:38:17 0.0 0.099854 F-On On
2018-09-16 04:38:18 0.0 0.100098 F-On On
2018-09-16 04:38:19 0.0 0.000000 S-On On
2018-09-16 04:38:20 0.0 0.100098 F-On On
2018-09-16 04:38:21 0.0 0.100098 circ fooON
2018-09-16 04:38:22 0.0 0.100098 circInS fooON
2018-09-16 04:38:21 0.0 0.100098 TH fooON
2018-09-16 04:38:21 0.0 0.100098 R R
2018-09-16 04:38:21 0.0 0.100098 S S
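
The question also asked how the np.where, .loc, and df.apply attempts could be made to work. The following is a minimal sketch, not part of the original answer. It assumes the fix is to replace `df['status'] in (...)` with `Series.isin`, which returns one boolean per row instead of forcing the whole Series into a single truth value. The column names `via_where`, `via_loc`, `via_apply`, and `via_map` and the helper `group_status` are made up for illustration, and the sample frame holds only the `status` column from the question.

```python
import numpy as np
import pandas as pd

# Only the status column from the question's sample data.
df = pd.DataFrame({
    "status": ["F-On", "F-On", "S-On", "F-On", "circ", "circInS", "TH", "R", "S"],
})

on_values = ["F-On", "S-On"]
foo_values = ["circ", "TH", "circInS"]

# np.where: isin() yields a boolean mask with one entry per row.
df["via_where"] = np.where(
    df["status"].isin(on_values), "On",
    np.where(df["status"].isin(foo_values), "fooON", df["status"]),
)

# .loc: copy the column first, then overwrite only the masked rows,
# so the original status column is left untouched.
df["via_loc"] = df["status"]
df.loc[df["status"].isin(on_values), "via_loc"] = "On"
df.loc[df["status"].isin(foo_values), "via_loc"] = "fooON"

# apply: applied to a single column, the function receives one scalar
# value per call and must return the new value on every branch.
def group_status(value):
    if value in on_values:
        return "On"
    if value in foo_values:
        return "fooON"
    return value  # 'R', 'S', and anything else stay unchanged

df["via_apply"] = df["status"].apply(group_status)

# map: list only the values that change; unmapped values become NaN
# and are filled back in from the original column.
lookup = {"F-On": "On", "S-On": "On", "circ": "fooON", "TH": "fooON", "circInS": "fooON"}
df["via_map"] = df["status"].map(lookup).fillna(df["status"])

print(df)
```

In the question's apply function, the first two branches assign into `row` but leave `val` as an empty string, and the function reads `grouped_status` rather than `status`. That appears to be why the output column came out half empty. Of the four variants, map and isin are vectorized, so they are usually preferable to a row-wise apply on larger frames.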

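Regarding "python - Create a column based on another column's values, by assigning a value to each set of string values in the input column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54602005/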
