
python - Divide a DataFrame column by a specific cell

Reposted. Author: 行者123. Updated: 2023-12-04 15:11:50

I want to divide a DataFrame column by a specific cell in the same DataFrame.

I have a DataFrame like this:

date      type         score
20201101  experiment1  30
20201101  experiment2  20
20201101  baseline     10
20201102  experiment1  60
20201102  experiment2  50
20201102  baseline     10

I want to compute score_ratio by dividing each score by the "baseline" score for the same date.

date      type         score  score_ratio
20201101  experiment1  30     3
20201101  experiment2  20     2
20201101  baseline     10     1
20201102  experiment1  60     6
20201102  experiment2  50     5
20201102  baseline     10     1

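The score_ratio for (date, type) = (20201101, experiment1) should be obtained by dividing its score by the score of (20201101, baseline); here that is 30/10 = 3. Likewise, for (20201101, experiment2) the score is divided by the same value, that of (20201101, baseline). For a different date, such as (20201102, experiment1), the divisor is that date's baseline, (20201102, baseline).

How can I add this column using DataFrame operations?

So far I have the following, but I am not sure what expression to divide by: df['score_ratio'] = df['score'].div(...)

Edit:

The last line below raises ValueError: Length of values does not match length of index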

    ID      date         type  room     score
0  id1  20201120     baseline     1    450.25
0  id2  20201120  experiment1     1  -3637.24
0  id3  20201121     baseline     1    200.00
1  id4  20201121  experiment1     1    300.00
2  id5  20201120     baseline     2    600.00
3  id6  20201120  experiment1     2    800.00


_df = df.set_index('date', 'room')
d = _df.query('type=="baseline"')
print(_df['score'].div(d['score']).values)
df['score_ratio'] = _df['score'].div(d['score']).values

Best answer

import pandas as pd

# Select the baseline rows into a new DataFrame and merge them back onto the main df by date
g = pd.merge(df, df[df.type.eq('baseline')].drop(columns='type'), how='left', on='date', suffixes=('', '_right'))

# Compute score_ratio and drop the extra column picked up during the merge.
# Note: astype(int) truncates; it is safe here because every ratio is a whole number.
df = g.assign(score_ratio=g.score.div(g.score_right).astype(int)).drop(columns=['score_right'])

print(df)

       date         type  score  score_ratio
0  20201101  experiment1     30            3
1  20201101  experiment2     20            2
2  20201101     baseline     10            1
3  20201102  experiment1     60            6
4  20201102  experiment2     50            5
5  20201102     baseline     10            1
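
The merge above is keyed on date alone. For the frame in the question's edit, each (date, room) pair appears to have its own baseline, so the merge would need both columns as keys. A likely cause of the ValueError there is that df.set_index('date', 'room') does not build a two-level index: the second positional argument is read as the drop parameter, so the index is date only, and dividing two Series with duplicated index labels can return more rows than the frame has. The following is a minimal sketch under those assumptions, not the answerer's code; the names keys, baseline and baseline_score are introduced here for illustration, and the sample rows are copied from the edit.

```python
import pandas as pd

# Sample rows copied from the question's edit.
df = pd.DataFrame({
    "ID": ["id1", "id2", "id3", "id4", "id5", "id6"],
    "date": [20201120, 20201120, 20201121, 20201121, 20201120, 20201120],
    "type": ["baseline", "experiment1"] * 3,
    "room": [1, 1, 1, 1, 2, 2],
    "score": [450.25, -3637.24, 200.00, 300.00, 600.00, 800.00],
})

# Assumption: there is exactly one baseline row per (date, room) pair.
keys = ["date", "room"]

# Baseline scores only, one row per key combination.
baseline = (
    df.loc[df["type"].eq("baseline"), keys + ["score"]]
      .rename(columns={"score": "baseline_score"})
)

# A left merge keeps every original row; validate raises if a key
# combination has more than one baseline row.
merged = df.merge(baseline, how="left", on=keys, validate="many_to_one")

# Keep the ratio as float; astype(int) would truncate values such as 1.5.
merged["score_ratio"] = merged["score"].div(merged["baseline_score"])
result = merged.drop(columns="baseline_score")
print(result)
```

Because the result comes from a merge rather than from assigning a bare array back onto df, its length always matches the original frame, and rows without a matching baseline would get NaN instead of raising.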

How the merge works

# New DataFrame containing only the baseline rows
df1 = df[df.type.eq('baseline')].drop(columns='type')

# Original DataFrame with each date's baseline score attached as score_right
g = pd.merge(df, df1, how='left', on='date', suffixes=('', '_right'))

# New column score_ratio: score divided by that date's baseline score
g = g.assign(score_ratio=g.score.div(g.score_right).astype(int))

# Drop score_right, the helper column picked up during the merge
g = g.drop(columns=['score_right'])
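
As a possible alternative that completes the df['score'].div(...) expression from the question directly, each row's baseline can be looked up by date with Series.map instead of a merge. This is a sketch, not part of the original answer; the name baseline_by_date is introduced here, and it assumes exactly one baseline row per date (map raises if the lookup index has duplicates).

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": [20201101] * 3 + [20201102] * 3,
    "type": ["experiment1", "experiment2", "baseline"] * 2,
    "score": [30, 20, 10, 60, 50, 10],
})

# Baseline scores indexed by date (assumes one baseline row per date).
baseline_by_date = df.loc[df["type"].eq("baseline")].set_index("date")["score"]

# Look up each row's baseline via its date, then divide row by row.
df["score_ratio"] = df["score"].div(df["date"].map(baseline_by_date))
print(df)
```

The ratios come out as floats here; a cast with astype(int) could be added when all ratios are known to be whole numbers, as in the sample data.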

For python - Divide a DataFrame column by a specific cell, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65085165/
