
python - Compute new columns from values in existing columns

Reposted. Author: 行者123. Updated: 2023-12-04 08:17:41

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame(
    {
        "customer": ['c1', 'c2', 'c3', 'c4', 'c5'],
        "contract_year": [2018, 2020, 2019, 2018, 2019],
        "amount": [3000, 1000, 3000, 6000, 6000],
        "term": [3, 1, 2, 3, 3]
    }
)

customer contract_year amount term
0 c1 2018 3000 3
1 c2 2020 1000 1
2 c3 2019 3000 2
3 c4 2018 6000 3
4 c5 2019 6000 3
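My goal: for each customer, divide the amount by the number of years in "term". For example, customer c1 would pay
df["amount"]/df["term"]
in each of the next "term" years, starting from "contract_year". These amounts should go into new columns, one per payment year.
The final result should look like this: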
   customer  contract_year  amount  term  2018  2019  2020  2021
0        c1           2018    3000     3  1000  1000  1000
1        c2           2020    1000     1              1000
2        c3           2019    3000     2        1500  1500
3        c4           2018    6000     3  2000  2000  2000
4        c5           2019    6000     3        2000  2000  2000
Thanks in advance!

Best answer

Let's do:

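# Repeat each row once per year of its term
s = df.reindex(df.index.repeat(df['term']))
# Yearly instalment (floor division, so any remainder is dropped)
s['val'] = s['amount'].floordiv(s['term'])
# Payment year: contract_year plus a 0-based counter within each original row
s['year'] = s['contract_year'] + s.groupby(level=0).cumcount()

# One column per payment year
s.pivot_table(values='val', index=[*df.columns], columns='year', aggfunc='first').reset_index()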
Details: reindex the DataFrame using index.repeat, so that each row appears once per year of its term:
print(s)

customer contract_year amount term
0 c1 2018 3000 3
0 c1 2018 3000 3
0 c1 2018 3000 3
1 c2 2020 1000 1
2 c3 2019 3000 2
2 c3 2019 3000 2
3 c4 2018 6000 3
3 c4 2018 6000 3
3 c4 2018 6000 3
4 c5 2019 6000 3
4 c5 2019 6000 3
4 c5 2019 6000 3
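The row-repetition step can be tried in isolation. The following is a minimal sketch on a toy frame; the names `toy`, `name`, `n`, and `expanded` are made up for illustration and are not part of the question's data. It assumes the original index is unique, since reindex raises an error on an axis with duplicate labels.

```python
import pandas as pd

# Toy frame (hypothetical data): row "a" should appear twice, row "b" three times
toy = pd.DataFrame({"name": ["a", "b"], "n": [2, 3]})

# Index.repeat repeats each index label as many times as the matching value in "n"
repeated_index = toy.index.repeat(toy["n"])

# reindex with the repeated labels duplicates the corresponding rows
expanded = toy.reindex(repeated_index)
print(expanded)
```

The duplicated index labels are kept on purpose: they are what later lets the rows be grouped back by their original position.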
Floor-divide amount by term to spread the amount evenly across the term years:
print(s)

customer contract_year amount term val
0 c1 2018 3000 3 1000
0 c1 2018 3000 3 1000
0 c1 2018 3000 3 1000
1 c2 2020 1000 1 1000
2 c3 2019 3000 2 1500
2 c3 2019 3000 2 1500
3 c4 2018 6000 3 2000
3 c4 2018 6000 3 2000
3 c4 2018 6000 3 2000
4 c5 2019 6000 3 2000
4 c5 2019 6000 3 2000
4 c5 2019 6000 3 2000
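One thing worth checking before relying on floordiv: it drops any remainder, which is harmless here because every amount is an exact multiple of its term. The sketch below uses made-up numbers (not the question's data) to compare floordiv with true division via div; which one is appropriate depends on how leftover amounts should be handled, which the question does not specify.

```python
import pandas as pd

# Hypothetical values: 1000 over 3 years does not divide evenly
amount = pd.Series([1000, 3000])
term = pd.Series([3, 2])

floor_share = amount.floordiv(term)  # integer instalments, remainder discarded
exact_share = amount.div(term)       # float instalments, nothing discarded

# Amount per contract that floor division leaves unallocated
lost = amount - floor_share * term

print(pd.DataFrame({"floor": floor_share, "exact": exact_share, "lost": lost}))
```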
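Group the DataFrame on level=0 (the repeated index) and use cumcount to create a sequential counter, then add this counter to contract_year to generate the subsequent term years: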
print(s)

customer contract_year amount term val year
0 c1 2018 3000 3 1000 2018
0 c1 2018 3000 3 1000 2019
0 c1 2018 3000 3 1000 2020
1 c2 2020 1000 1 1000 2020
2 c3 2019 3000 2 1500 2019
2 c3 2019 3000 2 1500 2020
3 c4 2018 6000 3 2000 2018
3 c4 2018 6000 3 2000 2019
3 c4 2018 6000 3 2000 2020
4 c5 2019 6000 3 2000 2019
4 c5 2019 6000 3 2000 2020
4 c5 2019 6000 3 2000 2021
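To see why grouping on level=0 works, here is a small sketch with a hand-built frame whose index already contains repeated labels (the data and the name `rows` are illustrative only). Each distinct index label forms one group, and cumcount numbers the rows within a group starting at 0.

```python
import pandas as pd

# Hypothetical frame: label 0 appears three times, label 1 once
rows = pd.DataFrame({"contract_year": [2018, 2018, 2018, 2020]}, index=[0, 0, 0, 1])

# 0-based position of each row within its index label
offset = rows.groupby(level=0).cumcount()

# The first copy keeps the contract year, later copies move one year ahead each
rows["year"] = rows["contract_year"] + offset
print(rows)
```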
Use pivot_table to reshape the DataFrame so that each payment year becomes its own column:
year customer  contract_year  amount  term    2018    2019    2020    2021
0 c1 2018 3000 3 1000.0 1000.0 1000.0 NaN
1 c2 2020 1000 1 NaN NaN 1000.0 NaN
2 c3 2019 3000 2 NaN 1500.0 1500.0 NaN
3 c4 2018 6000 3 2000.0 2000.0 2000.0 NaN
4 c5 2019 6000 3 NaN 2000.0 2000.0 2000.0
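The pivoted result holds floats because NaN marks the years without a payment, and the column axis carries the name year. If integer values and a plain header are preferred, the steps can be wrapped up roughly as follows. This is a sketch, not part of the original answer: the function name `spread_payments` is hypothetical, and the cast to the nullable Int64 dtype assumes the instalments are whole numbers (true here because of floordiv).

```python
import pandas as pd


def spread_payments(df):
    """Spread each contract's amount over its term, one column per year."""
    # One row per payment year
    s = df.reindex(df.index.repeat(df["term"]))
    s["val"] = s["amount"].floordiv(s["term"])
    s["year"] = s["contract_year"] + s.groupby(level=0).cumcount()

    wide = s.pivot_table(
        values="val", index=list(df.columns), columns="year", aggfunc="first"
    )
    # Nullable integers keep missing years as <NA> instead of forcing floats
    wide = wide.astype("Int64")
    # Drop the "year" label from the column axis, then restore the key columns
    return wide.rename_axis(columns=None).reset_index()


df = pd.DataFrame(
    {
        "customer": ["c1", "c2", "c3", "c4", "c5"],
        "contract_year": [2018, 2020, 2019, 2018, 2019],
        "amount": [3000, 1000, 3000, 6000, 6000],
        "term": [3, 1, 2, 3, 3],
    }
)
result = spread_payments(df)
print(result)
```

Note that pivot_table uses the original columns as keys, so two rows that are identical in every column would be merged into one; adding a unique identifier column first would avoid that.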

Regarding "python - Compute new columns from values in existing columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65642320/
