gpt4 book ai didi

python - 将数量 X 随机分配给 n 个项目

转载 作者:太空宇宙 更新时间:2023-11-03 19:46:53 24 4
gpt4 key购买 nike

以下示例数据可用:

df_1

DATE      AMOUNT
20170101 50000
20170102 55000
20170103 60000
20170104 65000
... ...
20171231 150000

df_2

DATE      INVNR   AMOUNT
20170101 ... ???
20170101 ... ???
20170101 ... ???
20170101 ... ???
20170102 ... ???
20170102 ... ???
20170103 ... ???
... ... ???
20171231 ... ???
20171231 ... ???
20171231 ... ???
20171231 ... ???

这是预期的示例输出:

df_ramdomly_distributed_amounts

DATE      INVNR   AMOUNT
20170101 ... 14879
20170101 ... 6523
20170101 ... 8596
20170101 ... 20002
20170102 ... 31548
20170102 ... 23452
20170103 ... 60000
... ... ???
20171231 ... 26428
20171231 ... 72658
20171231 ... 326
20171231 ... 50593

INVNR包含发票编号,但这些是系统导出的并且不一致DATE 始终采用给定的格式,并且可以根据需要进行操作(df_1 + df_2 目前将其保留为字符串列)。

因此,任务是将 df_1 中存储的每个唯一日期的 AMOUNT 以 1:m 关系随机分配到 df_2 中的发票。每天的发票金额随时间变化。 df_2 目前持有约 200,000 张发票。我现在无法解决解决方案。非常感谢您的任何提示!

最佳答案

试试这个:-

import pandas as pd
from random import randint

# intialise data of lists & Create DataFrame
data1 = {'DATE': ['20170101', '20170102', '20170103', '20170104'],
'AMOUNT': ['500', '600', '400', '800']}
df_1 = pd.DataFrame(data1)
data2 = {'DATE': ['20170101', '20170101', '20170101', '20170102', '20170102', '20170103', '20170103', '20170104'],
'INVNR': ['a1', 'a2', 'a3', 'b1', 'b2', 'c1', 'c2', 'd1']}

df_2 = pd.DataFrame(data2)

print("------------- DF 1 --------------------")
print(df_1)
print("------------- DF 2 --------------------")
print(df_2)

# process each date group and split the amount among them
def splitAndAddAmount(gr, df_am):
amount = int(df_am.loc[df_am['DATE'] == gr['DATE'].iloc[0], 'AMOUNT'].iloc[0])
grpLength = int(len(gr))
assert amount >= grpLength >= 1
pieces = []
for idx in range(grpLength - 1):
pieces.append(randint(1, amount - sum(pieces) - grpLength + idx))
pieces.append(amount - sum(pieces))
gr['AMNT'] = pieces
return gr


print("------------- DF FINAL --------------------")
df_final = df_2.groupby('DATE').apply(splitAndAddAmount, df_1)
print(df_final)

输出:-

------------- DF 1 --------------------
DATE AMOUNT
0 20170101 500
1 20170102 600
2 20170103 400
3 20170104 800
------------- DF 2 --------------------
DATE INVNR
0 20170101 a1
1 20170101 a2
2 20170101 a3
3 20170102 b1
4 20170102 b2
5 20170103 c1
6 20170103 c2
7 20170104 d1
------------- DF FINAL --------------------
DATE INVNR AMNT
0 20170101 a1 29
1 20170101 a2 148
2 20170101 a3 323
3 20170102 b1 23
4 20170102 b2 577
5 20170103 c1 261
6 20170103 c2 139
7 20170104 d1 800

关于python - 将数量 X 随机分配给 n 个项目,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/60072065/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com