gpt4 book ai didi

python - 如何将缺失行的记录添加到 Dataframe

转载 作者:行者123 更新时间:2023-12-04 04:10:22 25 4
gpt4 key购买 nike

我有一个数据框:df

    Name        Date ID  Amount
0 Faye 2018-12-31 A 2
1 Faye 2019-03-31 A 1
2 Faye 2019-06-30 A 5
3 Faye 2019-09-30 B 2
4 Faye 2019-09-30 C 2
5 Faye 2019-12-31 A 4
6 Faye 2020-03-01 A 1
7 Faye 2020-03-01 B 1
8 Mike 2018-12-31 A 4
9 Mike 2019-03-31 A 4
10 Mike 2019-06-30 B 3

对于每个 NameDateID,我计算了 Amount 的百分比变化新列中的上一个 Date。如果没有以前的条目,那么我添加 New:

df['% Change'] = (df.sort_values('Date').groupby(['Name', 'ID']).Amount.pct_change())
df['% Change'] = df['% Change'].fillna('New')

但我还想为相反的情况创建一个条目,其中 NameDateID、group did 以前存在,但在下一个日期不存在;这样输出看起来像:

    Name        Date ID  Amount % Change
0 Faye 2018-12-31 A 2 New
1 Faye 2019-03-31 A 1 -0.5
2 Faye 2019-06-30 A 5 4
3 Faye 2019-09-30 A 0 Sold
4 Faye 2019-09-30 B 2 New
5 Faye 2019-09-30 C 2 New
6 Faye 2019-12-31 A 4 New
7 Faye 2020-03-01 A 1 -0.75
8 Faye 2020-03-01 B 1 -0.5
9 Mike 2018-12-31 A 4 New
10 Mike 2019-03-31 A 4 0
11 Mike 2019-06-30 A 0 Sold
12 Mike 2019-06-30 B 3 New

如果有帮助,我正在尝试模拟 this site 的方式处理此类案件。

最佳答案

解决方法:

# run you original code to ide
df['% Change'] = (df.sort_values('Date').groupby(['Name', 'ID']).Amount.pct_change())
df['% Change'] = df['% Change'].fillna('New')

# Create a dataframe of all te dates.
all_dates = pd.DataFrame({"Date": df["Date"].unique()})
all_dates["one"] = 1

# Create a dasta frame of all the possible recored (all combinations of id-name-date)
name_ids = df[["Name", "ID"]].drop_duplicates()
name_ids["one"] = 1

all_possible_records = pd.merge(all_dates, name_ids, on="one")
all_possible_records = pd.merge(all_possible_records, df, on = ["Date", "Name", "ID"], how ="left")
all_possible_records.drop("one", axis = "columns", inplace = True)
all_possible_records.sort_values(["Name", "ID", "Date"], inplace=True)

# For every record, shift 1 to see if it had any value in the previous querter.
all_possible_records["prev_q"] = all_possible_records.groupby(["Name", "ID"]).shift(1)["Amount"]

# records in which change is NaN - but there was a value in the previous querter - are 'sold'
all_possible_records.loc[all_possible_records["% Change"].isna() & all_possible_records.prev_q.notna(), "% Change"]="Sold"

# Drop redundent records.
res = all_possible_records.dropna(axis="rows", subset=["% Change"])
res

结果是:

enter image description here

关于python - 如何将缺失行的记录添加到 Dataframe,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61843328/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com