I have a DataFrame, df:
Name Date ID Amount
0 Faye 2018-12-31 A 2
1 Faye 2019-03-31 A 1
2 Faye 2019-06-30 A 5
3 Faye 2019-09-30 B 2
4 Faye 2019-09-30 C 2
5 Faye 2019-12-31 A 4
6 Faye 2020-03-01 A 1
7 Faye 2020-03-01 B 1
8 Mike 2018-12-31 A 4
9 Mike 2019-03-31 A 4
10 Mike 2019-06-30 B 3
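For each Name, Date, ID combination, I calculate the percentage change of Amount relative to the previous Date and store it in a new column. If there is no previous entry, I fill in New: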
df['% Change'] = (df.sort_values('Date').groupby(['Name', 'ID']).Amount.pct_change())
df['% Change'] = df['% Change'].fillna('New')
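For reference, here is a minimal reproducible sketch of the snippet above, with the sample data typed in by hand. Parsing Date as datetime and the intermediate name pct are assumptions for illustration, not part of the original post.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["Faye"] * 8 + ["Mike"] * 3,
    "Date": pd.to_datetime([
        "2018-12-31", "2019-03-31", "2019-06-30", "2019-09-30", "2019-09-30",
        "2019-12-31", "2020-03-01", "2020-03-01",
        "2018-12-31", "2019-03-31", "2019-06-30",
    ]),
    "ID": ["A", "A", "A", "B", "C", "A", "A", "B", "A", "A", "B"],
    "Amount": [2, 1, 5, 2, 2, 4, 1, 1, 4, 4, 3],
})

# Percentage change of Amount within each (Name, ID) group, in date order.
pct = df.sort_values("Date").groupby(["Name", "ID"])["Amount"].pct_change()

# The first row of each group has no predecessor (NaN), so label it "New".
# Casting to object first keeps floats and strings in one column.
df["% Change"] = pct.astype(object).where(pct.notna(), "New")

print(df)
```

Note that pct_change compares each row with the previous existing row of its group, so Faye's A entry on 2019-12-31 is compared with 2019-06-30 and is not labeled New, even though A is missing on 2019-09-30.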
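But I also want to create an entry for the opposite case, where a Name, ID group existed on a previous Date but does not exist on the next one, so that the output looks like this: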
Name Date ID Amount % Change
0 Faye 2018-12-31 A 2 New
1 Faye 2019-03-31 A 1 -0.5
2 Faye 2019-06-30 A 5 4
3 Faye 2019-09-30 A 0 Sold
4 Faye 2019-09-30 B 2 New
5 Faye 2019-09-30 C 2 New
6 Faye 2019-12-31 A 4 New
7 Faye 2020-03-01 A 1 -0.75
8 Faye 2020-03-01 B 1 -0.5
9 Mike 2018-12-31 A 4 New
10 Mike 2019-03-31 A 4 0
11 Mike 2019-06-30 A 0 Sold
12 Mike 2019-06-30 B 3 New
If it helps, I am trying to emulate the way this site handles such cases.
Best answer
Solution:
import pandas as pd

# Run your original code to identify new records.
df['% Change'] = (df.sort_values('Date').groupby(['Name', 'ID']).Amount.pct_change())
df['% Change'] = df['% Change'].fillna('New')
# Create a dataframe of all the dates.
all_dates = pd.DataFrame({"Date": df["Date"].unique()})
all_dates["one"] = 1
# Create a dataframe of all the possible records (all combinations of name-id-date).
# The constant "one" column serves as a dummy key for a cross join.
name_ids = df[["Name", "ID"]].drop_duplicates()
name_ids["one"] = 1
all_possible_records = pd.merge(all_dates, name_ids, on="one")
all_possible_records = pd.merge(all_possible_records, df, on=["Date", "Name", "ID"], how="left")
all_possible_records.drop("one", axis="columns", inplace=True)
all_possible_records.sort_values(["Name", "ID", "Date"], inplace=True)
# For every record, shift by 1 to see whether it had a value in the previous quarter.
all_possible_records["prev_q"] = all_possible_records.groupby(["Name", "ID"])["Amount"].shift(1)
# Records in which the change is NaN, but there was a value in the previous quarter, are 'Sold'.
all_possible_records.loc[all_possible_records["% Change"].isna() & all_possible_records.prev_q.notna(), "% Change"] = "Sold"
# Drop redundant records (combinations that neither exist nor were just sold).
res = all_possible_records.dropna(axis="rows", subset=["% Change"])
res
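The result, res, contains the original rows plus a Sold row for each Name, ID pair that was present on one date and missing on the next.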
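A possible variation, sketched under several assumptions that are not in the original answer: the date grid is built per Name (so Mike is not extended to dates only Faye has), Sold rows get an Amount of 0, and an entry that reappears after a gap is labeled New instead of being compared with an older row. The helper name label_changes is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Faye"] * 8 + ["Mike"] * 3,
    "Date": pd.to_datetime([
        "2018-12-31", "2019-03-31", "2019-06-30", "2019-09-30", "2019-09-30",
        "2019-12-31", "2020-03-01", "2020-03-01",
        "2018-12-31", "2019-03-31", "2019-06-30",
    ]),
    "ID": ["A", "A", "A", "B", "C", "A", "A", "B", "A", "A", "B"],
    "Amount": [2, 1, 5, 2, 2, 4, 1, 1, 4, 4, 3],
})


def label_changes(df):
    """Hypothetical helper: label each (Name, ID) row as New, Sold, or a fraction."""
    # Build every (Date, ID) combination within each Name.
    dates = df[["Name", "Date"]].drop_duplicates()
    ids = df[["Name", "ID"]].drop_duplicates()
    grid = dates.merge(ids, on="Name")

    # Left-join the real data; combinations that do not exist get NaN Amount.
    full = grid.merge(df, on=["Name", "Date", "ID"], how="left")
    full = full.sort_values(["Name", "ID", "Date"]).reset_index(drop=True)

    # Amount on the immediately preceding date of the same (Name, ID) pair.
    prev = full.groupby(["Name", "ID"])["Amount"].shift(1)
    present = full["Amount"].notna()
    had_prev = prev.notna()

    # Fractional change where both values exist; labels otherwise.
    change = (full["Amount"] / prev - 1).astype(object)
    change[present & ~had_prev] = "New"
    change[~present & had_prev] = "Sold"
    full["% Change"] = change

    # Assumption: a sold position is reported with Amount 0.
    full["Amount"] = full["Amount"].fillna(0).astype(int)

    # Keep real rows and Sold rows; drop combinations absent on both dates.
    out = full[present | had_prev]
    return out.sort_values(["Name", "Date", "ID"]).reset_index(drop=True)


result = label_changes(df)
print(result)
```

With the sample data this yields 15 rows. It reproduces the A rows and the Mike rows of the expected table in the question, but it also emits Sold rows for Faye's B and C on 2019-12-31 and labels Faye's B on 2020-03-01 as New, which the expected table does not do, so the rules may need adjusting to the intended definition of "previous date".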
Regarding python - how to add records for missing rows to a DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61843328/