python - pandas: How to keep the last `n` records of each group, sorted by another variable?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:12:32

I want to keep the last n rows of each group, sorted by a variable var_to_sort, using pandas.

This is what I do now: I sort the DataFrame below by date, group it by names, and then use tail(n) to get the last n rows of each group.

from datetime import date

import pandas as pd

data = [
    ['tom', date(2018, 2, 1), "I want this"],
    ['tom', date(2018, 1, 1), "Don't want"],
    ['nick', date(2019, 4, 1), "Don't want"],
    ['nick', date(2019, 5, 1), "I want this"]]

# Create the pandas DataFrame
df = pd.DataFrame(data)
df.columns = ["names", "date", "result"]

# Sort by date so that the latest rows come last within each group
df.sort_values("date", inplace=True)

# Keep the last row of each group (n = 1)
df.groupby("names").tail(1)

Is there a more efficient way to do this? What if the dataset is already indexed by "date" or by ["date", "names"]?

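Best answer

I think your solution is fine. You can also call sort_values without inplace, so that the whole operation can be written as one chained expression.

For the other questions (data indexed by date, or by date and names):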

from datetime import date

import pandas as pd

data = [
    ['tom', date(2018, 2, 1), "I want this"],
    ['tom', date(2018, 1, 1), "Don't want"],
    ['nick', date(2019, 4, 1), "Don't want"],
    ['nick', date(2019, 5, 1), "I want this"]]

# Create the pandas DataFrame
df = pd.DataFrame(data)
df.columns = ["names", "date", "result"]

# Chained version: sort by date, then keep the last row per name
df1 = df.sort_values("date").groupby("names").tail(1)
print(df1)
  names        date       result
0   tom  2018-02-01  I want this
3  nick  2019-05-01  I want this

# Data indexed by date
df2 = df.set_index('date')
print(df2)
           names       result
date
2018-02-01   tom  I want this
2018-01-01   tom   Don't want
2019-04-01  nick   Don't want
2019-05-01  nick  I want this

# Sort by the index instead of a column, then group as before
df22 = df2.sort_index().groupby("names").tail(1)
print(df22)
           names       result
date
2018-02-01   tom  I want this
2019-05-01  nick  I want this

# Data indexed by both date and names (MultiIndex)
df3 = df.set_index(['date', 'names'])
print(df3)
                       result
date       names
2018-02-01 tom    I want this
2018-01-01 tom     Don't want
2019-04-01 nick    Don't want
2019-05-01 nick   I want this

# Sort by the index, then group by the second index level (names)
df33 = df3.sort_index().groupby(level=1).tail(1)
print(df33)
                       result
date       names
2018-02-01 tom    I want this
2019-05-01 nick   I want this
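
The examples above all use tail(1). As a minimal sketch of the general "last n" case, the same sort-then-group pattern can be wrapped in a small helper. The function name last_n_per_group and the six-row sample data below are hypothetical and only serve as illustration; they are not part of the original question or answer.

```python
from datetime import date

import pandas as pd


def last_n_per_group(df, group_col, sort_col, n):
    """Return the last n rows of each group after sorting by sort_col.

    Hypothetical helper: it only wraps the sort_values / groupby / tail chain.
    """
    return df.sort_values(sort_col).groupby(group_col).tail(n)


# Hypothetical data: three rows per name, deliberately out of date order
df = pd.DataFrame(
    {
        "names": ["tom", "tom", "tom", "nick", "nick", "nick"],
        "date": [
            date(2018, 3, 1), date(2018, 1, 1), date(2018, 2, 1),
            date(2019, 4, 1), date(2019, 6, 1), date(2019, 5, 1),
        ],
        "result": ["keep", "drop", "keep", "drop", "keep", "keep"],
    }
)

# Column-based variant: the two most recent rows per name
last_two = last_n_per_group(df, "names", "date", n=2)
print(last_two)

# Index-based variant: same idea when date and names form a MultiIndex;
# the group level is referenced by name instead of by position
by_index = (
    df.set_index(["date", "names"])
    .sort_index()
    .groupby(level="names")
    .tail(2)
)
print(by_index)
```

Both variants keep the rows in sorted order rather than in the original row order, so the result may need a further sort_index() if the original order matters.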

A similar question to "python - pandas: How to keep the last `n` records of each group, sorted by another variable?" can be found on Stack Overflow: https://stackoverflow.com/questions/57550884/
