
python - Pandas Groupby: aggregate specific columns while showing all columns in the result

Reposted. Author: 行者123. Updated: 2023-12-04 08:56:21

I want to group by id and sum, but have the result show all columns.
Sample code

import pandas as pd
import numpy as np

mre = [
["2018-1", "Sold", 109000.0, "Appartement", 73.0, 4.0],
["2018-1", "Sold", 109000.0, "Appartement", "NaN", 0.0],
["2018-2", "Sold", 239300.0, "House", 163.0, 4.0],
["2018-2", "Sold", 239300.0, "House", 51.0, 2.0],
["2018-2", "Sold", 239300.0, "House", 51.0, 2.0]
]

df = pd.DataFrame(mre)

# Rename columns
df.columns = ["_idMutation", "typeOfSearch",
              "price", "typeOfBuilding", "surface", "nbRoom"]

df["surface"] = df["surface"].astype(float)

print(df)
Base DataFrame
  _idMutation typeOfSearch     price typeOfBuilding  surface  nbRoom
0 2018-1 Sold 109000.0 Appartement 73.0 4.0
1 2018-1 Sold 109000.0 Appartement NaN 0.0
2 2018-2 Sold 239300.0 House 163.0 4.0
3 2018-2 Sold 239300.0 House 51.0 2.0
4 2018-2 Sold 239300.0 House 51.0 2.0
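Expected result
Group by _idMutation, sum surface and sum nbRoom, and leave the other columns unchanged. I want to show all columns, drop the duplicate _idMutation rows, and show the result of the groupby.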
  _idMutation typeOfSearch     price typeOfBuilding surface  nbRoom
0 2018-1 Sold 109000.0 Appartement 73.0 4.0
1 2018-2 Sold 239300.0 House 265.0 8.0
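Current code
The following solution produces the expected result. I have 14.6 million rows, and the solution I came up with does not look optimized.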
# Groupby on _idMutation & sum ["surface", "nbRoom"]
gb_df = df[["surface", "nbRoom"]].groupby(df["_idMutation"]).sum()

# Delete duplicates _idMutation
df.drop_duplicates(subset=["_idMutation"], inplace=True)

# Set _idMutation as df index
df.set_index("_idMutation", inplace=True)

# Concat df with gb_df
df = pd.concat(
    [df[["typeOfSearch", "price", "typeOfBuilding"]], gb_df], axis=1)

Best answer

We can use GroupBy.agg with a dictionary that sets the aggregation method we want for each column. In this case we only need first and sum:

dfg = df.groupby("_idMutation", as_index=False).agg({
"typeOfSearch": "first",
"price": "first",
"typeOfBuilding": "first",
"surface": "sum",
"nbRoom": "sum"
})
  _idMutation typeOfSearch     price typeOfBuilding  surface  nbRoom
0 2018-1 Sold 109000.0 Appartement 73.0 4.0
1 2018-2 Sold 239300.0 House 265.0 8.0
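
When a frame has many columns, writing the dictionary by hand gets tedious. As a minimal sketch, assuming every column other than the group key and the summed columns should simply take its first value, the mapping can be built from df.columns. The names key, sum_cols and agg_map are hypothetical helpers, not part of the original answer.

```python
import pandas as pd

# Same sample rows as in the question, with a real NaN for the missing surface.
df = pd.DataFrame(
    [
        ["2018-1", "Sold", 109000.0, "Appartement", 73.0, 4.0],
        ["2018-1", "Sold", 109000.0, "Appartement", float("nan"), 0.0],
        ["2018-2", "Sold", 239300.0, "House", 163.0, 4.0],
        ["2018-2", "Sold", 239300.0, "House", 51.0, 2.0],
        ["2018-2", "Sold", 239300.0, "House", 51.0, 2.0],
    ],
    columns=["_idMutation", "typeOfSearch", "price",
             "typeOfBuilding", "surface", "nbRoom"],
)

key = "_idMutation"               # hypothetical name for the group key
sum_cols = ["surface", "nbRoom"]  # hypothetical list of columns to sum

# Assumption: every remaining column keeps its first value per group.
agg_map = {
    col: ("sum" if col in sum_cols else "first")
    for col in df.columns
    if col != key
}

dfg = df.groupby(key, as_index=False).agg(agg_map)
print(dfg)
```

Note that "first" in a groupby takes the first non-null value of each column within a group, so if the non-summed columns can differ or contain missing values inside one _idMutation, the result may combine values from different rows.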

Regarding python - Pandas Groupby: aggregate specific columns while showing all columns in the result, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63814565/
