
Python pandas: creating a new column using groupby with a custom agg function

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:46:22

My DataFrame:

from random import random, randint
from pandas import DataFrame

t = DataFrame({"metasearch": ["A", "B", "A", "B", "A", "B", "A", "B"],
               "market": ["A", "B", "A", "B", "A", "B", "A", "B"],
               "bid": [random() for i in range(8)],
               "clicks": [randint(0, 10) for i in range(8)],
               "country_code": ["A", "A", "A", "A", "A", "B", "A", "B"]})

I want to fit a LinearRegression for each market, so I:

1) Group the df - groups = t.groupby(by="market")

2) Prepare a function that fits the model on a group -

from sklearn.linear_model import LinearRegression

def group_fitter(group):
    lr = LinearRegression()
    X = group["bid"].fillna(0).values.reshape(-1, 1)
    y = group["clicks"].fillna(0)
    lr.fit(X, y)
    return lr.coef_[0]  # THIS IS A SCALAR

3) Create a new Series with market as the index and the coefficient as the value:

s = groups.transform(group_fitter)

But step 3 fails: KeyError: ('bid_cpc', 'occurred at index bid')

Best answer

I think you need apply instead of transform, because the function works with several columns at once; to add the result as a new column, use join:

from sklearn.linear_model import LinearRegression

def group_fitter(group):
    lr = LinearRegression()
    X = group["bid"].fillna(0).values.reshape(-1, 1)
    y = group["clicks"].fillna(0)
    lr.fit(X, y)
    return lr.coef_[0]  # THIS IS A SCALAR

groups = t.groupby(by="market")
df = t.join(groups.apply(group_fitter).rename('new'), on='market')
print(df)
        bid  clicks country_code market metasearch       new
0  0.462734       9            A      A          A -8.632301
1  0.438869       5            A      B          B  6.690289
2  0.047160       9            A      A          A -8.632301
3  0.644263       0            A      B          B  6.690289
4  0.579040       0            A      A          A -8.632301
5  0.820389       6            B      B          B  6.690289
6  0.112341       5            A      A          A -8.632301
7  0.432502       0            B      B          B  6.690289
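
To see why transform fails here while apply works, a small deterministic sketch may help. It is not part of the original answer. The data values are made up, the name group_slope is hypothetical, and np.polyfit is swapped in for LinearRegression only to keep the example free of scikit-learn (for a single feature with an intercept, both give the ordinary least squares slope). Passing a function to transform calls it on one column at a time, so group["bid"] becomes a label lookup on a Series and raises; apply passes the whole group as a DataFrame. Series.map is shown as one possible alternative to join for broadcasting the per-market value back onto the rows.

```python
import numpy as np
import pandas as pd

# Made-up, deterministic stand-in for the random data in the question.
t = pd.DataFrame({
    "market": ["A", "B", "A", "B", "A", "B"],
    "bid":    [1.0, 1.0, 2.0, 2.0, 3.0, 3.0],
    "clicks": [2, 5, 4, 4, 6, 3],
})

def group_slope(group):
    # Reads two columns, so it needs the whole group as a DataFrame.
    slope, _intercept = np.polyfit(group["bid"], group["clicks"], deg=1)
    return slope

groups = t.groupby("market")

# transform hands the function a single column (a Series) at a time,
# so group["bid"] is a label lookup on that Series and raises.
try:
    groups.transform(group_slope)
    transform_failed = False
except Exception as exc:
    transform_failed = True
    print("transform failed:", type(exc).__name__)

# apply hands the function each group as a DataFrame and returns
# one scalar per market, indexed by market.
coefs = groups[["bid", "clicks"]].apply(group_slope)

# Broadcast the per-market slope back to every row.
t["new"] = t["market"].map(coefs)
print(t)
```

With this data the slope is about 2.0 for market A and about -1.0 for market B, and every row carries the slope of its own market in the new column.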

For more on creating a new column using groupby with a custom agg function in Python pandas, see the similar question on Stack Overflow: https://stackoverflow.com/questions/48973793/
