
Pandas: find the distance among members of a group

Reposted. Author: bug小助手. Updated: 2023-10-24 21:42:37



The given data set contains the following columns and rows:



Brand | Sector | Year | Price | Sales
B1    | S1     | 2023 | 45900 | 400
B1    | S1     | 2022 | 45000 | 500
B2    | S1     | 2022 | 45400 | 520



A group is defined as the rows that share the same Brand and Sector, and the distance is defined as the change in Price divided by the change in Sales, computed within a group. Let's say there are 4 members (A, B, C, D) for Brand B1 and Sector S1, each with its own Sales count and Price. We can form 6 pairs (AB, AC, AD, BC, BD, CD), and for each pair we can compute the change in Price per change in Sales:
(A.Price - B.Price) / (A.Sales - B.Sales)
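
To make the pairing concrete, here is a minimal sketch in plain Python. The member labels come from the question; the helper name `distance` is made up for illustration, and the two rows passed to it are the B1/S1 rows from the sample data.

```python
from itertools import combinations

# Unordered pairs of the four members named in the question.
members = ["A", "B", "C", "D"]
pairs = ["".join(p) for p in combinations(members, 2)]
print(pairs)  # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']


def distance(a, b):
    """Change in Price divided by change in Sales (hypothetical helper)."""
    return (a["Price"] - b["Price"]) / (a["Sales"] - b["Sales"])


# The two B1/S1 rows from the sample data.
row_2023 = {"Price": 45900, "Sales": 400}
row_2022 = {"Price": 45000, "Sales": 500}
print(distance(row_2023, row_2022))  # -9.0
```

Swapping the two rows flips the sign of both the numerator and the denominator, so the ratio is the same for AB and BA.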


I tried things like df.diff() and df.rolling(), but those can only compare an element with the ones next to it.


Comments

Please add a more complete data set (showing the members that you talk about) and your expected output.


Recommended answer

Typically, when you want a pairwise calculation within a group, you merge the frame with itself on the group keys, here Brand and Sector. After that, all you need is a straightforward calculation:


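# Self-merge on the group keys: every row is paired with every row of its group.
(df.merge(df, on=['Brand', 'Sector'])
   # Drop rows paired with themselves (assumes Year is unique within a group).
   .query('Year_x != Year_y')
   # Change in Price divided by change in Sales for each pair.
   .assign(Change=lambda x: x['Price_x'].sub(x['Price_y']).div(x['Sales_x'] - x['Sales_y']))
)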

Output:



  Brand Sector  Year_x  Price_x  Sales_x  Year_y  Price_y  Sales_y  Change
1    B1     S1    2023    45900      400    2022    45000      500    -9.0
2    B1     S1    2022    45000      500    2023    45900      400    -9.0
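
For anyone who wants to reproduce this, here is a self-contained sketch that builds the three sample rows from the question and applies the same self-merge. The variable name `pairs` is only illustrative, and the steps are split up for readability; it assumes Year is unique within each Brand/Sector group.

```python
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "Brand": ["B1", "B1", "B2"],
    "Sector": ["S1", "S1", "S1"],
    "Year": [2023, 2022, 2022],
    "Price": [45900, 45000, 45400],
    "Sales": [400, 500, 520],
})

# Self-merge on the group keys: every row is paired with every row of its group.
pairs = df.merge(df, on=["Brand", "Sector"])

# Drop self-pairs (this relies on Year being unique within a group).
pairs = pairs.query("Year_x != Year_y")

# Change in Price divided by change in Sales for each remaining pair.
pairs = pairs.assign(
    Change=(pairs["Price_x"] - pairs["Price_y"])
    / (pairs["Sales_x"] - pairs["Sales_y"])
)
print(pairs)
```

B2 has only one row, so it forms no pair and drops out. Note that if two members of a group have identical Sales, the division yields inf (or NaN when the prices match too) rather than raising an error, so you may want to filter those rows.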

Comments

Should it be 'Year_x < Year_y' to avoid duplication, since the AB calculation is the same as BA?

Yes, but in the question you requested 6 pairs, not 3 pairs...


Yes: taking A, B, C, D two at a time without regard to order gives 6 pairs (AB, AC, AD, BC, BD, CD). Counting BA, CA, etc. would create duplicates. Anyway, I think 'Year_x < Year_y' would solve it.
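
A sketch of the variant discussed in these comments, keeping each unordered pair once. The four-row frame below is hypothetical data invented for illustration (only its last two rows match the question's sample), and it assumes the members of a group can be told apart by Year.

```python
import pandas as pd

# Hypothetical data: four members of one group, told apart by Year.
df = pd.DataFrame({
    "Brand": ["B1"] * 4,
    "Sector": ["S1"] * 4,
    "Year": [2020, 2021, 2022, 2023],
    "Price": [44000, 44500, 45000, 45900],
    "Sales": [650, 600, 500, 400],
})

unordered = (
    df.merge(df, on=["Brand", "Sector"])
      # Keep AB, drop the mirrored BA and the self-pair AA.
      .query("Year_x < Year_y")
      .assign(
          Change=lambda x: (x["Price_x"] - x["Price_y"])
          / (x["Sales_x"] - x["Sales_y"])
      )
)
print(len(unordered))  # 6
```

Because the ratio is unchanged when the two rows are swapped, dropping the mirrored pairs loses no information.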
