python - Pandas: finding the median after grouping

Reposted. Author: 行者123. Updated: 2023-12-05 07:03:30

[Image: screenshot of the dataframe, not preserved]

df.head(10).to_clipboard(sep=';', index=True)

I have a dataframe as shown above, with the following column descriptions:

• Id - the uuid of this delivery
• PlanId - the uuid of the plan (the plan for deliveries of a given day)
• PlanDate - the date of delivery
• MinTime - the minimal time (seconds from midnight) for delivering this delivery
• MaxTime - the maximal time (seconds from midnight) for delivering this delivery
• RouteId - the uuid of the route this delivery belongs to
• ETA - the estimated time of arrival of this delivery on this date (from the ETA you can of course order the deliveries in a route)
• TTN - the time to next delivery in the route, i.e., at index 3 that would be the time distance between delivery index 3 and delivery index 4
• DTN - the distance to next delivery in the route

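I need to find the median number of deliveries per route in a given plan.

The median distance travelled per route in a given plan.

The median time travelled per route in a given plan.

How do I go about this?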

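I am wondering whether this is just a simple median calculation where you group and aggregate. I tried something like this to find the median distance: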

Tx = df.groupby(by=['plan_id','route_id'], as_index=False)['dtn'].sum()


Tx.groupby(['plan_id','route_id'])['dtn'].median()

But I am not sure whether this is correct.

Best answer

Here is how to display the numbers you want:

import pandas as pd

# Subset dataframe to only have the desired plan_id
# (assumes df is the raw, delivery-level dataframe, not the aggregated Tx from the question)
sub_Tx = df[df['plan_id'] == '869BB6FB-.....']

# median of deliveries per route in the given plan
sub_df = sub_Tx[['plan_id', 'route_id']].copy()  # copy so the helper column is not set on a slice
sub_df['count_deliveries'] = 1
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
# note: each (plan_id, route_id) group now holds one row, so this median returns the per-route totals
sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).median()

# median distance travelled per route in the given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'dtn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).median()

# median time travelled per route in the given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'ttn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).median()

Good luck.

Update:

So you can compute the median of the per-route figures (number of deliveries, distance, and time) for each plan_id as follows:

# median of deliveries per route in the given plan
sub_df = sub_Tx[['plan_id', 'route_id']].copy()  # copy so the helper column is not set on a slice
sub_df['count_deliveries'] = 1
# total per route first, then the median of those totals per plan
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df = sub_df[['plan_id', 'count_deliveries']].rename(columns={'count_deliveries': 'median_deliveries'})
sub_df.groupby(by=['plan_id'], as_index=False).median()

# median distance travelled per route in the given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'dtn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df = sub_df[['plan_id', 'dtn']].rename(columns={'dtn': 'median_dtn'})
sub_df.groupby(by=['plan_id'], as_index=False).median()

# median time travelled per route in the given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'ttn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df = sub_df[['plan_id', 'ttn']].rename(columns={'ttn': 'median_ttn'})
sub_df.groupby(by=['plan_id'], as_index=False).median()
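
For what it is worth, the same two-step idea (total per route, then median per plan) can probably be written more compactly with named aggregation. This is only a sketch under assumptions: the toy data below is made up for illustration, the column names `plan_id`, `route_id`, `dtn` and `ttn` follow the code above, and the names `per_route`, `per_plan`, `deliveries`, `total_dtn` and `total_ttn` are hypothetical.

```python
import pandas as pd

# Toy data, made up for illustration: one row per delivery.
df = pd.DataFrame({
    "plan_id":  ["P1"] * 6 + ["P2"] * 3,
    "route_id": ["R1", "R1", "R2", "R2", "R2", "R3", "R4", "R4", "R5"],
    "dtn":      [10.0, 20.0, 5.0, 5.0, 10.0, 40.0, 7.0, 3.0, 30.0],
    "ttn":      [100, 200, 50, 50, 100, 400, 70, 30, 300],
})

# Step 1: one row per (plan_id, route_id) holding the per-route totals.
per_route = (
    df.groupby(["plan_id", "route_id"], as_index=False)
      .agg(
          deliveries=("dtn", "size"),  # "size" counts rows, i.e. deliveries
          total_dtn=("dtn", "sum"),
          total_ttn=("ttn", "sum"),
      )
)

# Step 2: median of those per-route totals within each plan.
per_plan = (
    per_route.groupby("plan_id", as_index=False)
             .agg(
                 median_deliveries=("deliveries", "median"),
                 median_dtn=("total_dtn", "median"),
                 median_ttn=("total_ttn", "median"),
             )
)

print(per_plan)
```

Because this groups the whole dataframe by plan_id in the second step, it gives one row per plan without subsetting to a single plan first; filtering `per_plan` afterwards should give the figures for one specific plan.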

Regarding "python - Pandas: finding the median after grouping", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63186507/
