
python - Pivoting a Pandas DataFrame with no numeric types and a non-unique index

Reposted. Author: 行者123. Updated: 2023-12-04 13:45:54

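I'm trying to pivot some string data into columns, but I'm having trouble applying past responses because I don't have a unique index or a MultiIndex to work with.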

Sample format:

index location field value
1 location1 firstName A
2 location1 lastName B
3 location1 dob C
4 location1 email D
5 location1 title E
6 location1 address1 F
7 location1 address2 G
8 location1 address3 H
9 location1 firstName I
10 location1 lastName J
11 location1 dob K
12 location1 email L
13 location1 title M
14 location1 address1 N
15 location1 address2 O
16 location1 address3 P
40 location2 firstName Q
41 location2 lastName R
42 location2 dob S
43 location2 email T
44 location2 title U
45 location2 address1 V
46 location2 address2 W
47 location2 address3 X


The format I want to pivot to:

location firstName lastName dob email title address1 address2 address3
location1 A B C D E F G H
location1 I J K L M N O P
location2 Q R S T U V W X


The closest I've come is using aggfunc='first', but I need all of the values for each location, not just the first one.

My attempt:

df = df.pivot_table(index='location',columns='field',values='value',aggfunc='first')
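
To make the problem concrete, here is a minimal sketch that rebuilds the sample data and runs the attempt above. The way the frame is constructed (variable names, and treating the `index` column as the DataFrame index) is an assumption for illustration, not something the question specifies. With `aggfunc='first'`, the two records at location1 collapse into a single row, so the second record (I through P) is lost.

```python
import pandas as pd

# Rebuild the sample data from the question (construction details are assumed).
fields = ["firstName", "lastName", "dob", "email", "title",
          "address1", "address2", "address3"]
df = pd.DataFrame(
    {
        "location": ["location1"] * 16 + ["location2"] * 8,
        "field": fields * 3,
        "value": list("ABCDEFGHIJKLMNOPQRSTUVWX"),
    },
    index=list(range(1, 17)) + list(range(40, 48)),
)

# The attempt from the question: one row per location, first value per field.
first_only = df.pivot_table(index="location", columns="field",
                            values="value", aggfunc="first")

print(first_only.shape)                          # one row per location
print(first_only.loc["location1", "firstName"])  # only the first record survives
```

The result has one row per location rather than one row per record, which is why a second key is needed to tell the records apart.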

Best answer

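You need to pivot using a surrogate column. The expression df.field.eq('firstName').cumsum() increments each time a new firstName row appears, so every record gets its own number and the combined index becomes unique. Here is a solution using cumsum + set_index + unstack.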

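# Build a record counter that increments at each 'firstName' row, then
# move the 'field' level (second from last) into the columns.
v = df.set_index(['location', 'field', df.field.eq('firstName').cumsum()]).unstack(-2)
# Drop the helper counter from the index and the outer 'value' level from the columns.
v.index = v.index.droplevel(-1)
v.columns = v.columns.droplevel(0)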

field address1 address2 address3 dob email firstName lastName title
location
location1 F G H C D A B E
location1 N O P K L I J M
location2 V W X S T Q R U
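
The approach above can be reproduced end to end with the following sketch. It assumes the sample data is rebuilt as shown (with the `index` column as the DataFrame index) and that every record starts with a firstName row. The names `record_id` and `wide` are hypothetical, and naming the counter level and unstacking by name is a small variation on the answer's positional `unstack(-2)`.

```python
import pandas as pd

# Rebuild the sample data from the question (construction details are assumed).
fields = ["firstName", "lastName", "dob", "email", "title",
          "address1", "address2", "address3"]
df = pd.DataFrame(
    {
        "location": ["location1"] * 16 + ["location2"] * 8,
        "field": fields * 3,
        "value": list("ABCDEFGHIJKLMNOPQRSTUVWX"),
    },
    index=list(range(1, 17)) + list(range(40, 48)),
)

# Surrogate key: increases by one at every 'firstName' row, so all rows of
# the same record share one number (hypothetical name 'record').
record_id = df["field"].eq("firstName").cumsum().rename("record")

# (location, field, record) is now unique, so 'field' can move to the columns.
wide = (
    df.set_index(["location", "field", record_id])["value"]
      .unstack("field")
      .droplevel("record")   # the counter is no longer needed
)

# unstack sorts the columns alphabetically; restore the order from the question.
wide = wide[fields]
print(wide)
```

If a record could be missing its firstName row, this counter would merge it into the previous record. In that case a different key, for example `df.groupby(['location', 'field']).cumcount()`, might be a better fit, provided each field appears once per record.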

Regarding python - Pivoting a Pandas DataFrame with no numeric types and a non-unique index, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48630089/
