
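python - Merge several rows into one row by column value, and split into several dataframes based on the number of concatenated rows, for multiple columns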

Reposted. Author: 行者123. Updated: 2023-12-01 07:05:43

This is a follow-up to this question: Concatenate several rows into one row by column value, and split resulting dataframe into several dataframes based on number of concatinated rows

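That question shows how to merge rows when there is one grouping column and one additional column.

I am now looking for a solution for the case with many columns, where I still want to combine rows based on a single column.

The layout I want is: first all the columns of one kind, then the columns of the other kind, in the same order as the first.

Here is a minimal example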

import pandas as pd

data = [['tom', 'ca', 2], ['ni2ck', 'ma', 2], ['j3uli', 'ny', 4], ['nic4k', 'ma', 4], ['jul5i', 'ny', 4], ['nic6k', 'ma', 7], ['ju7li', 'ny', 7], ['nic8k', 'ma', 7], ['ju9li', 'ny', 7], ['nic1k', 'ma', 8], ['car', 'ny', 8]]
df = pd.DataFrame(data, columns=['Name', 'Location', 'Age'])
df

which gives

    Name   Location  Age
0   tom    ca        2
1   ni2ck  ma        2
2   j3uli  ny        4
3   nic4k  ma        4
4   jul5i  ny        4
5   nic6k  ma        7
6   ju7li  ny        7
7   nic8k  ma        7
8   ju9li  ny        7
9   nic1k  ma        8
10  car    ny        8

This is the desired result

   Name   Name   Location  Location  Age
0  tom    ni2ck  ca        ma        2
1  nic1k  car    ma        ny        8

   Name   Name   Name   Location  Location  Location  Age
0  j3uli  nic4k  jul5i  ny        ma        ny        4

   Name   Name   Name   Name   Location  Location  Location  Location  Age
0  nic6k  ju7li  nic8k  ju9li  ma        ny        ma        ny        7

It is important that the locations appear in the same order as their corresponding names.

Best answer

Developed from @Wen's solution. Use pivot_table instead of pivot
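As a small illustration of the helper column this approach relies on (a sketch on a cut-down copy of the question's data, not part of the original answer): `groupby('Age').cumcount()` numbers the rows inside each `Age` group starting at 0, and that number later becomes the column position in the wide table. The `positions` variable is introduced here only for the illustration.

```python
import pandas as pd

# Cut-down copy of the question's data: two rows with Age 2, three with Age 4
df = pd.DataFrame({
    "Name": ["tom", "ni2ck", "j3uli", "nic4k", "jul5i"],
    "Age": [2, 2, 4, 4, 4],
})

# Number the rows inside each Age group, starting at 0
df["New"] = df.groupby("Age").cumcount()

# Hypothetical variable, used only to inspect the result
positions = df["New"].tolist()
print(positions)
```

Because the numbering follows the original row order within each group, the n-th name and the n-th location of a group end up under the same position.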

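# Number the rows within each Age group: 0, 1, 2, ...
df['New'] = df.groupby('Age').cumcount()
# One row per Age; reindex puts the Name columns before the Location columns
s = df.pivot_table(index='Age', columns='New',
                   values=['Name', 'Location'],
                   aggfunc='first').reindex(['Name', 'Location'], axis=1, level=0)
# Flatten ('Name', 0) into 'Name0'
s.columns = s.columns.map('{0[0]}{0[1]}'.format)

# Group rows by how many cells are missing, then drop the columns that contain NaN
# (the axis is passed by keyword; recent pandas no longer accepts dropna(1))
l = [y.dropna(axis=1).reset_index() for _, y in s.groupby(s.isnull().sum(axis=1))]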

In [499]: l[0]
Out[499]:
Age Name0 Name1 Name2 Name3 Location0 Location1 Location2 Location3
0 7 nic6k ju7li nic8k ju9li ma ny ma ny

In [500]: l[1]
Out[500]:
Age Name0 Name1 Name2 Location0 Location1 Location2
0 4 j3uli nic4k jul5i ny ma ny

In [501]: l[2]
Out[501]:
Age Name0 Name1 Location0 Location1
0 2 tom ni2ck ca ma
1 8 nic1k car ma ny
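The same steps can be wrapped in a reusable helper. The following is a minimal sketch under stated assumptions, not the answer author's code: the function name `split_by_group_size` and the temporary column `_pos` are hypothetical, and the split uses the number of rows per key instead of the NaN count, which is a swapped-in variation that may behave more predictably if the source data itself contains missing values.

```python
import pandas as pd


def split_by_group_size(df, key, value_cols):
    """Hypothetical helper: widen df to one row per key, then split the
    result by the number of original rows per key.

    Returns a dict mapping group size -> wide DataFrame.
    """
    work = df.copy()
    # Position of each row within its key group: 0, 1, 2, ...
    work["_pos"] = work.groupby(key).cumcount()
    wide = (
        work.pivot_table(index=key, columns="_pos",
                         values=value_cols, aggfunc="first")
            # Put the value columns back in the requested order
            .reindex(value_cols, axis=1, level=0)
    )
    # Flatten ('Name', 0) into 'Name0'
    wide.columns = [f"{col}{pos}" for col, pos in wide.columns]

    # Number of original rows for each key value
    sizes = df.groupby(key).size()
    out = {}
    for n, keys in sizes.groupby(sizes).groups.items():
        # Keep the keys of this size and drop the columns they never fill
        out[int(n)] = wide.loc[keys].dropna(axis=1, how="all").reset_index()
    return out


data = [['tom', 'ca', 2], ['ni2ck', 'ma', 2], ['j3uli', 'ny', 4],
        ['nic4k', 'ma', 4], ['jul5i', 'ny', 4], ['nic6k', 'ma', 7],
        ['ju7li', 'ny', 7], ['nic8k', 'ma', 7], ['ju9li', 'ny', 7],
        ['nic1k', 'ma', 8], ['car', 'ny', 8]]
df = pd.DataFrame(data, columns=['Name', 'Location', 'Age'])

parts = split_by_group_size(df, 'Age', ['Name', 'Location'])
print(parts[2])
```

Keying the result by group size means `parts[2]` holds the `Age` values that had two rows, so the caller does not have to remember the order of a list.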

If you want to keep the MultiIndex columns, skip the map command on the columns

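# Number the rows within each Age group: 0, 1, 2, ...
df['New'] = df.groupby('Age').cumcount()
# One row per Age; the columns stay a (field, position) MultiIndex
s = df.pivot_table(index='Age', columns='New',
                   values=['Name', 'Location'],
                   aggfunc='first').reindex(['Name', 'Location'], axis=1, level=0)

# Group rows by how many cells are missing, then drop the columns that contain NaN
l = [y.dropna(axis=1).reset_index() for _, y in s.groupby(s.isnull().sum(axis=1))]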

In [544]: l[0]
Out[544]:
Age Name Location
New 0 1 2 3 0 1 2 3
0 7 nic6k ju7li nic8k ju9li ma ny ma ny

In [545]: l[1]
Out[545]:
Age Name Location
New 0 1 2 0 1 2
0 4 j3uli nic4k jul5i ny ma ny

In [546]: l[2]
Out[546]:
Age Name Location
New 0 1 0 1
0 2 tom ni2ck ca ma
1 8 nic1k car ma ny
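One possible use of the MultiIndex layout, sketched here as an illustration and not taken from the original answer: selecting a top-level label such as `'Name'` returns all columns of that kind at once, so names and locations can be paired by position. The variables `frame`, `names`, `locations` and `pairs` are hypothetical names for this example.

```python
import pandas as pd

data = [['tom', 'ca', 2], ['ni2ck', 'ma', 2], ['j3uli', 'ny', 4],
        ['nic4k', 'ma', 4], ['jul5i', 'ny', 4], ['nic6k', 'ma', 7],
        ['ju7li', 'ny', 7], ['nic8k', 'ma', 7], ['ju9li', 'ny', 7],
        ['nic1k', 'ma', 8], ['car', 'ny', 8]]
df = pd.DataFrame(data, columns=['Name', 'Location', 'Age'])

df['New'] = df.groupby('Age').cumcount()
s = df.pivot_table(index='Age', columns='New',
                   values=['Name', 'Location'],
                   aggfunc='first').reindex(['Name', 'Location'], axis=1, level=0)
l = [y.dropna(axis=1).reset_index() for _, y in s.groupby(s.isnull().sum(axis=1))]

# The frame for groups of three rows (Age 4 in this data)
frame = l[1]

# Selecting a top-level label returns every column of that kind
names = frame['Name']          # columns are the positions 0, 1, 2
locations = frame['Location']  # same positions, in the same order

# Pair each name in the first row with its location by position
pairs = list(zip(names.iloc[0], locations.iloc[0]))
print(pairs)
```

This is also a quick way to check the requirement from the question that each location stays aligned with its name.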

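Regarding python - Merge several rows into one row by column value, and split into several dataframes based on the number of concatenated rows, for multiple columns, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58460698/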
