
python - Iteratively merging Pandas columns with new column names

Reposted. Author: 行者123. Updated: 2023-12-04 17:11:03

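Suppose I merge Pandas data frames iteratively in a loop, but after two or three iterations Pandas starts repeating column names. Consider the following example, where I merge repeatedly but, for simplicity, without a loop: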

import pandas as pd

A = {'Name': ['A', 'B', 'C'], 'GPA': [4.0, 3.80, 3.70], 'School': ['U', 'U', 'U'], 'Time': [22, 26, 30]}
A1 = pd.DataFrame(A)
B = {'Name': ['D', 'E', 'F'], 'GPA': [3.50, 3.70, 3.60], 'School': ['S', 'S', 'S'], 'Time': [34, 44, 54]}
B1 = pd.DataFrame(B)
C = {'Name': ['G', 'H', 'I'], 'GPA': [3.70, 3.50, 3.70], 'School': ['C', 'C', 'C'], 'Time': [76, 86, 96]}
C1 = pd.DataFrame(C)
L = [A1, B1, C1]
comb = A1
# Stack the three frames vertically into one 9-row frame
for ii in L[1:]:
    comb = pd.concat([comb, ii], ignore_index=True)
comb
B = pd.merge(comb, comb, on=['Name','GPA'])
C = pd.merge(B, comb, on=['Name','GPA'])
D = pd.merge(C, comb, on=['Name','GPA'])
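As you can see, Pandas repeats the names School_x and School_y twice. Is it possible to get School_x, School_y, School_z and School_t instead? I am not talking about renaming the columns afterwards, but about forcing merge to choose new names for the different columns. Otherwise, how would you tell the columns apart in a data frame with 1000 columns, 500 of which share the same name?
Update: the above is only an example. Suppose you are merging multiple data frames in a loop, like this: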
for ii in list:
    df = df.merge(A, on='some column', how='outer')
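How, then, do you change the column names on each iteration? It seems to me that even with suffixes, the same column names would be repeated every time.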

Best answer

Try passing a different tuple, such as ('_z', '_t'), as the suffixes argument:

B = pd.merge(comb, comb, on=['Name','GPA'])
C = pd.merge(B, comb, on=['Name','GPA'])
D = pd.merge(C, comb, on=['Name','GPA'], suffixes=('_z', '_t'))
>>> D
  Name  GPA School_x  Time_x School_y  Time_y School_z  Time_z School_t  Time_t
0    A  4.0        U      22        U      22        U      22        U      22
1    B  3.8        U      26        U      26        U      26        U      26
2    C  3.7        U      30        U      30        U      30        U      30
3    D  3.5        S      34        S      34        S      34        S      34
4    E  3.7        S      44        S      44        S      44        S      44
5    F  3.6        S      54        S      54        S      54        S      54
6    G  3.7        C      76        C      76        C      76        C      76
7    H  3.5        C      86        C      86        C      86        C      86
8    I  3.7        C      96        C      96        C      96        C      96
From the pd.merge documentation:

Parameters:
...
...

suffixes: list-like, default is ("_x", "_y")

A length-2 sequence where each element is optionally a string indicating the suffix to add to overlapping column names in left and right respectively. Pass a value of None instead of a string to indicate that the column name from left or right should be left as-is, with no suffix. At least one of the values must not be None.

......
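
The None option described in the excerpt above can be sketched as follows. This is a minimal illustration on two small made-up frames (left and right are hypothetical names, not part of the question's data): the left-hand column keeps its original name and only the right-hand copy gets a suffix.

```python
import pandas as pd

# Two small illustrative frames that share the non-key column "School"
left = pd.DataFrame({"Name": ["A", "B"], "School": ["U", "U"]})
right = pd.DataFrame({"Name": ["A", "B"], "School": ["S", "S"]})

# None leaves the left-hand name as-is; only the right-hand copy is suffixed
merged = pd.merge(left, right, on="Name", suffixes=(None, "_right"))
print(list(merged.columns))  # ['Name', 'School', 'School_right']
```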



Edit:
For the latest update to the question, try creating an iterator and calling next on it.
Combined with functools.reduce, this works much better:
from functools import reduce
from string import ascii_lowercase
it = iter(ascii_lowercase)
# Each merge call draws two fresh letters from the iterator for its suffixes
merged = reduce(
    lambda x, y: pd.merge(x, y, on=['Name', 'GPA'], suffixes=('_' + next(it), '_' + next(it))),
    [comb for _ in range(4)])
print(merged)
Output:
  Name  GPA School_a  Time_a School_b  Time_b School_e  Time_e School_f  Time_f
0    A  4.0        U      22        U      22        U      22        U      22
1    B  3.8        U      26        U      26        U      26        U      26
2    C  3.7        U      30        U      30        U      30        U      30
3    D  3.5        S      34        S      34        S      34        S      34
4    E  3.7        S      44        S      44        S      44        S      44
5    F  3.6        S      54        S      54        S      54        S      54
6    G  3.7        C      76        C      76        C      76        C      76
7    H  3.5        C      86        C      86        C      86        C      86
8    I  3.7        C      96        C      96        C      96        C      96
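As you can see, I built the list with the comprehension [comb for _ in range(4)], so four copies of comb are merged together (three merge calls). To change the count, just change the number, e.g. [comb for _ in range(10)].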
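
The output above jumps from _b to _e. The sketch below traces the column names after each step to show why, assuming a three-row stand-in for comb with the same columns. pandas applies suffixes only to names that exist on both sides, so on the second merge the incoming School and Time do not collide with anything and stay unsuffixed, while the iterator still hands out _c and _d. A related caveat: the 26 letters cover at most 13 merge calls, after which next(it) raises StopIteration.

```python
from string import ascii_lowercase

import pandas as pd

# Three-row stand-in for comb (same columns as in the question)
comb = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "GPA": [4.0, 3.8, 3.7],
    "School": ["U", "U", "U"],
    "Time": [22, 26, 30],
})

it = iter(ascii_lowercase)
result = comb
history = []  # (suffixes used, resulting column names) for each merge
for _ in range(3):
    suffixes = ("_" + next(it), "_" + next(it))
    result = pd.merge(result, comb, on=["Name", "GPA"], suffixes=suffixes)
    history.append((suffixes, list(result.columns)))
    print(suffixes, list(result.columns))
```

On the second step the printed names end in plain School and Time, which is why the third step is the next one that actually needs its suffixes.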
As a function:
from functools import reduce
from string import ascii_lowercase
def cumulative_merge(df, n):
    # Merge n copies of df, drawing two fresh suffix letters per merge call
    it = iter(ascii_lowercase)
    return reduce(lambda x, y: pd.merge(x, y, on=['Name', 'GPA'], suffixes=('_' + next(it), '_' + next(it))), [df for _ in range(n)])
Usage:
print(cumulative_merge(comb, 4))
Output:
  Name  GPA School_a  Time_a School_b  Time_b School_e  Time_e School_f  Time_f
0    A  4.0        U      22        U      22        U      22        U      22
1    B  3.8        U      26        U      26        U      26        U      26
2    C  3.7        U      30        U      30        U      30        U      30
3    D  3.5        S      34        S      34        S      34        S      34
4    E  3.7        S      44        S      44        S      44        S      44
5    F  3.6        S      54        S      54        S      54        S      54
6    G  3.7        C      76        C      76        C      76        C      76
7    H  3.5        C      86        C      86        C      86        C      86
8    I  3.7        C      96        C      96        C      96        C      96
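
If predictable, gap-free names matter more than using suffixes itself, one alternative is to tag every non-key column with its frame's position before merging, so that no names ever collide. This is a different technique from the answer above, and the helper name merge_with_numbered_columns and its parameters are hypothetical, introduced only for this sketch. It also assumes a three-row stand-in for comb.

```python
from functools import reduce

import pandas as pd


def merge_with_numbered_columns(frames, on, how="inner"):
    """Merge frames on the key columns `on`, after appending each frame's
    position to its non-key column names (School -> School_0, School_1, ...)."""
    renamed = [
        frame.rename(columns={c: f"{c}_{i}" for c in frame.columns if c not in on})
        for i, frame in enumerate(frames)
    ]
    # No non-key names overlap any more, so the suffixes argument is never used
    return reduce(lambda left, right: pd.merge(left, right, on=on, how=how), renamed)


# Three-row stand-in for comb (same columns as in the question)
comb = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "GPA": [4.0, 3.8, 3.7],
    "School": ["U", "U", "U"],
    "Time": [22, 26, 30],
})

merged = merge_with_numbered_columns([comb] * 4, on=["Name", "GPA"])
print(list(merged.columns))
```

Because the index comes from enumerate rather than a fixed alphabet, this variant is not limited by the number of available letters, and passing how="outer" would match the loop in the question's update.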

Regarding python - Iteratively merging Pandas columns with new column names, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69445027/
