
python - Split a pandas DataFrame into two parts by column

Reposted. Author: 行者123. Updated: 2023-12-01 03:39:29

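I have a DataFrame that I want to split into two DataFrames: one containing all the columns whose names start with foo, and one containing the remaining columns. Is there a quick way to do this?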

Best answer

You can use list comprehensions to build the two lists of column names, then index the DataFrame with each list:

import pandas as pd

df = pd.DataFrame({'fooA': [1, 2, 3],
                   'fooB': [4, 5, 6],
                   'fooC': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})

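print(df)
   D  E  F  fooA  fooB  fooC
0  1  5  7     1     4     7
1  3  3  4     2     5     8
2  5  6  3     3     6     9

(This output comes from an older pandas that sorted dict keys. pandas 0.23+ on Python 3.6+ keeps insertion order, so fooA, fooB, fooC would be printed first.)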

foo = [col for col in df.columns if col.startswith('foo')]
print(foo)
['fooA', 'fooB', 'fooC']

other = [col for col in df.columns if not col.startswith('foo')]
print(other)
['D', 'E', 'F']

print(df[foo])
   fooA  fooB  fooC
0     1     4     7
1     2     5     8
2     3     6     9

print(df[other])
   D  E  F
0  1  5  7
1  3  3  4
2  5  6  3
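If you need this split in more than one place, the two comprehensions can be wrapped in a small function. The following is only a sketch: the name split_by_prefix is made up for illustration, and the str(col) cast is an added assumption so that non-string column labels (integers, for example) do not raise an AttributeError on startswith.

```python
import pandas as pd


def split_by_prefix(df, prefix):
    """Hypothetical helper: return (matching, rest) as two DataFrames."""
    # str(col) guards against non-string labels such as integers.
    matching = [col for col in df.columns if str(col).startswith(prefix)]
    rest = [col for col in df.columns if not str(col).startswith(prefix)]
    return df[matching], df[rest]


df = pd.DataFrame({'fooA': [1, 2, 3], 'D': [1, 3, 5], 0: [7, 4, 3]})
foo_df, other_df = split_by_prefix(df, 'foo')
print(list(foo_df.columns))
print(list(other_df.columns))
```

Both results keep the row index of the original DataFrame, and the relative order of the columns is unchanged.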

Another solution uses filter and difference:

df1 = df.filter(regex='^foo')
print(df1)
   fooA  fooB  fooC
0     1     4     7
1     2     5     8
2     3     6     9

print(df.columns.difference(df1.columns))
Index(['D', 'E', 'F'], dtype='object')

print(df[df.columns.difference(df1.columns)])
   D  E  F
0  1  5  7
1  3  3  4
2  5  6  3
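One thing to keep in mind with difference: by default it returns the remaining labels sorted, so the second DataFrame may not keep the original column order. If the order matters, a boolean mask is one alternative. The sketch below assumes a recent pandas and uses a small made-up frame whose columns are deliberately out of alphabetical order.

```python
import pandas as pd

# Made-up example data; columns are intentionally not in sorted order.
df = pd.DataFrame({'fooB': [4, 5], 'Z': [1, 3], 'fooA': [1, 2], 'A': [5, 3]})

# Boolean array: True where the column name starts with 'foo'.
mask = df.columns.str.startswith('foo')
foo_df = df.loc[:, mask]
other_df = df.loc[:, ~mask]

# Index.difference sorts its result by default.
sorted_rest = list(df.columns.difference(foo_df.columns))
print(sorted_rest)             # ['A', 'Z']
print(list(other_df.columns))  # ['Z', 'A'], original order kept
```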

Timings:

In [123]: %timeit a(df)
1000 loops, best of 3: 1.06 ms per loop

In [124]: %timeit b(df3)
1000 loops, best of 3: 1.04 ms per loop

In [125]: %timeit c(df4)
1000 loops, best of 3: 1.41 ms per loop

The setup and the functions timed above:

df3 = df.copy()
df4 = df.copy()

def a(df):
    # filter by regex, then take the remaining columns with difference
    df1 = df.filter(regex='^foo')
    df2 = df[df.columns.difference(df1.columns)]
    return df1, df2

def b(df):
    # two list comprehensions over the column names
    df1 = df[[col for col in df.columns if col.startswith('foo')]]
    df2 = df[[col for col in df.columns if not col.startswith('foo')]]
    return df1, df2

def c(df):
    # boolean mask from the vectorized str.startswith
    df1 = df[df.columns[df.columns.str.startswith('foo')]]
    df2 = df[df.columns[~df.columns.str.startswith('foo')]]
    return df1, df2

df1, df2 = a(df)
print(df1)
print(df2)

df1, df2 = b(df3)
print(df1)
print(df2)

df1, df2 = c(df4)
print(df1)
print(df2)
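The %timeit magic only works inside IPython or Jupyter. To repeat the comparison from a plain script, something like the following sketch with the standard timeit module should do. The repeat count of 200 is an arbitrary choice, and the numbers will differ from those quoted above depending on machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'fooA': [1, 2, 3], 'fooB': [4, 5, 6], 'fooC': [7, 8, 9],
                   'D': [1, 3, 5], 'E': [5, 3, 6], 'F': [7, 4, 3]})


def a(df):
    df1 = df.filter(regex='^foo')
    return df1, df[df.columns.difference(df1.columns)]


def b(df):
    df1 = df[[col for col in df.columns if col.startswith('foo')]]
    df2 = df[[col for col in df.columns if not col.startswith('foo')]]
    return df1, df2


def c(df):
    mask = df.columns.str.startswith('foo')
    return df[df.columns[mask]], df[df.columns[~mask]]


runs = 200  # arbitrary; raise it for more stable numbers
results = {}
for func in (a, b, c):
    total = timeit.timeit(lambda: func(df), number=runs)
    results[func.__name__] = total / runs * 1e3  # milliseconds per call
    print(f'{func.__name__}: {results[func.__name__]:.3f} ms per call')
```

On a frame this small the three variants are likely to land close together, so the choice can reasonably be made on readability.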

Regarding "python - Split a pandas DataFrame into two parts by column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39882767/
