python - Splitting a column in a Pandas DataFrame

Reposted. Author: 太空宇宙. Updated: 2023-11-04 07:28:13

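I want to split the ji column in df into two columns, using the comma as the delimiter. It would also be nice to drop the parentheses around the ji values. I have tried various approaches and keep running into errors. I would like to avoid lambda expressions for now! Any other ideas?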

Example

      ji           length
0 (75.0, 5.0) 3283.458479
1 (96.0, 5.0) 1431.312901
2 (97.0, 5.0) 1364.592959
3 (247.0, 5.0) 3736.322308
4 (81.0, 7.0) 2655.910005
5 (93.0, 7.0) 1752.293687
6 (242.0, 7.0) 427.844417
7 (248.0, 7.0) 3725.823013
8 (254.0, 7.0) 2318.937332
9 (255.0, 7.0) 2292.673905
10 (242.0, 8.0) 145.811907
11 (254.0, 8.0) 2222.447786
12 (255.0, 8.0) 2196.184360
13 (248.0, 9.0) 441.222866
14 (253.0, 9.0) 853.095032
15 (256.0, 9.0) 2076.942682
16 (91.0, 10.0) 1743.310744
17 (93.0, 10.0) 1256.337420
18 (105.0, 10.0) 523.447658
19 (174.0, 10.0) 1530.617012
20 (176.0, 10.0) 1697.614009
21 (248.0, 10.0) 440.000463
22 (253.0, 10.0) 904.706003
23 (256.0, 10.0) 1991.662604
24 (258.0, 10.0) 1850.995862
25 (172.0, 11.0) 1301.179960
26 (174.0, 11.0) 1436.984094
27 (176.0, 11.0) 1695.954099
28 (179.0, 11.0) 1548.015013
29 (228.0, 11.0) 4640.928585
30 (242.0, 11.0) 169.617203
31 (251.0, 11.0) 784.921333
32 (253.0, 11.0) 983.118859
33 (255.0, 11.0) 1181.474433
34 (256.0, 11.0) 1303.398235

You can load the example above with:

import pandas as pd
from io import StringIO

csv = """\
ji:length
(75.0,5.0):3283.458479
(96.0,5.0):1431.312901
(97.0,5.0):1364.592959
(247.0,5.0):3736.322308
(81.0,7.0):2655.910005
(93.0,7.0):1752.293687
(242.0,7.0):427.844417
(248.0,7.0):3725.823013
(254.0,7.0):2318.937332
(255.0,7.0):2292.673905
(242.0,8.0):145.811907
(254.0,8.0):2222.447786
(255.0,8.0):2196.184360
(248.0,9.0):441.222866
(253.0,9.0):853.095032
(256.0,9.0):2076.942682
(91.0,10.0):1743.310744
(93.0,10.0):1256.337420
(105.0,10.0):523.447658
(174.0,10.0):1530.617012
(176.0,10.0):1697.614009
(248.0,10.0):440.000463
(253.0,10.0):904.706003
(256.0,10.0):1991.662604
(258.0,10.0):1850.995862
(172.0,11.0):1301.179960
(174.0,11.0):1436.984094
(176.0,11.0):1695.954099
(179.0,11.0):1548.015013
(228.0,11.0):4640.928585
(242.0,11.0):169.617203
(251.0,11.0):784.921333
(253.0,11.0):983.118859
(255.0,11.0):1181.474433
(256.0,11.0):1303.398235
"""
df = pd.read_csv(StringIO(csv), sep=":")

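Best answer

If the ji column holds strings, use pop to extract the column, then strip and split with expand=True to get a DataFrame. Note that the separator passed to split has to match the data: the frame printed above has a space after the comma, whereas the loader snippet does not, so use ',' there.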

print (type(df.loc[0, 'ji']))
<class 'str'>

df[['a','b']] = df.pop('ji').str.strip('()').str.split(', ', expand=True).astype(float)
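
As a small self-contained sketch of the same chain, the snippet below uses a made-up three-row frame (not the question's full data) with inconsistent spacing after the comma. It swaps the literal ', ' separator for a regular expression, which is my own variation rather than part of the original answer, and it assumes pandas 1.4 or later for the regex=True argument.

```python
import pandas as pd

# Hypothetical sample mirroring the question's layout; the second row has
# no space after the comma, the others do.
df = pd.DataFrame({
    "ji": ["(75.0, 5.0)", "(96.0,5.0)", "(247.0, 7.0)"],
    "length": [3283.458479, 1431.312901, 3736.322308],
})

# pop removes 'ji' from df and returns it as a Series
parts = (
    df.pop("ji")
      .str.strip("()")                               # drop the parentheses
      .str.split(r",\s*", expand=True, regex=True)   # comma plus optional spaces
      .astype(float)
)

# Assign the two resulting columns under new names
df[["a", "b"]] = parts
print(df)
```

Splitting on a pattern rather than a fixed string means the same line should work whether or not the values carry a space after the comma.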

Alternatively, if there are no missing values and performance matters, use a list comprehension:

L = [x.strip('()').split(', ') for x in df.pop('ji')]
df[['a','b']] = pd.DataFrame(L, index=df.index).astype(float)

print (df)
length a b
0 3283.458479 75.0 5.0
1 1431.312901 96.0 5.0
2 1364.592959 97.0 5.0
3 3736.322308 247.0 5.0
4 2655.910005 81.0 7.0
5 1752.293687 93.0 7.0
6 427.844417 242.0 7.0
7 3725.823013 248.0 7.0
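
To illustrate why the list comprehension is only suggested when there are no missing values, here is a minimal sketch on a made-up Series containing one NaN. The data and the variable names (s, out, failed) are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
s = pd.Series(["(75.0, 5.0)", np.nan, "(97.0, 5.0)"], name="ji")

# The .str accessor propagates the missing value instead of raising
out = s.str.strip("()").str.split(", ", expand=True).astype(float)
print(out)

# The plain list comprehension calls .strip on every element, and the
# missing value is a float, which has no such method
failed = False
try:
    [x.strip("()").split(", ") for x in s]
except AttributeError as exc:
    failed = True
    print("list comprehension failed:", exc)
```

With the .str version, the row holding the missing value simply ends up as NaN in both new columns.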

If the column holds tuples, build a nested list of tuples and pass it to the DataFrame constructor:

print (type(df.loc[0, 'ji']))
<class 'tuple'>

df[['a','b']] = pd.DataFrame(df.pop('ji').values.tolist(), index=df.index)
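
A self-contained sketch of the tuple case follows, on a made-up frame with a deliberately non-default index. The data and the name pairs are assumptions for illustration.

```python
import pandas as pd

# Hypothetical frame whose 'ji' column holds real tuples, not strings
df = pd.DataFrame(
    {
        "ji": [(75.0, 5.0), (96.0, 5.0), (81.0, 7.0)],
        "length": [3283.458479, 1431.312901, 2655.910005],
    },
    index=[10, 20, 30],
)

# pop removes 'ji' and returns it; tolist gives a plain list of tuples
pairs = df.pop("ji").tolist()

# Reuse df.index so the new columns align with the existing rows
df[["a", "b"]] = pd.DataFrame(pairs, index=df.index)
print(df)
```

Passing index=df.index matters whenever the frame does not have the default 0..n-1 index: without it the new DataFrame would be labelled 0, 1, 2 and the assignment would align on labels that do not exist in df.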

Regarding python - Splitting a column in a Pandas DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53632139/
