
python - Applying a function to a DataFrame column

Reposted. Author: 行者123. Last updated: 2023-12-01 04:01:26

I have a pandas DataFrame:

  name    sample
1 a Category 1: qwe, asd (line break) Category 2: sdf, erg
2 b Category 2: sdf, erg(line break) Category 5: zxc, eru
...
30 p Category 1: asd, Category PE: 2134, EFDgh, Pdr tke, err

I want to end up with:

    name  qwe  asd  sdf  erg  zxc  eru  2134  EFDgh  Pdr tke  err
1   a     1    1    1    1    0    0    0     0      0        0
2   b     0    0    1    1    1    1    0     0      0        0
...
30  p     0    1    0    0    0    0    0     1      1        0

I created the following function:

def cleanattributes(istring):

    istring=str(istring)
    istring=istring.rstrip().split('\\n')

    counter=0
    for line in istring:
        istring[counter]=istring[counter].rpartition(': ')[-1]
        counter+=1
    istring=str(istring)
    istring = istring.replace("'", "")
    istring = istring.replace("\"", "")
    return(str(istring))

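This function produces the expected result: it returns the category contents without the category headers (the idea is to then use get_dummies to obtain the columns).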

teststring="Category 1: qwe, asd\\nCategory 2: sdf, erg"
cleanattributes(teststring)
OUTPUT: '[qwe, asd, sdf, erg]'

I'm not sure how best to apply this function to every record so that the DataFrame looks like this:

  name    sample
1 a qwe, asd, sdf, erg
2 b sdf, erg, zxc, eru
...
30 p asd, 2134, EFDgh, Pdr tke, err

Nor am I sure whether this is the best way to approach the problem at all.

As requested:

df['sample'].iat[0]
OUTPUT: 'Category 1: qwe, asd\nCategory 2: sdf, erg'
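
One way to apply a cleaning function to every record is `Series.apply`. The following is a minimal sketch rather than the asker's exact function: `clean_attributes` is a hypothetical rewrite that strips the headers with a regular expression, assumes every header has the form `Category <label>: `, and treats both real newlines and literal `\n` sequences as separators. The two sample rows are taken from the table at the top of the question.

```python
import re

import pandas as pd


def clean_attributes(text):
    """Return the items of `text` as one comma-separated string, headers removed.

    Hypothetical helper: assumes each header looks like 'Category <label>: '.
    """
    # Treat a literal backslash-n sequence the same as a real newline.
    text = str(text).replace("\\n", "\n")
    # Drop every 'Category <label>: ' header, wherever it appears in the string.
    text = re.sub(r"Category [^:,\n]*:\s*", "", text)
    # Split on commas and newlines, trim whitespace, and skip empty pieces.
    items = [piece.strip() for piece in re.split(r"[,\n]", text) if piece.strip()]
    return ", ".join(items)


df = pd.DataFrame({
    "name": ["a", "b"],
    "sample": [
        "Category 1: qwe, asd\nCategory 2: sdf, erg",
        "Category 2: sdf, erg\nCategory 5: zxc, eru",
    ],
})

# Series.apply calls the function once per cell of the column.
df["sample"] = df["sample"].apply(clean_attributes)
print(df)
```

`apply` runs the Python function row by row, which is usually fine for a few thousand records; for larger frames the vectorised `.str` methods tend to be faster.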

Best answer

import pandas as pd

df = pd.DataFrame(
    {'name': ['a', 'b'],
     'sample': ['Category 1: asd, Category PE: 2134, EFDgh, Pdr tke, err',
                'Category 2: sdf, erg\nCategory 5: zxc, eru\nCategory 1: asd, Category PE: 2134, EFDgh, Pdr tke, err']})

df2 = pd.concat([df.name,
                 df['sample']
                 .str.replace(r"(Category .*: )+", '', regex=True)  # Remove "Category [*]:"
                 .str.replace(r'\n', '', regex=True)  # Remove "\n"
                 .str.split(', ', expand=True)],
                axis=1)

df3 = pd.melt(df2, id_vars='name')[['name', 'value']]

>>> pd.concat([df3['name'], pd.get_dummies(df3['value'])], axis=1)
   name  2134  EFDgh  Pdr tke  ergzxc  err  eru2134  sdf
0     a     1      0        0       0    0        0    0
1     b     0      0        0       0    0        0    1
2     a     0      1        0       0    0        0    0
3     b     0      0        0       1    0        0    0
4     a     0      0        1       0    0        0    0
5     b     0      0        0       0    0        1    0
6     a     0      0        0       0    1        0    0
7     b     0      1        0       0    0        0    0
8     a     0      0        0       0    0        0    0
9     b     0      0        1       0    0        0    0
10    a     0      0        0       0    0        0    0
11    b     0      0        0       0    1        0    0
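
The output above has one row per (name, item) pair rather than one row per name. With this sample data, `asd` is also missing because the greedy `.*` swallows it together with the following header, and items on either side of a line break are fused (`ergzxc`, `eru2134`) because the newline is removed without inserting a separator. As a possible alternative, and not part of the original answer, here is a sketch using `Series.str.get_dummies`, which builds the indicator columns per row. It assumes every header has the form `Category <label>: ` and that no item contains the `|` character; the three rows are the ones shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b", "p"],
    "sample": [
        "Category 1: qwe, asd\nCategory 2: sdf, erg",
        "Category 2: sdf, erg\nCategory 5: zxc, eru",
        "Category 1: asd, Category PE: 2134, EFDgh, Pdr tke, err",
    ],
})

cleaned = (
    df["sample"]
    # Drop each 'Category <label>: ' header; [^:,\n]* keeps the match inside one header.
    .str.replace(r"Category [^:,\n]*:\s*", "", regex=True)
    # Turn every comma or newline (plus surrounding whitespace) into a single '|'.
    .str.replace(r"\s*[,\n]\s*", "|", regex=True)
    .str.strip()
)

# str.get_dummies splits each cell on the separator and emits one 0/1 column per item.
dummies = cleaned.str.get_dummies(sep="|")
result = pd.concat([df["name"], dummies], axis=1)
print(result)
```

Because `str.get_dummies` works per row, the result keeps one row per `name`, which is the layout the question asks for.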

Regarding "python - Applying a function to a DataFrame column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36435916/
