python - Split a DataFrame column into two columns based on a delimiter

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 14:46:55

I am preprocessing text for classification and import the dataset like this:

dataset = pd.read_csv('lyrics.csv', delimiter = '\t', quoting = 2)

Printed in the terminal, the dataset looks like this:

                                 lyrics,classification
0 I should have known better with a girl like yo...
1 You can shake an apple off an apple tree\nShak...
2 It's been a hard day's night\nAnd I've been wo...
3 Michelle, ma belle\nThese are words that go to...

However, when I inspect the variable dataset more closely in Spyder, I see that I have only one column instead of the two I want.

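Given that the lyrics themselves contain commas, using "," as the delimiter does not work.

How can I fix the DataFrame above so that it has:

1) one column for the lyrics

2) one column for the classification

with each row holding the corresponding data?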

Best answer

If your lyrics themselves do not contain commas (they most likely do), you can use read_csv with delimiter=','.
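As a side note, commas inside the lyrics are only a problem for read_csv when the lyrics field is unquoted. The sketch below assumes the lyrics are wrapped in double quotes in the file; the inline sample text is a hypothetical stand-in for lyrics.csv, whose real contents are not shown in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for lyrics.csv. Assumption: each lyrics field is
# wrapped in double quotes, so commas inside it are not treated as separators.
raw = (
    'lyrics,classification\n'
    '"Michelle, ma belle",0\n'
    '"It\'s been a hard day\'s night",1\n'
)

# With a comma delimiter, the quoted field is kept together as one value.
dataset = pd.read_csv(io.StringIO(raw), delimiter=',')
print(dataset)
```

If the lyrics in the real file are not quoted, this approach will not work and the str.rsplit route is the safer choice.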

However, if that is not an option, you can use str.rsplit:

dataset.iloc[:, 0].str.rsplit(',', n=1, expand=True)

For example:

df

lyrics,classification
0 I should have known better with a girl like yo...
1 You can shake an...,0
2 It's been a hard day's night...,0

df = df.iloc[:, 0].str.rsplit(',', n=1, expand=True)  # n must be passed by keyword in pandas 2.x
df.columns = ['lyrics', 'classification']
df

lyrics classification
0 I should have known better with a girl like yo... 0
1 You can shake an... 0
2 It's been a hard day's night... 0
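The same steps as a self-contained sketch that can be run without the original file. The two sample rows and the variable names split and too_many are made up for illustration; the final astype(int) step is an extra that assumes the classification values are integers.

```python
import pandas as pd

# Made-up rows imitating the single merged column from the question.
df = pd.DataFrame({'lyrics,classification': [
    'Michelle, ma belle,0',
    "It's been a hard day's night,1",
]})

# Without a limit, every comma is a split point, so the lyrics get cut up too.
too_many = df.iloc[:, 0].str.rsplit(',', expand=True)

# n=1 splits only once, starting from the right: lyrics stay whole.
split = df.iloc[:, 0].str.rsplit(',', n=1, expand=True)
split.columns = ['lyrics', 'classification']

# Assumption: labels are integers; rsplit returns them as strings.
split['classification'] = split['classification'].astype(int)
print(split)
```

Splitting from the right is what makes this robust here: the classification is always the last field, no matter how many commas the lyrics contain.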

Regarding "python - Split a DataFrame column into two columns based on a delimiter", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46165775/
