
Python CSV: handling commas inside a column

Reposted. Author: 行者123. Last updated: 2023-12-01 09:28:24

I am working with a CSV file that contains text from novels.

book_id, title, content
1, book title 1, All Passion Spent is written in three parts, primarily from the view of an intimate observer.
2, Book Title 2, In particular Mr FitzGeorge, a forgotten acquaintance from India who has ever since been in love with her, introduces himself and they form a quiet but playful and understanding friendship. It cost 3,4234 to travel.

The text in the content column contains commas, so trying to load the file with pandas.read_csv fails with pandas.errors.ParserError: Error tokenizing data. C error:
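To see why the tokenizer gives up, it can help to count how many comma-separated fields each line actually has. The sketch below uses only the standard library csv module on the sample rows from the question; the helper name field_counts is made up for this illustration and is not part of pandas or of the original post.

```python
import csv
import io

# Sample rows copied from the question above.
sample = (
    "book_id, title, content\n"
    "1, book title 1, All Passion Spent is written in three parts, "
    "primarily from the view of an intimate observer.\n"
    "2, Book Title 2, In particular Mr FitzGeorge, a forgotten acquaintance "
    "from India who has ever since been in love with her, introduces himself "
    "and they form a quiet but playful and understanding friendship. "
    "It cost 3,4234 to travel.\n"
)


def field_counts(text):
    """Return how many fields a plain comma split sees on each line (hypothetical helper)."""
    return [len(row) for row in csv.reader(io.StringIO(text))]


counts = field_counts(sample)
print(counts)  # [3, 4, 6]
```

The header declares three columns, but the two data rows split into four and six fields because the commas inside content are unquoted. A parser that expects a consistent number of fields per row has no way to tell which commas are separators.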

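Several solutions to this problem have been suggested, but none of them worked for me. Reading the file as a regular file and then passing it to a DataFrame also failed. SO - Solution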

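Best answer

You can read the file yourself, split each line with str.split(",", 2) so that only the first two commas act as separators, and then convert the result to a DataFrame.

For example: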

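import pandas as pd

filename = "books.csv"  # assumed path; replace with your own file

with open(filename, "r") as infile:
    # Header: split on every comma and trim the space after each one.
    header = [h.strip() for h in infile.readline().split(",")]
    # Rows: split on the first two commas only, so commas in content survive.
    content = [[f.strip() for f in line.split(",", 2)] for line in infile if line.strip()]

df = pd.DataFrame(content, columns=header)
print(df)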

Output:

  book_id         title                                            content
0       1  book title 1  All Passion Spent is written in three parts, ...
1       2  Book Title 2  In particular Mr FitzGeorge, a forgotten acq...
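The same idea can be generalized a little: derive the split limit from the header instead of hard-coding 2, then write the rows back out with proper quoting so that any standard CSV reader can load the result. This is only a sketch under the assumption that the free-text column is the last one and that no other column contains commas; the function names split_last_column and to_quoted_csv are hypothetical, and the second sample row is shortened from the question's data.

```python
import csv
import io


def split_last_column(lines):
    """Split lines so that only the final column may contain commas (hypothetical helper)."""
    lines = iter(lines)
    header = [h.strip() for h in next(lines).split(",")]
    maxsplit = len(header) - 1  # number of real separators per row
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        rows.append([field.strip() for field in line.split(",", maxsplit)])
    return header, rows


def to_quoted_csv(header, rows):
    """Serialize the rows as CSV text, quoting any field that contains a comma."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


sample_lines = [
    "book_id, title, content",
    "1, book title 1, All Passion Spent is written in three parts, "
    "primarily from the view of an intimate observer.",
    "2, Book Title 2, It cost 3,4234 to travel.",
]

header, rows = split_last_column(sample_lines)
fixed = to_quoted_csv(header, rows)
print(fixed)
```

Because the content fields are now wrapped in double quotes, every row of the repaired text parses back into exactly three fields. A row that has fewer fields than the header would come out short, so real data may need an extra length check before building a DataFrame.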

For more on handling commas inside a column in Python CSV processing, see the similar question on Stack Overflow: https://stackoverflow.com/questions/50158920/
