
python - Saving a split dataset in .txt format with pandas

Reposted. Author: 行者123. Updated: 2023-12-01 01:22:25

I am trying to split a dataset into train and test sets, and then I need to save the result in .txt format.

This is the code so far:

import pandas as pd
from sklearn.model_selection import train_test_split

category=pd.read_csv('dataset.tsv',delimiter='\t',encoding='utf-8')

train, test = train_test_split(category, test_size=0.2)

test.to_csv('checkme.txt')

However, when I run it, it fails with this error:

Traceback (most recent call last):
  File "splitter.py", line 8, in <module>
    test.to_csv('checkme.tsv')
  File "/home/abc/micro/micro/local/lib/python2.7/site-packages/pandas/core/frame.py", line 1745, in to_csv
    formatter.save()
  File "/home/abc/micro/micro/local/lib/python2.7/site-packages/pandas/io/formats/csvs.py", line 171, in save
    self._save()
  File "/home/abc/micro/micro/local/lib/python2.7/site-packages/pandas/io/formats/csvs.py", line 286, in _save
    self._save_chunk(start_i, end_i)
  File "/home/abc/micro/micro/local/lib/python2.7/site-packages/pandas/io/formats/csvs.py", line 313, in _save_chunk
    self.cols, self.writer)
  File "pandas/_libs/writers.pyx", line 64, in pandas._libs.writers.write_csv_rows
UnicodeEncodeError: 'ascii' codec can't encode character u'\u026a' in position 111: ordinal not in range(128)

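What could be going wrong here, and how can I fix it?

Best answer

You need to write the DataFrame with an explicit Unicode encoding. Under Python 2, to_csv defaults to the ASCII codec, which cannot encode characters such as u'\u026a'. Passing sep='\t' also keeps the output tab-separated, like the input file: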


test.to_csv('checkme.txt', sep='\t', encoding='utf-8')
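
For anyone who wants to check the fix end to end, here is a minimal sketch of the same idea. It makes several assumptions that are not in the question: the sample DataFrame and its column names are made up, DataFrame.sample stands in for train_test_split so the example needs only pandas, the file goes to a temporary directory, and index=False is an optional choice that drops the row-index column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data (not from the question). The "ipa" column contains
# U+026A, the character named in the traceback.
category = pd.DataFrame({
    "word": ["bit", "beat", "bet", "bat", "but"],
    "ipa": ["b\u026at", "bi\u02d0t", "b\u025bt", "b\u00e6t", "b\u028ct"],
})

# The root cause in isolation: the ASCII codec cannot encode U+026A.
try:
    "\u026a".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Stand-in for train_test_split(category, test_size=0.2):
# sample 20% of the rows as the test set and keep the rest for training.
test = category.sample(frac=0.2, random_state=0)
train = category.drop(test.index)

# Write to a temporary directory so the sketch leaves no files behind in the cwd.
out_dir = tempfile.mkdtemp()
test_path = os.path.join(out_dir, "checkme.txt")

# Tab-separated, UTF-8 encoded; index=False omits the row-index column.
test.to_csv(test_path, sep="\t", encoding="utf-8", index=False)

# Read the file back to confirm the non-ASCII text survived the round trip.
restored = pd.read_csv(test_path, delimiter="\t", encoding="utf-8")
```

On Python 3, pandas already defaults to UTF-8 when writing, so passing encoding='utf-8' mainly matters on Python 2 setups like the one in the question. Spelling it out does no harm either way.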

Regarding "python - Saving a split dataset in .txt format with pandas", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53689758/
