
Python 3 does not recognize the vertical bar character

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:32:33

I have the following code, but Python 3 does not seem to recognize the vertical pipe as a Unicode character.

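    import pandas as pd

    # Column names for the first five fields of u.item
    m_cols = ['movie_id', 'title', 'release_date',
              'video_release_date', 'imdb_url']

    # Read the pipe-separated file, keeping only the first five columns
    movies = pd.read_csv(
        'http://files.grouplens.org/datasets/movielens/ml-100k/u.item',
        sep='|', names=m_cols, usecols=range(5))

    movies.head()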

I get the following error:

    UnicodeDecodeError                        Traceback (most recent call last)
    pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens (pandas\_libs\parsers.c:14858)()

    pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._convert_with_dtype (pandas\_libs\parsers.c:17119)()

    pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._string_convert (pandas\_libs\parsers.c:17347)()

    pandas\_libs\parsers.pyx in pandas._libs.parsers._string_box_utf8 (pandas\_libs\parsers.c:23041)()

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 3: invalid continuation byte

    During handling of the above exception, another exception occurred:

    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-15-72a8222212c1> in <module>()
          4 movies = pd.read_csv(
          5     'http://files.grouplens.org/datasets/movielens/ml-100k/u.item',
    ----> 6     sep='|', names=m_cols, usecols=range(5))
          7
          8 movies.head()

What could be causing this, and how can I fix it?

Best answer

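The vertical bar is not the problem. The traceback reports byte 0xe9, which is not valid UTF-8 at that position, so the file is not UTF-8 encoded. In Python 3, pass encoding="latin-1":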

    In [9]: movies = pd.read_csv(
        'http://files.grouplens.org/datasets/movielens/ml-100k/u.item',
        sep='|', names=m_cols, usecols=range(5), header=None, encoding="latin-1")

    In [10]: movies.head()
    Out[10]:
       movie_id              title release_date  video_release_date  \
    0         1   Toy Story (1995)  01-Jan-1995                 NaN
    1         2   GoldenEye (1995)  01-Jan-1995                 NaN
    2         3  Four Rooms (1995)  01-Jan-1995                 NaN
    3         4  Get Shorty (1995)  01-Jan-1995                 NaN
    4         5     Copycat (1995)  01-Jan-1995                 NaN

                                                imdb_url
    0  http://us.imdb.com/M/title-exact?Toy%20Story%2...
    1  http://us.imdb.com/M/title-exact?GoldenEye%20(...
    2  http://us.imdb.com/M/title-exact?Four%20Rooms%...
    3  http://us.imdb.com/M/title-exact?Get%20Shorty%...
    4  http://us.imdb.com/M/title-exact?Copycat%20(1995)
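
To see why the encoding matters more than the separator, here is a minimal sketch using only the standard library. The sample line and the helper `decode_fields` are hypothetical, made up for illustration and not taken from u.item. It assumes the data contains an accented character stored as a single Latin-1 byte, which is consistent with the 0xe9 reported in the traceback.

```python
# Hypothetical sample line in the same pipe-separated layout (not from u.item).
# "\u00e9" is "é", which Latin-1 stores as the single byte 0xE9.
raw = "1|Caf\u00e9 Society (1995)|01-Jan-1995".encode("latin-1")


def decode_fields(raw_line: bytes, encoding: str) -> list:
    """Decode one raw line with the given encoding and split it on '|'."""
    return raw_line.decode(encoding).split("|")


# Decoding as UTF-8 fails: 0xE9 announces a multi-byte sequence,
# but the byte that follows (a space) is not a continuation byte.
try:
    decode_fields(raw, "utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Decoding as Latin-1 succeeds, and the pipe splits the fields as expected.
fields = decode_fields(raw, "latin-1")
print(utf8_error)
print(fields)
```

The pipe itself is plain ASCII and decodes identically under both encodings, so only the accented byte triggers the failure.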

This question, "Python 3 does not recognize the vertical bar character", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/47180499/
