pd.read_csv("
The actual code looks like this:
import pandas as pd
df = pd.read_csv('https://www.pythonanywhere.com/user/elksie5000/files/home/elksie5000/girls_long_count.csv')
Error:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 11, saw 2
Name Gender Year Measure Value first_letter
55512 A'Idah girl 2021 Count 0 A
55513 A'Isha girl 2021 Count 3 A
55514 A'Ishah girl 2021 Count 4 A
55515 A'Niyah girl 2021 Count 0 A
55516 Aa'Idah girl 2021 Count 4 A
55517 Aa'Isha girl 2021 Count 0 A
55518 Aa'Ishah girl 2021 Count 0 A
55519 Aabha girl 2021 Count 3 A
55520 Aabida girl 2021 Count 3 A
55521 Aabidah girl 2021 Count 0 A
55522 Aabish girl 2021 Count 0 A
55523 Aadhira girl 2021 Count 13 A
55524 Aadhiraa girl 2021 Count 3 A
55525 Aadhya girl 2021 Count 35 A
Update: I tried
import pandas as pd
df = pd.read_csv('https://www.pythonanywhere.com/user/elksie5000/files/home/elksie5000/girls_long_rank.csv', on_bad_lines='skip')
The result was
<!DOCTYPE html>
0 <html lang="en" style="height: 100%">
1 <head>
2 <!-- Google tag (gtag.js) -->
3 <script async src="https://www.goo...
4 <script>
... ...
151 </div>
The raw CSV looks like this:
,Name,Gender,Year,Measure,Value,first_letter
55512,A'Idah,girl,2021,Count,0,A
55513,A'Isha,girl,2021,Count,3,A
55514,A'Ishah,girl,2021,Count,4,A
55515,A'Niyah,girl,2021,Count,0,A
55516,Aa'Idah,girl,2021,Count,4,A
55517,Aa'Isha,girl,2021,Count,0,A
55518,Aa'Ishah,girl,2021,Count,0,A
55519,Aabha,girl,2021,Count,3,A
55520,Aabida,girl,2021,Count,3,A
55521,Aabidah,girl,2021,Count,0,A
55522,Aabish,girl,2021,Count,0,A
55523,Aadhira,girl,2021,Count,13,A
55524,Aadhiraa,girl,2021,Count,3,A
55525,Aadhya,girl,2021,Count,35,A
55526,Aadila,girl,2021,Count,0,A
The default delimiter for pd.read_csv is a comma (","). The error Expected 1 fields in line 11, saw 2 suggests that your file has 10 lines without a comma. So, each line is parsed as 1 field, until we hit line 11, which does have a comma (2 fields) and failure ensues. You can try on_bad_lines='skip' or on_bad_lines='warn' to ignore the line, which at least should give you some output (if there is no other corruption). You may then find that the file actually uses a different delimiter, e.g., a tab (sep='\t') or a semicolon (sep=';').
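A minimal sketch of how that error arises, assuming pandas 1.3 or later (where on_bad_lines is available). The in-memory text is made up for illustration and is not the file from the question: ten lines without a comma, then one line with a comma.

```python
import io

import pandas as pd

# Made-up input: ten single-field lines, then one line containing a comma.
text = "\n".join(f"line{i}" for i in range(10)) + "\nleft,right\n"

error_message = ""
try:
    pd.read_csv(io.StringIO(text))
except pd.errors.ParserError as exc:
    # The parser saw 1 field per line until the line with the comma.
    error_message = str(exc)
    print(error_message)

# Skipping the malformed line lets the parse finish, so the rest can be inspected.
df = pd.read_csv(io.StringIO(text), on_bad_lines="skip")
print(df.head())
```

Printing df.head() after skipping bad lines is a quick way to see what the parser is actually being fed.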
I tried that. My results are confusing, to say the least. I'll add them in an update.
So, it's a wholly different issue: you're not reading any CSV, you're just reading the HTML from the site that you see when you try to access your URL without being signed in. I.e., you'll need to add authentication somehow. Not sure if that is possible with pd.read_csv. Maybe have a look here. Or maybe just download the file and open it locally, if that's possible. Easiest for sure.
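A rough sketch of a guard against this situation: check whether the downloaded text looks like an HTML page before handing it to pd.read_csv. The helper names looks_like_html and read_csv_text are hypothetical, and the check is a simple heuristic. Fetching the text (with whatever authentication the site needs) is left out here.

```python
import io

import pandas as pd


def looks_like_html(text: str) -> bool:
    """Heuristic: True if the payload starts like an HTML page, not CSV."""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def read_csv_text(text: str) -> pd.DataFrame:
    """Parse CSV text, failing early if it is really an HTML page."""
    if looks_like_html(text):
        raise ValueError("Got an HTML page (possibly a login page), not CSV data")
    # The first, unnamed column in the question's CSV is the row index.
    return pd.read_csv(io.StringIO(text), index_col=0)


# Two rows copied from the raw CSV shown in the question.
csv_text = (
    ",Name,Gender,Year,Measure,Value,first_letter\n"
    "55512,A'Idah,girl,2021,Count,0,A\n"
    "55513,A'Isha,girl,2021,Count,3,A\n"
)
df = read_csv_text(csv_text)
print(df)
```

Once the file is downloaded and saved locally, the same pd.read_csv call with a local path avoids the login page entirely.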
The error message you're encountering indicates that pd.read_csv expected a single field per line but encountered two fields in line 11, causing the parsing error.
The error occurs because the default delimiter for pd.read_csv is a comma (,), and your CSV file seems to use a different delimiter, possibly a tab (\t) or a semicolon (;).
You can resolve this issue by specifying the correct delimiter when reading the CSV file. If it's a tab delimiter, you can use the sep parameter like this:
import pandas as pd
# Specify the tab delimiter
df = pd.read_csv('https://www.pythonanywhere.com/user/elksie5000/files/home/elksie5000/girls_long_count.csv', sep='\t')
Or, if it's a semicolon delimiter:
import pandas as pd
# Specify the semicolon delimiter
df = pd.read_csv('https://www.pythonanywhere.com/user/elksie5000/files/home/elksie5000/girls_long_count.csv', sep=';')
This should parse the CSV file with the specified delimiter, provided it matches the one the file actually uses, and the ParserError should no longer occur.
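Rather than guessing, one way to check which delimiter a file uses is the standard library's csv.Sniffer. A small sketch, using a few rows copied from the raw CSV in the question as the sample. The candidate list ",;\t" is an assumption about which delimiters are worth testing.

```python
import csv

# A few rows copied from the raw CSV shown in the question.
sample = (
    ",Name,Gender,Year,Measure,Value,first_letter\n"
    "55512,A'Idah,girl,2021,Count,0,A\n"
    "55513,A'Isha,girl,2021,Count,3,A\n"
    "55514,A'Ishah,girl,2021,Count,4,A\n"
)

# Restrict the sniffer to the delimiters under discussion.
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t")
detected = dialect.delimiter
print(repr(detected))
```

The detected value can then be passed as sep= to pd.read_csv. pandas can also do this itself with sep=None and engine='python'.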