I am trying to retrieve data from an open online .csv file: http://www.anp.gov.br/arquivos/acesso-informacao/dp/2020-producao-mar.csv
I am using Anaconda + Spyder + Pandas. The commands I am using are:
import pandas as pd
FileList = ['http://www.anp.gov.br/arquivos/acesso-informacao/dp/2020-producao-mar.csv']
arq1 = FileList[0]
df1 = pd.read_csv(arq1, quotechar = '"')
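Pandas can read the file, but it does not parse the rows correctly. The rows it fails to parse are the ones that contain data inside double quotes, for example: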
'2020,01/2020,Bahia,Camamu,MANATI,7-MNT-3-BAS,Mar,PLATAFORMA DE MANATI 1,0,"241,729",0,"12257,70101","61,573",,,,,,,,'
I also tried this approach:
file1 = pd.read_csv(arq1,sep=',\s*',skipinitialspace=True,quoting=csv.QUOTE_ALL,engine='python')
But this second approach raises the following error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 212: character maps to <undefined>
Can you give me some advice?
The first rows of the csv file:
Ano,Mês/Ano,Estado,Bacia,Campo,Poço,Ambiente,Instalação,Produção de Óleo (m³),Produção de Condensado (m³),Produção de Gás Associado (Mm³),Produção de Gás Não Associado (Mm³),Produção de Água (m³),Injeção de Gás (Mm³),Injeção de Água para Recuperação Secundária (m³),Injeção de Água para Descarte (m³),Injeção de Gás Carbônico (Mm³),Injeção de Nitrogênio (Mm³),Injeção de Vapor de Água (t),Injeção de Polímeros (m³),Injeção de Outros Fluidos (m³)
2020,01/2020,Alagoas,Alagoas,PARU,4-ALS-39-AL,Mar,Não Informado,0,0,0,0,0,,,,,,,,
"2020,01/2020,Bahia,Camamu,MANATI,7-MNT-1-BAS,Mar,PLATAFORMA DE MANATI 1,0,""265,58"",0,""17605,52003"",""74,489"",,,,,,,,"
"2020,01/2020,Bahia,Camamu,MANATI,7-MNT-2-BAS,Mar,PLATAFORMA DE MANATI 1,0,""326,366"",0,""17810,97775"",""84,152"",,,,,,,,"
"2020,01/2020,Bahia,Camamu,MANATI,7-MNT-3-BAS,Mar,PLATAFORMA DE MANATI 1,0,""241,729"",0,""12257,70101"",""61,573"",,,,,,,,"
"2020,01/2020,Bahia,Camamu,MANATI,7-MNT-4-BAS,Mar,PLATAFORMA DE MANATI 1,0,""285,911"",0,""17013,25742"",""88,015"",,,,,,,,"
"2020,01/2020,Bahia,Camamu,MANATI,7-MNT-5D-BAS,Mar,PLATAFORMA DE MANATI 1,0,""173,078"",0,""20459,1769"",""68,169"",,,,,,,,"
"2020,01/2020,Bahia,Camamu,MANATI,7-MNT-6D-BAS,Mar,PLATAFORMA DE MANATI 1,0,""178,857"",0,""24557,04732"",""75,546"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-173D-BA,Mar,Estação Pedra Branca,""95,742"",0,""82,24558"",0,""0,194"",,,,,,,,"
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-174D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-197D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-201D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-202D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-203D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-211D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Bahia,Recôncavo,CANDEIAS,7-C-212D-BA,Mar,Não Informado,0,0,0,0,0,,,,,,,,
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO,7-DJM-854H-BAS,Mar,Estação Marapé (Dom João Mar),""388,00158"",0,""3,10388"",0,""3221,81179"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO,7-DJM-856H-BAS,Mar,Estação Marapé (Dom João Mar),""318,49041"",0,""2,54778"",0,""4814,03179"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO,7-DJM-857H-BAS,Mar,Estação Marapé (Dom João Mar),""149,19484"",0,""1,19341"",0,""2641,14209"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO,8-DJ-811H-BAS,Mar,Estação Marapé (Dom João Mar),,,,,,0,""5816,23328"",0,0,0,0,0,0"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO,8-DJM-858H-BAS,Mar,Estação Marapé (Dom João Mar),,,,,,0,""5396,07916"",0,0,0,0,0,0"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO,8-DJM-881H-BAS,Mar,Estação Marapé (Dom João Mar),""196,46254"",0,""1,57155"",0,""2268,57935"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,7-DJM-854H-BAS,Mar,Estação Marapé (Dom João Mar),""56,69942"",0,""0,45345"",0,""470,80921"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,7-DJM-856H-BAS,Mar,Estação Marapé (Dom João Mar),""46,54159"",0,""0,37222"",0,""703,48321"",,,,,,,,"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,7-DJM-857H-BAS,Mar,Estação Marapé (Dom João Mar),""21,80216"",0,""0,17426"",0,""385,95491"",,,,,,,,"
2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,7-DJM-882H-BAS,Mar,Não Informado,0,0,0,0,0,,,,,,,,
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,8-DJ-811H-BAS,Mar,Estação Marapé (Dom João Mar),,,,,,0,""849,93672"",0,0,0,0,0,0"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,8-DJM-858H-BAS,Mar,Estação Marapé (Dom João Mar),,,,,,0,""788,53884"",0,0,0,0,0,0"
"2020,01/2020,Bahia,Recôncavo,DOM JOÃO MAR,8-DJM-881H-BAS,Mar,Estação Marapé (Dom João Mar),""28,70946"",0,""0,22956"",0,""331,51165"",,,,,,,,"
2020,01/2020,Ceará,Ceará,ATUM,3-AT-8-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,3-CES-83-CE,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,3-CES-86D-CE,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-10D-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-13D-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-16D-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-17D-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-18D-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-19D-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
2020,01/2020,Ceará,Ceará,ATUM,7-AT-21DP-CES,Mar,Não Informado,0,0,0,0,0,,,,,,,,
"2020,01/2020,Ceará,Ceará,ATUM,7-AT-22DP-CES,Mar,PLATAFORMA DE ATUM 2,""328,927"",0,""23,30745"",0,""761,164"",,,,,,,,"
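Best answer
Some rows are wrapped entirely in "...". The enclosing '"' has been removed from the 10735 rows affected.
Some values are in 2 sets of double quotes (e.g. ""...""). This has been fixed, because it causes rows of uneven length.
Use decimal=',' so that the numbers are parsed correctly as float type.
import pandas as pd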
# file location
file = 'e:/PythonProjects/stack_overflow/data/2020-producao-mar.csv'
# open the original file to read and open a file to write to
with open(file, mode='r', encoding='utf-8') as f, open('cleaned.csv', mode='w', encoding='utf-8') as f1:
    # read the lines
    lines = f.readlines()
    # parse each line
    for line in lines:
        # remove surrounding whitespace, including the trailing newline
        line = line.strip()
        # find rows with beginning and ending quotes
        if line.startswith('"') and line.endswith('"'):
            # extract the string between the beginning and end quote
            line = line[1:-1]
        # replace 2 double quotes for 1 double quote
        line = line.replace('""', '"')
        # write the rows to cleaned.csv
        f1.write(line + '\n')
# create a dataframe from cleaned.csv
df1 = pd.read_csv('cleaned.csv', encoding='utf-8', quotechar='"', decimal=',')
# display(df1.head())
Ano Mês/Ano Estado Bacia Campo Poço Ambiente Instalação Produção de Óleo (m³) Produção de Condensado (m³) Produção de Gás Associado (Mm³) Produção de Gás Não Associado (Mm³) Produção de Água (m³) Injeção de Gás (Mm³) Injeção de Água para Recuperação Secundária (m³) Injeção de Água para Descarte (m³) Injeção de Gás Carbônico (Mm³) Injeção de Nitrogênio (Mm³) Injeção de Vapor de Água (t) Injeção de Polímeros (m³) Injeção de Outros Fluidos (m³)
0 2020 01/2020 Alagoas Alagoas PARU 4-ALS-39-AL Mar Não Informado 0.0 0.000 0.0 0.00000 0.000 NaN NaN NaN NaN NaN NaN NaN NaN
1 2020 01/2020 Bahia Camamu MANATI 7-MNT-1-BAS Mar PLATAFORMA DE MANATI 1 0.0 265.580 0.0 17605.52003 74.489 NaN NaN NaN NaN NaN NaN NaN NaN
2 2020 01/2020 Bahia Camamu MANATI 7-MNT-2-BAS Mar PLATAFORMA DE MANATI 1 0.0 326.366 0.0 17810.97775 84.152 NaN NaN NaN NaN NaN NaN NaN NaN
3 2020 01/2020 Bahia Camamu MANATI 7-MNT-3-BAS Mar PLATAFORMA DE MANATI 1 0.0 241.729 0.0 12257.70101 61.573 NaN NaN NaN NaN NaN NaN NaN NaN
4 2020 01/2020 Bahia Camamu MANATI 7-MNT-4-BAS Mar PLATAFORMA DE MANATI 1 0.0 285.911 0.0 17013.25742 88.015 NaN NaN NaN NaN NaN NaN NaN NaN
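The same cleaning steps can also be done in memory, without writing cleaned.csv. The following is only a sketch under a few assumptions: the helper names unwrap_line and read_wrapped_csv are made up for illustration, the sample is a shortened version of the rows shown in the question, and the doubled quotes are collapsed only on rows that were wrapped.

```python
import io

import pandas as pd


def unwrap_line(line: str) -> str:
    """Strip one layer of enclosing quotes and collapse doubled quotes.

    Hypothetical helper: rows that are not wrapped in "..." are returned unchanged.
    """
    line = line.strip()
    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
        line = line[1:-1].replace('""', '"')
    return line


def read_wrapped_csv(raw: bytes, encoding: str = 'utf-8') -> pd.DataFrame:
    """Decode the raw bytes, clean every row, and parse the result with pandas."""
    text = raw.decode(encoding)
    cleaned = '\n'.join(unwrap_line(line) for line in text.splitlines())
    # decimal=',' makes quoted values such as "265,58" parse as floats
    return pd.read_csv(io.StringIO(cleaned), quotechar='"', decimal=',')


# shortened sample in the same shape as the file in the question
sample = (
    'Ano,Campo,Produção de Condensado (m³),Produção de Água (m³)\n'
    '2020,PARU,0,0\n'
    '"2020,MANATI,""265,58"",""74,489"""\n'
).encode('utf-8')

df = read_wrapped_csv(sample)
print(df.dtypes)
```

Decoding explicitly as UTF-8 is also what avoids the UnicodeDecodeError from the question. The 'Á' in the header is stored as the bytes C3 81 in UTF-8, and byte 0x81 is undefined in cp1252, which is a common default codec on Windows. For the real file, the bytes could be read from disk with open(file, 'rb').read() and passed to read_wrapped_csv.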
Regarding python - data in csv format enclosed in double quotes, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64880047/