
python-3.x - Load csv data from a URL directly into a mysql table after cleaning

Reposted · Author: 行者123 · Updated: 2023-12-04 09:04:38

I want to load the data given at the URL "https://www.treasury.gov/ofac/downloads/sdn.csv" directly into a table named sdn.
The only change I want to make is to replace every '-0- ' value, in all columns that contain it, with ''.
I tried to do this with Pandas, but my approach doesn't look clean.

import requests
import pandas as pd


sdnURL = "https://www.treasury.gov/ofac/downloads/sdn.csv"
altURL = "https://www.treasury.gov/ofac/downloads/alt.csv"
addURL = "https://www.treasury.gov/ofac/downloads/add.csv"
sdnCommentsURL = "https://www.treasury.gov/ofac/downloads/sdn_comments.csv"

sdnHeader = ["sdn_id", "sdn_name", "sdn_type", "program", "title", "call_sign", "vessel_type", "tonnage", "gross_tonnage", "vessel_flag", "vessel_owner", "remarks"]
altHeader = ["sdn_id", "alt_id", "alt_type", "alt_name", "remarks"]
addHeader = ["sdn_id", "address_id", "address", "city_state_post", "country", "remarks"]
sdnCommentsHeader = ["sdn_id", "remarks"]


sdn = pd.read_csv(sdnURL, names = sdnHeader, header = None)
alt = pd.read_csv(altURL, names = altHeader, header = None)
add = pd.read_csv(addURL, names = addHeader, header = None)
sdnComments = pd.read_csv(sdnCommentsURL, names = sdnCommentsHeader, header = None)

sdn.to_csv('sdn.csv', index = False)
alt.to_csv('alt.csv', index = False)
add.to_csv('add.csv', index = False)
sdnComments.to_csv('sdnComments.csv', index = False)
Also, I plan to load the csv files into mysql tables.
There are two problems with my approach -
  • I don't want to write the same commands for every file.
  • I want to replace '-0- ' across all columns in one go.
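On the second point, calling `DataFrame.replace` on the whole frame (no column selector) replaces matching cell values in every column at once. Note that a plain string pattern only matches when the entire cell equals '-0- '; pass `regex=True` to replace substrings inside longer values. A minimal sketch with made-up data:

```python
import pandas as pd

# Toy frame standing in for the OFAC data; '-0- ' marks empty fields
df = pd.DataFrame({
    "call_sign": ["-0- ", "KABC"],
    "tonnage": ["1200", "-0- "],
})

# One call cleans every column: cells whose whole value is '-0- ' become ''
cleaned = df.replace("-0- ", "")

# For substring matches inside longer cells, use regex=True instead:
# df.replace("-0- ", "", regex=True)
print(cleaned["call_sign"].tolist())  # → ['', 'KABC']
```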

  • Final edit: Thanks to @Jimmar's answer, I finally ended up with code like this -

    import requests
    import pandas as pd

    files = {
        "sdn": ["sdn_id", "sdn_name", "sdn_type", "program", "title", "call_sign", "vessel_type", "tonnage", "gross_tonnage", "vessel_flag", "vessel_owner", "remarks"],
        "alt": ["sdn_id", "alt_id", "alt_type", "alt_name", "remarks"],
        "add": ["sdn_id", "address_id", "address", "city_state_post", "country", "remarks"],
        "sdn_comments": ["sdn_id", "remarks"]
    }

    def fetch_csv(file, headers):
        df = pd.read_csv("https://www.treasury.gov/ofac/downloads/" + file + ".csv", names=headers, header=None)
        df = df.replace('-0- ', '')
        df.to_csv(file + '.csv', index=False)

    for file, headers in files.items():
        fetch_csv(file, headers)

Best Answer

You can organize your code this way (I'll only do 2 of them)

import requests
import pandas as pd

def fetch_csv(url, headers, file_name):
    df = pd.read_csv(url, names=headers, header=None)
    df = df.replace('-0- ', '')
    df.to_csv(file_name, index=False)

sources = [
    {
        "url": "https://www.treasury.gov/ofac/downloads/sdn.csv",
        "headers": ["sdn_id", "sdn_name", "sdn_type", "program", "title", "call_sign", "vessel_type", "tonnage", "gross_tonnage", "vessel_flag", "vessel_owner", "remarks"],
        "file_name": "sdn.csv"
    },
    {
        "url": "https://www.treasury.gov/ofac/downloads/alt.csv",
        "headers": ["sdn_id", "alt_id", "alt_type", "alt_name", "remarks"],
        "file_name": "alt.csv"
    }  # add the rest in the same pattern
]

for source in sources:
    fetch_csv(source['url'], source['headers'], source['file_name'])
If you need to write it to a database, you should replace the df.to_csv line with to_sql.
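A minimal sketch of that to_sql swap, writing one cleaned frame to a table through an SQLAlchemy engine. The SQLite in-memory URL below is used only so the example runs without a database server; the MySQL URL shown in the comment, with its user/password placeholders, is an assumption for illustration, not from the original post.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite engine so the sketch is self-contained; for MySQL you
# would pass a URL like "mysql+pymysql://user:password@localhost/ofac"
# (placeholder credentials, adjust to your setup).
engine = create_engine("sqlite://")

# Toy frame standing in for one of the cleaned OFAC files
df = pd.DataFrame({"sdn_id": [1, 2], "sdn_name": ["FOO", "BAR"]})

# Replaces the df.to_csv(...) line: one table per file, recreated each run
df.to_sql("sdn", engine, if_exists="replace", index=False)

# Read back to verify the rows landed
back = pd.read_sql("SELECT sdn_id, sdn_name FROM sdn", engine)
print(len(back))  # → 2
```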

Regarding python-3.x - Load csv data from a URL directly into a mysql table after cleaning, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63476929/
