
python - Storing JSON responses from iterating over multiple URLs into a DataFrame

Reposted · Author: 太空宇宙 · Updated: 2023-11-04 04:18:22

I have dynamic API URLs, each of which returns response data as JSON like the following.

{
  "@type": "connection",
  "id": "001ZOZ0B00000000006Z",
  "orgId": "001ZOZ",
  "name": "WWW3",
  "description": "Test connection2",
  "createTime": "2018-07-20T18:28:05.000Z",
  "updateTime": "2018-07-20T18:28:53.000Z",
  "createdBy": "xx.xx@xx.com.dev",
  "updatedBy": "xx.xx@xx.com.dev",
  "agentId": "001ZOZ08000000000007",
  "runtimeEnvironmentId": "001ZOZ25000000000007",
  "instanceName": "ShareConsumer",
  "shortDescription": "Test connection2",
  "type": "TOOLKIT",
  "port": 0,
  "majorUpdateTime": "2018-07-20T18:28:05.000Z",
  "timeout": 60,
  "connParams": {
    "WSDL URL": "https://xxxservices1.work.com/xxx/service/xxport2/n5/Integration%20System/API__Data?wsdl",
    "Must Understand": "true",
    "DOMAIN": "n5",
    "agentId": "001ZOZ0800XXX0007",
    "agentGroupId": "001ZOZ25000XXX0007",
    "AUTHENTICATION_TYPE": "Auto",
    "HTTP Password": "********",
    "Encrypt password": "false",
    "orgId": "001Z9Z",
    "PRIVATE_KEY_FILE": "",
    "KEY_FILE_TYPE": "PEM",
    "mode": "UPDATE",
    "CERTIFICATE_FILE_PASSWORD": null,
    "CERTIFICATE_FILE": null,
    "TRUST_CERTIFICATES_FILE": null,
    "Username": "xxx@xxx",
    "CERTIFICATE_FILE_TYPE": "PEM",
    "KEY_PASSWORD": null,
    "TIMEOUT": "60",
    "Endpoint URL": "https://wxxservices1.xx.com/xxx/service/xxport2/n5/Integration%20System/API__Data",
    "connectionTypes": "NOAUTH",
    "HTTP Username": "API@n5",
    "Password": "********"
  }
}
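As an aside for readers: a nested response like the one above flattens cleanly with pandas' json_normalize, with keys under connParams becoming dotted column names. A minimal sketch using a trimmed, made-up sample of the payload (pd.json_normalize is the current import path):

```python
import pandas as pd

# Trimmed, hypothetical sample of one connection response.
sample = {
    "@type": "connection",
    "id": "001ZOZ0B00000000006Z",
    "name": "WWW3",
    "connParams": {
        "DOMAIN": "n5",
        "TIMEOUT": "60",
    },
}

flat = pd.json_normalize(sample)
# Nested keys are flattened into dotted column names.
print(sorted(flat.columns))
# → ['@type', 'connParams.DOMAIN', 'connParams.TIMEOUT', 'id', 'name']
```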

Now, the thing to note is that I have about 50 URLs returning this type of JSON data. I iterate over them with the code below, but I am unable to store each URL's response in a pandas dataframe — only the last response ends up stored there.

I also want to convert the whole dataframe to CSV.

What is the best way to append each URL's response to a dataframe and then convert it to CSV?

The Python code is as follows:

import requests
import json
import pandas as pd
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated

# This CSV file holds the connection IDs; we iterate over it to build each
# URL and fetch the JSON data for that connection.
df = pd.read_csv('ConnID.csv', delimiter=',')



user_iics_loginURL = 'https://xx-us.xxx.com/ma/api/v2/user/login'

headers = {
    'Content-Type': "application/json",
    'Accept': "application/json",
    'cache-control': "no-cache"
}

payload = {
    "@type": "login",
    "username": "xx@xx.com.xx",
    "password": "xxxx"
}

response = requests.post(user_iics_loginURL, json=payload, headers=headers)
resp_obj = response.json()
session_id = resp_obj['SessionId']
server_URL = resp_obj['serverUrl']
print(session_id)
Finaldf = pd.DataFrame()
for index, row in df.iterrows():
    api_ver = "/api/v2/connection/" + row['id']
    # e.g. https://xx-us.xxx.com/saas/api/v2/connection/001ZOZ0B000000000066
    conndetails_url = server_URL + api_ver
    print(conndetails_url)
    act_headers = {
        'icSessionId': session_id,
        'Content-Type': "application/json",
        'cache-control': "no-cache",
    }
    act_response = requests.get(conndetails_url.strip(), headers=act_headers)
    print(act_response.text)
    print("Creating Data Frame on this***********************")
    act_json_data = json.loads(act_response.text)
    flat_json = json_normalize(act_json_data)
    print(flat_json)
    Conndf = pd.DataFrame(flat_json)

    Finaldf.append(Conndf)
Finaldf.to_csv('NewTest.csv')

Best Answer

The first thing I noticed is this:

flat_json = json_normalize(act_json_data)
print(flat_json)
Conndf = pd.DataFrame(flat_json)

When you do flat_json = json_normalize(act_json_data), flat_json is already a dataframe. Doing Conndf = pd.DataFrame(flat_json) is unnecessary and redundant; it doesn't cause a problem, but it is extra code you don't need.
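A quick check confirms this (a sketch; in newer pandas the function is imported as pd.json_normalize rather than from pandas.io.json):

```python
import pandas as pd

flat_json = pd.json_normalize({"id": "001", "type": "TOOLKIT"})
# json_normalize already returns a DataFrame, so wrapping it again is a no-op.
assert isinstance(flat_json, pd.DataFrame)
assert pd.DataFrame(flat_json).equals(flat_json)
```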

The second thing is the actual problem: when appending to a dataframe, you need to assign the result back to it. So change:

Finaldf.append(Conndf)

to:

Finaldf = Finaldf.append(Conndf)

I also reset the index, simply out of habit whenever I append dataframes:

Finaldf = Finaldf.append(Conndf).reset_index(drop=True)

Apart from that one line, it looks fine, and you should use Finaldf.to_csv('NewTest.csv') to save the complete dataframe to a CSV.
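One caveat for anyone reading this with a recent pandas: DataFrame.append was deprecated in 1.4 and removed in 2.0. An equivalent, version-proof pattern is to collect the per-response frames in a list inside the loop and concatenate once at the end; a sketch with stand-in response dicts in place of the real API calls:

```python
import pandas as pd

# Stand-ins for the act_json_data returned by each connection URL.
responses = [
    {"id": "001ZOZ0B00000000006Z", "name": "WWW3"},
    {"id": "001ZOZ0B00000000007A", "name": "WWW4"},
]

frames = [pd.json_normalize(resp) for resp in responses]

# Concatenate once, instead of appending row-by-row inside the loop.
Finaldf = pd.concat(frames, ignore_index=True)
Finaldf.to_csv('NewTest.csv', index=False)
print(len(Finaldf))  # → 2
```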

Regarding python - Storing JSON responses from iterating over multiple URLs into a DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55000334/
