
Python web scraping: TypeError: string indices must be integers

Reposted. Author: 行者123. Updated: 2023-12-01 01:54:58

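I am trying to scrape a website and convert the data to a CSV file for practice. When I reach the point of collecting the data and storing it in variables, I get this error: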

TypeError: string indices must be integers

on this line:

 email = address['email'].strip()

I want it to collect all the data to be written to the CSV file. The full code is below:

from urllib.request import urlopen as uReq
import json
import re
import csv

my_url = 'https://www.haart.co.uk/umbraco/api/branches/getsales/HRT'
uClient = uReq(my_url)
page_json = uClient.read()
uClient.close()
records = []
filename = 'haartscrape.csv'

addresses = json.loads(page_json)

for address in addresses:
    headline = address['headline']
    address = re.sub(r'\<.*?\>', '', address['address'])
    email = address['email'].strip()
    tel = address['telephone']


    records.append({'Name':headline, 'Address':address, 'Email': email, 'Telephone':tel})

with open(filename, 'w') as f:
    writer = csv.DictWriter(f, ['Name', 'Address', 'Email', 'Telephone'])
    writer.writeheader()
    for r in records:
        writer.writerow(r)

Full traceback:

Traceback (most recent call last):
  File "haart_webscrape.py", line 18, in <module>
    email = address['email'].strip()
TypeError: string indices must be integers

Any help is appreciated. Thank you in advance.

Best answer

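Inside the loop you reassign address, the variable that holds the current JSON element, to a string. The later address['email'] therefore indexes a string instead of a dict: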

for address in addresses:
    headline = address['headline']
    address = re.sub(r'\<.*?\>', '', address['address'])  # here: address is now a str

Rename either the loop variable or the variable that holds the cleaned address.
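A minimal sketch of the rename, using a small in-memory list in place of the live API response. The sample record and the names `branch`, `clean_address`, `reproduce_error`, and `build_records` are assumptions for illustration; only the keys `headline`, `address`, `email`, and `telephone` come from the question.

```python
import re

# Hypothetical sample data standing in for json.loads(page_json).
addresses = [
    {
        'headline': 'Example Branch',
        'address': '<p>1 High Street</p>',
        'email': ' branch@example.com ',
        'telephone': '01234 567890',
    },
]


def reproduce_error(items):
    """Mimic the original loop and return the TypeError message, if any."""
    for address in items:
        # The loop variable is overwritten with a str here...
        address = re.sub(r'<.*?>', '', address['address'])
        try:
            # ...so indexing it with a string key raises TypeError.
            address['email']
        except TypeError as exc:
            return str(exc)
    return None


def build_records(items):
    """Same loop with distinct names for the dict and the cleaned string."""
    records = []
    for branch in items:
        clean_address = re.sub(r'<.*?>', '', branch['address'])
        records.append({
            'Name': branch['headline'],
            'Address': clean_address,
            'Email': branch['email'].strip(),
            'Telephone': branch['telephone'],
        })
    return records


error_message = reproduce_error(addresses)
records = build_records(addresses)
```

Because `branch` is never reassigned, it remains a dict for every lookup inside the loop.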

Or build each row directly while writing, so that address is never reassigned:

with open(filename, 'w') as f:
    writer = csv.DictWriter(f, ['Name', 'Address', 'Email', 'Telephone'])
    writer.writeheader()
    for address in addresses:
        # Build the row from the dict without rebinding address
        r = {
            'Name': address['headline'],
            'Address': re.sub(r'\<.*?\>', '', address['address']),
            'Email': address['email'].strip(),
            'Telephone': address['telephone']}
        writer.writerow(r)
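This approach can be tried without network access. The sketch below assumes each element of `addresses` is a dict with the four keys used in the question; the sample record and the helper name `write_branches` are hypothetical. It writes to any text file object. When opening a real file, the `csv` module documentation recommends passing `newline=''` to `open`.

```python
import csv
import io
import re

FIELDS = ['Name', 'Address', 'Email', 'Telephone']


def write_branches(addresses, f):
    """Write one CSV row per branch dict to the text file object f."""
    writer = csv.DictWriter(f, FIELDS)
    writer.writeheader()
    for branch in addresses:
        writer.writerow({
            'Name': branch['headline'],
            # Strip HTML tags from the address field.
            'Address': re.sub(r'<.*?>', '', branch['address']),
            'Email': branch['email'].strip(),
            'Telephone': branch['telephone'],
        })


# Hypothetical record; the real data comes from json.loads(page_json).
sample = [
    {
        'headline': 'Example Branch',
        'address': '<p>1 High Street</p>',
        'email': ' branch@example.com ',
        'telephone': '01234 567890',
    },
]

buffer = io.StringIO()
write_branches(sample, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, something like `with open(filename, 'w', newline='') as f: write_branches(addresses, f)` should work with the variables from the question.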

For more on "Python web scraping: TypeError: string indices must be integers", see the similar question on Stack Overflow: https://stackoverflow.com/questions/50317869/
