
python - How to create a file and save scraped data in it?

Reprinted. Author: 太空宇宙. Updated: 2023-11-03 14:29:46

I wrote this script, and I have tried several options for saving the data, but I keep breaking the code. How can I save the extracted data to a CSV or Excel file?

import requests
from bs4 import BeautifulSoup

base_url = "http://www.privredni-imenik.com/pretraga?abcd=&keyword=&cities_id=0&category_id=0&sub_category_id=0&page=1"
current_page = 1

while current_page < 200:
    print(current_page)
    url = base_url + str(current_page)
    #current_page += 1
    r = requests.get(url)
    zute_soup = BeautifulSoup(r.text, 'html.parser')
    firme = zute_soup.findAll('div', {'class': 'jobs-item'})

    for title in firme:
        title1 = title.findAll('h6')[0].text
        print(title1)
        adresa = title.findAll('div', {'class': 'description'})[0].text
        print(adresa)
        kontakt = title.findAll('div', {'class': 'description'})[1].text
        print(kontakt)
        print('\n')
        page_line = "{title1}\n{adresa}\n{kontakt}".format(
            title1=title1,
            adresa=adresa,
            kontakt=kontakt
        )
    current_page += 1

Best answer

A simple way to get a CSV is to print each row with comma-separated fields and use your operating system's ">" operator to redirect the output into a file. A more robust option is Python's built-in csv module:
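For the print-and-redirect idea, a minimal sketch (the field values below are my own placeholders, not taken from the site). Using csv.writer on stdout instead of a plain print means fields that themselves contain commas are quoted correctly:

```python
import csv
import io

# Build one CSV row in memory, then print it; in a real script you
# would print many rows and run:  python scrape.py > scrape_results.csv
buf = io.StringIO()
writer = csv.writer(buf)
# The first field contains a comma, so csv.writer quotes it for us.
writer.writerow(["Firma, d.o.o.", "Ulica 1", "tel: 000"])
print(buf.getvalue(), end="")
```

Note that csv.writer terminates rows with "\r\n" by default, which is why the answer below opens the output file with newline=''.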

import csv
import requests
from bs4 import BeautifulSoup

base_url = "http://www.privredni-imenik.com/pretraga?abcd=&keyword=&cities_id=0&category_id=0&sub_category_id=0&page=1"
current_page = 1


with open('scrape_results.csv', 'w', newline='') as scrape_results:
    csvwriter = csv.writer(scrape_results)

    while current_page < 200:
        url = base_url + str(current_page)
        r = requests.get(url)
        zute_soup = BeautifulSoup(r.text, 'html.parser')
        firme = zute_soup.findAll('div', {'class': 'jobs-item'})

        for title in firme:
            title1 = title.findAll('h6')[0].text
            adresa = title.findAll('div', {'class': 'description'})[0].text
            kontakt = title.findAll('div', {'class': 'description'})[1].text
            csvwriter.writerow([current_page, title1, adresa, kontakt])

        current_page += 1
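Since the question also mentions Excel: a CSV file with a header row opens cleanly in Excel. A minimal sketch of the same writing step using csv.DictWriter (the column names and example row are my own assumptions, not from the original script):

```python
import csv

# Hypothetical column names for the scraped fields.
fieldnames = ["page", "title", "address", "contact"]

with open("scrape_results.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()  # header row, shown as column titles in Excel
    # Inside the scraping loop you would call writer.writerow(...)
    # once per result; one example row:
    writer.writerow({
        "page": 1,
        "title": "Example firm",
        "address": "Example street 1",
        "contact": "tel: 000-000",
    })
```

DictWriter makes each row self-describing, so reordering or adding columns later only requires changing fieldnames.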

About "python - How to create a file and save scraped data in it?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47382567/
