
Problem scraping a specific website

Reposted. Author: bug小助手. Updated: 2023-10-25 17:12:23



I am still learning web scraping, and I am writing code to scrape a specific website and access its content. The website is:


"https://tenders.etimad.sa/Qualification/QualificationsForVisitor"



However, the code returns an error for this website, while for another website:


"https://wuzzuf.net/search/jobs/?q=&a=hpb"



the code runs correctly.
I asked ChatGPT, and it replied that the website prevents anyone from accessing its content. However, the application ScrapeStorm succeeds in reaching the website's content.


So am I making a mistake? What is happening here, and how can I access the website's content?


import requests
from bs4 import BeautifulSoup
import csv
from itertools import zip_longest

#link of website
result = requests.get("https://tenders.etimad.sa/Qualification/QualificationsForVisitor")
src = result.content
soup = BeautifulSoup(src,"lxml")
print(soup)

Recommended answer

You need to pass verify=False to requests.get, which turns off TLS certificate verification for the request. Here you go:


import requests
from bs4 import BeautifulSoup
import csv
from itertools import zip_longest

#link of website
result = requests.get("https://tenders.etimad.sa/Qualification/QualificationsForVisitor", verify=False)
src = result.content
soup = BeautifulSoup(src, "lxml")
print(soup)
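
Since verify=False switches certificate checking off for every call it is passed to, it may be worth limiting it to the case where verification actually fails. The sketch below assumes the original error was a requests.exceptions.SSLError, which the question does not state. The helper name fetch_html and its get parameter are hypothetical and exist only for illustration.

```python
import requests
import urllib3


def fetch_html(url, get=requests.get, timeout=30):
    """Return (html_bytes, verified) for url.

    Tries a normal, certificate-verified request first and falls back to
    verify=False only if that attempt fails with an SSL error.
    """
    try:
        response = get(url, timeout=timeout)
        verified = True
    except requests.exceptions.SSLError:
        # Assumption: the failure is a certificate problem on the server side.
        # Retry without verification and silence the InsecureRequestWarning
        # that urllib3 would otherwise emit (this setting is process-wide).
        urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
        response = get(url, timeout=timeout, verify=False)
        verified = False
    response.raise_for_status()  # surface HTTP errors such as 403 or 500
    return response.content, verified
```

Calling fetch_html("https://tenders.etimad.sa/Qualification/QualificationsForVisitor") returns the page bytes together with a flag saying whether the certificate was verified, so the bytes can be handed to BeautifulSoup exactly as in the code above.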

With verify=False, the script prints:


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<link href="/Etimad-UI/assets/imgs/favicon.ico" rel="icon" type="image/x-icon"/>
<link href="/Etimad-UI/assets/imgs/favicon.ico" rel="shortcut icon" type="image/x-icon"/>
<link href="/Etimad-UI/assets/imgs/favicon.ico" rel="icon"/>
<meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/>
<title>
جميع دعوات التأهيل
</title>
<meta content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0, shrink-to-fit=no" name="viewport"/>
<!-- Fonts and icons -->
<link href="/Etimad-UI/assets/css/font-awesome.min.css" rel="stylesheet"/>
<!-- CSS Just for demo purpose, don't include it in your project -->
<link href="/Etimad-UI/assets/scss/Etimad/etd-style.min.css" rel="stylesheet"/>
<link href="/Etimad-UI/assets/scss/Etimad/NewAddedStyle.min.css" rel="stylesheet"/>
<script src="/Etimad-UI/assets/js/jquery-3.6.0.min.js"></script>
<style>
*::-webkit-scrollbar-thumb {
background-color: #29ad6f;
}
</style>
</head>
<body class="landing-page sidebar-collapse RTL">
...
<script src="/Etimad-UI/assets/js/EUM-Files/adrum.js"></script>
</body>
</html>
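
Once the HTML arrives, individual pieces can be pulled out of the soup instead of printing the whole document. The following is a minimal sketch that runs offline on a shortened copy of the output shown above. The function name summarize and the constant SAMPLE_HTML are made up for this example, and it uses the built-in "html.parser" rather than "lxml" only so that it runs without the lxml package.

```python
from bs4 import BeautifulSoup

# A shortened copy of the response shown above, so the example runs offline.
SAMPLE_HTML = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<title>
جميع دعوات التأهيل
</title>
<link href="/Etimad-UI/assets/css/font-awesome.min.css" rel="stylesheet"/>
<link href="/Etimad-UI/assets/scss/Etimad/etd-style.min.css" rel="stylesheet"/>
</head>
<body class="landing-page sidebar-collapse RTL">
</body>
</html>"""


def summarize(html):
    """Return the page title and the stylesheet URLs found in html."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the document has no <title> element.
    title = soup.title.get_text(strip=True) if soup.title else None
    # BeautifulSoup stores rel as a list of values, hence the "in" check.
    stylesheets = [
        link["href"]
        for link in soup.find_all("link")
        if "stylesheet" in link.get("rel", []) and link.has_attr("href")
    ]
    return title, stylesheets


title, stylesheets = summarize(SAMPLE_HTML)
print(title)
print(stylesheets)
```

The same function can be applied to result.content from the request above. Which elements hold the qualification listings depends on the part of the body that is elided in the output, so the selectors for those are left open here.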
