
python - How to extract all links from a website using Python

Reposted. Author: 行者123. Updated: 2023-12-01 23:25:32

I wrote a script to extract links from a website, and it works well. Here is the source code:


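import requests
from bs4 import BeautifulSoup

# Fetch the page and parse the returned HTML
web = requests.get("https://www.google.com/")
soup = BeautifulSoup(web.text, 'lxml')

# Print the href of every anchor tag that has one
for link in soup.find_all('a', href=True):
    print(link['href'])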

## Output
https://www.google.com.sa/imghp?hl=ar&tab=wi
https://maps.google.com.sa/maps?hl=ar&tab=wl
https://www.youtube.com/?gl=SA&tab=w1
https://news.google.com/?tab=wn
https://mail.google.com/mail/?tab=wm
https://drive.google.com/?tab=wo
https://calendar.google.com/calendar?tab=wc
https://www.google.com.sa/intl/ar/about/products?tab=wh
http://www.google.com.sa/history/optout?hl=ar
/preferences?hl=ar
https://accounts.google.com/ServiceLogin?hl=ar&passive=true&continue=https://www.google.com/&ec=GAZAAQ
/search?safe=strict&ie=UTF-8&q=%D9%86%D9%88%D8%B1+%D8%A7%D9%84%D8%B4%D8%B1%D9%8A%D9%81&oi=ddle&ct=174786979&hl=ar&kgmid=/m/0562zv&sa=X&ved=0ahUKEwiq8feoiqDwAhUK8BQKHc7UD7oQPQgD
/advanced_search?hl=ar-SA&authuser=0
https://www.google.com/setprefs?sig=0_mwAqJUgnrqSouOmGk0UvVz7GgkY%3D&hl=en&source=homepage&sa=X&ved=0ahUKEwiq8feoiqDwAhUK8BQKHc7UD7oQ2ZgBCAU
/intl/ar/ads/
http://www.google.com/intl/ar/services/
/intl/ar/about.html
https://www.google.com/setprefdomain?prefdom=SA&prev=https://www.google.com.sa/&sig=K_e_0jdE_IjI-G5o1qMYziPpQwHgs%3D
/intl/ar/policies/privacy/
/intl/ar/policies/terms/
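
The output above mixes absolute URLs with site-relative paths such as /preferences?hl=ar. A minimal sketch of resolving them against the page URL with the standard library's urljoin follows; the helper name absolutize and the sample list are illustrative assumptions, not part of the original script.

```python
from urllib.parse import urljoin


def absolutize(base_url, hrefs):
    """Resolve each href against base_url, skipping empty and fragment-only values.

    Hypothetical helper for illustration; not part of the original script.
    """
    resolved = []
    for href in hrefs:
        # Skip missing hrefs and in-page placeholders such as "#"
        if not href or href.startswith("#"):
            continue
        # urljoin leaves absolute URLs unchanged and resolves relative paths
        resolved.append(urljoin(base_url, href))
    return resolved


# Sample hrefs taken from the output listing above
links = absolutize(
    "https://www.google.com/",
    ["/preferences?hl=ar", "https://news.google.com/?tab=wn", "/intl/ar/ads/"],
)
for url in links:
    print(url)
```

In the original loop, the same idea would amount to passing each link['href'] through urljoin together with the URL that was requested.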

The problem is that when I change the website to https://www.jarir.com/, it no longer works:

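import requests
from bs4 import BeautifulSoup

# Same script, pointed at a different site
web = requests.get("https://www.jarir.com/")
soup = BeautifulSoup(web.text, 'lxml')

# Print the href of every anchor tag that has one
for link in soup.find_all('a', href=True):
    print(link['href'])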

# Output
#

The only output is a single #.
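
A lone # suggests that the HTML requests received contained little more than a placeholder anchor. The source does not say why; possible causes, none of them confirmed here, include a page that builds its links with JavaScript or a server that sends different markup to non-browser clients. A first diagnostic step is to count the anchors in the HTML that was actually returned. The sketch below uses the standard library's HTMLParser rather than BeautifulSoup so that it runs without third-party packages; the names AnchorCollector and summarize_anchors and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href value (or None) of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; attrs is a list of (name, value) pairs
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def summarize_anchors(html):
    """Return the total anchor count and the hrefs that are not placeholders."""
    collector = AnchorCollector()
    collector.feed(html)
    usable = [h for h in collector.hrefs if h and not h.startswith("#")]
    return {"total": len(collector.hrefs), "usable": usable}


# Made-up markup standing in for a fetched page
sample = '<a href="#">Menu</a><a>No href</a><a href="/about">About</a>'
summary = summarize_anchors(sample)
print(summary)
```

Running summarize_anchors on the text of the fetched response would show whether the links are absent from the returned HTML itself, which would mean the parsing loop is not at fault.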
