python - How to get a favicon using Beautiful Soup and Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:29:28

I wrote some silly code just to learn, but it does not work for most websites. Here is the code:

import urllib2, re
from BeautifulSoup import BeautifulSoup as Soup

class Founder:
    def Find_all_links(self, url):
        page_source = urllib2.urlopen(url)
        a = page_source.read()
        soup = Soup(a)

        a = soup.findAll(href=re.compile(r'/.a\w+'))
        return a
    def Find_shortcut_icon (self, url):
        a = self.Find_all_links(url)
        b = ''
        for i in a:
            strre=re.compile('shortcut icon', re.IGNORECASE)
            m=strre.search(str(i))
            if m:
                b = i["href"]
        return b
    def Save_icon(self, url):
        url = self.Find_shortcut_icon(url)
        print url
        host = re.search(r'[0-9a-zA-Z]{1,20}\.[a-zA-Z]{2,4}', url).group()
        opener = urllib2.build_opener()
        icon = opener.open(url).read()
        file = open(host+'.ico', "wb")
        file.write(icon)
        file.close()
        print '%s icon successfully saved' % host
c = Founder()
print c.Save_icon('http://lala.ru')

The strangest part is that it works for these sites: http://habrahabr.ru http://5pd.ru

But it does not work for most of the other sites I checked.
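
One likely reason for the inconsistent results is the href filter in Find_all_links: the pattern r'/.a\w+' only matches an href that contains a slash, then any single character, then the letter "a". A small Python 3 check illustrates this. The sample href values and the helper name passes_filter are made up for illustration and are not taken from any of the sites mentioned.

```python
import re

# The pattern used in Find_all_links in the question
href_pattern = re.compile(r'/.a\w+')

# Hypothetical href values, chosen only for illustration
samples = ["/favicon.ico", "/static/img/icon.png", "/images/logo.ico"]


def passes_filter(href):
    """Return True if the question's pattern would keep this href."""
    return href_pattern.search(href) is not None


for href in samples:
    # Only "/favicon.ico" has "a" as the second character after a slash
    print(href, passes_filter(href))
```

An icon stored under a different path is therefore dropped before the 'shortcut icon' check runs. Separately, that check only looks for the literal text "shortcut icon", so a page that declares rel="icon" alone would also be skipped.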

Best answer

You are making this far more complicated than it needs to be. Here is a simple way to do it:

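import urllib
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3, as in the question

page = urllib.urlopen("http://5pd.ru/")
soup = BeautifulSoup(page)
# Look up the <link rel="shortcut icon"> tag directly instead of regex-matching every href
icon_link = soup.find("link", rel="shortcut icon")
# Assumes the tag exists and that its href is an absolute URL
icon = urllib.urlopen(icon_link['href'])
with open("test.ico", "wb") as f:
    f.write(icon.read())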
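
The answer's snippet targets Python 2 and expects an exact rel="shortcut icon" with an absolute href. As a rough Python 3 sketch of the same idea, the code below also accepts rel="icon", resolves relative hrefs, and falls back to the conventional /favicon.ico path. It deliberately swaps Beautiful Soup for the standard library's html.parser so that it runs without third-party packages. The names IconLinkParser and find_favicon_url, the example.com URLs, and the inline HTML are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collects the href of every <link> tag whose rel contains the token "icon"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated token list; compare case-insensitively
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "icon" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_favicon_url(html, page_url):
    """Return an absolute favicon URL for a page (hypothetical helper)."""
    parser = IconLinkParser()
    parser.feed(html)
    # Fall back to the conventional /favicon.ico location when no <link> is declared
    href = parser.hrefs[0] if parser.hrefs else "/favicon.ico"
    # urljoin resolves relative hrefs against the page URL and keeps absolute ones as-is
    return urljoin(page_url, href)


# Inline HTML so the example runs without network access
html = '<html><head><link rel="Shortcut Icon" href="/img/fav.ico"></head></html>'
print(find_favicon_url(html, "http://example.com/blog/post"))
```

The returned URL could then be opened with urllib.request.urlopen and its bytes written to a file, in the same way the answer writes test.ico.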

Regarding "python - How to get a favicon using Beautiful Soup and Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4674460/
