Python urllib.urlopen IOError

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:05:58

I have the following lines of code in a function:

sock = urllib.urlopen(url)
html = sock.read()
sock.close()

They work fine when I call the function by hand. However, when I call the function in a loop (using the same URLs as before), I get the following error:

Traceback (most recent call last):
  File "./headlines.py", line 256, in <module>
    main(argv[1:])
  File "./headlines.py", line 37, in main
    write_articles(headline, output_folder + "articles_" + term +"/")
  File "./headlines.py", line 232, in write_articles
    print get_blogs(headline, 5)
  File "/Users/michaelnussbaum08/Documents/College/Sophmore_Year/Quarter_2/Innovation/Headlines/_code/get_content.py", line 41, in get_blogs
    sock = urllib.urlopen(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 203, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 314, in open_http
    if not host: raise IOError, ('http error', 'no host given')
IOError: [Errno http error] no host given

Any ideas?

Edit, with more code:

def get_blogs(term, num_results):
    search_term = term.replace(" ", "+")
    print "search_term: " + search_term
    url = 'http://blogsearch.google.com/blogsearch_feeds?hl=en&q='+search_term+'&ie=utf-8&num=10&output=rss'
    print "url: " +url

    #error occurs on line below

    sock = urllib.urlopen(url)
    html = sock.read()
    sock.close()

def write_articles(headline, output_folder, num_articles=5):

    #calls get_blogs

    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    output_file = output_folder+headline.strip("\n")+".txt"
    f = open(output_file, 'a')
    articles = get_articles(headline, num_articles)
    blogs = get_blogs(headline, num_articles)


#NEW FUNCTION
#the loop that calls write_articles
for term in trend_list:
    if do_find_max == True:
        fill_search_term(term, output_folder)
    headlines = headline_process(term, output_folder, max_headlines, do_find_max)
    for headline in headlines:
        try:
            write_articles(headline, output_folder + "articles_" + term +"/")
        except UnicodeEncodeError:
            pass

Best answer

I ran into this problem when concatenating a variable into the URL; in your case the variable is search_term:

url = 'http://blogsearch.google.com/blogsearch_feeds?hl=en&q='+search_term+'&ie=utf-8&num=10&output=rss'

The variable had a newline character at the end, so make sure you strip it:

search_term = search_term.strip()

You may also want to do

search_term = urllib2.quote(search_term)

to make sure your string is safe to use in a URL.
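
The two steps above can be combined into one small helper. This is only a sketch under a few assumptions: it is written for Python 3, where the quoting functions live in urllib.parse (in Python 2 the same function is urllib.quote_plus), and the names BASE_URL and build_blog_search_url are hypothetical, not from the original code. It uses quote_plus on the raw term instead of the manual replace(" ", "+"), because quoting a string that already contains "+" would escape the plus signs as %2B.

```python
from urllib.parse import quote_plus  # Python 3; Python 2 offers urllib.quote_plus

# Hypothetical constant: the feed address taken from the question's code.
BASE_URL = "http://blogsearch.google.com/blogsearch_feeds"


def build_blog_search_url(term):
    """Build the search URL with whitespace trimmed and the query escaped."""
    cleaned = term.strip()       # drops the trailing newline read from a file
    query = quote_plus(cleaned)  # spaces become '+', unsafe characters become %XX
    return BASE_URL + "?hl=en&q=" + query + "&ie=utf-8&num=10&output=rss"


# A headline read from a file often still carries its line ending.
headline = "python urllib error\n"
print(repr(headline))  # repr() makes the hidden '\n' visible
print(build_blog_search_url(headline))
```

Printing repr(headline) before building the URL is a quick way to confirm whether a stray newline is present, since a plain print hides it.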

Regarding the Python urllib.urlopen IOError, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/2672315/
