
python - urllib.request.urlopen does not accept a query string with spaces

Reposted. Author: 行者123. Updated: 2023-11-28 19:53:57

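I am taking a Udacity class on Python in which we are supposed to check a document for profane words. I am using the website http://www.wdylike.appspot.com/?q= (text_to_be_checked_for_profanity). The text to be checked is passed as a query string in the URL above, and the website returns true or false after checking it for profanity. My code is below.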

import urllib.request

# Read the content from a document
def read_content():
    quotes = open("movie_quotes.txt")
    content = quotes.read()
    quotes.close()
    check_profanity(content)


def check_profanity(text_to_read):
    connection = urllib.request.urlopen("http://www.wdylike.appspot.com/?q="+text_to_read)
    result = connection.read()
    print(result)
    connection.close()

read_content()

It gives me the following error:

Traceback (most recent call last):
  File "/Users/Vrushita/Desktop/Rishit/profanity_check.py", line 21, in <module>
    read_content()
  File "/Users/Vrushita/Desktop/Rishit/profanity_check.py", line 11, in read_content
    check_profanity(content)
  File "/Users/Vrushita/Desktop/Rishit/profanity_check.py", line 16, in check_profanity
    connection = urllib.request.urlopen("http://www.wdylike.appspot.com/?q="+text_to_read)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

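The document I am reading from contains the string "Hello world". However, if I change that string to "Hello+world", the same code works and returns the expected result. Can someone explain why this happens, and what the fix is?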

Best answer

urllib accepts it; the server does not. And it shouldn't, because a space is not a valid URL character.
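This also explains why "Hello+world" worked in the question: in a form-encoded query string, a plus sign stands for a space, so the server presumably decodes it back to "Hello world". A minimal sketch of that decoding, using only the standard library (how the wdylike server actually parses its input is an assumption):

```python
from urllib.parse import parse_qs, unquote_plus

# In a form-encoded query string, "+" decodes to a space
print(unquote_plus("Hello+world"))    # Hello world

# %20 is the percent-encoded form of a space and decodes the same way
print(unquote_plus("Hello%20world"))  # Hello world

# parse_qs applies the same decoding to each key/value pair
print(parse_qs("q=Hello+world"))      # {'q': ['Hello world']}
```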

Use urllib.parse.quote_plus() to escape your query string properly; it ensures that the string is valid for use in query parameters. Better still, use the urllib.parse.urlencode() function to encode all the key-value pairs:

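import urllib.request
from urllib.parse import urlencode

# Percent-encode the text as the value of the q parameter
params = urlencode({'q': text_to_read})
# Plain concatenation also works on Python 3.5, which has no f-strings
connection = urllib.request.urlopen("http://www.wdylike.appspot.com/?" + params)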
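To see what the two functions produce without making a network request, here is a small sketch. The helper name build_query_url and the BASE_URL constant are hypothetical, introduced only for illustration; the endpoint is the one from the question.

```python
from urllib.parse import quote_plus, urlencode

BASE_URL = "http://www.wdylike.appspot.com/"  # endpoint taken from the question


def build_query_url(text):
    """Hypothetical helper: return BASE_URL with text encoded as the q parameter."""
    return BASE_URL + "?" + urlencode({"q": text})


# quote_plus escapes a single value; spaces become "+"
print(quote_plus("Hello world"))         # Hello+world

# urlencode escapes the value and also builds the key=value pair
print(urlencode({"q": "Hello world"}))   # q=Hello+world

# Text read from a file often ends in a newline, which is escaped as %0A
print(build_query_url("Hello world\n"))  # http://www.wdylike.appspot.com/?q=Hello+world%0A
```

The resulting URL can be passed to urllib.request.urlopen() as in the snippet above. Encoding the whole file contents this way also covers newlines and punctuation, not just spaces.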

Regarding "python - urllib.request.urlopen does not accept a query string with spaces", the original question is on Stack Overflow: https://stackoverflow.com/questions/41211028/
