
python - Unicode decode error when retrieving Twitter data with Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:42:59

When I retrieve Twitter data for a specific Arabic keyword, like this:

# imports
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener

# setting up the keys
consumer_key = '………….'
consumer_secret = '…………….'
access_token = '…………..'
access_secret = '……...'

class TweetListener(StreamListener):
    # A listener handles tweets as they are received from the stream.
    # This is a basic listener that just prints received tweets to standard output.

    def on_data(self, data):
        print(data)
        return True

    def on_error(self, status):
        print(status)

# printing all the tweets to the standard output
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

stream = Stream(auth, TweetListener())
stream.filter(track=['سوريا'])

I get this error message:

Traceback (most recent call last):
  File "/Users/Mona/Desktop/twitter.py", line 29, in <module>
    stream.filter(track=['سوريا'])
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tweepy/streaming.py", line 303, in filter
    encoded_track = [s.encode(encoding) for s in track]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 0: ordinal not in range(128)

Please help!

Best answer

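I looked at the tweepy source and found the line in Stream that seems to cause the problem. It is in the filter method. When your code calls stream.filter(track=['سوريا']), Stream calls
s.encode('utf-8')
where s = 'سوريا' (look at the source of filter and you will see that utf-8 is the default encoding). This is the point where the code throws the exception. In Python 2, a plain '...' literal is a byte string, so encode first tries to decode it as ASCII, and that implicit decode fails on the non-ASCII bytes.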
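The implicit decode step can be reproduced outside tweepy. The following is only a sketch, written so that it also runs on Python 3; `py2_style_encode` is a hypothetical helper that imitates what Python 2 does when `.encode()` is called on a byte string, and is not part of tweepy or the standard library.

```python
# -*- coding: utf-8 -*-
# Sketch: imitate the implicit ASCII decode that Python 2 performs when
# .encode() is called on a byte string. py2_style_encode is a made-up name.

keyword = u"سوريا"
# In Python 2, a plain 'سوريا' literal in a UTF-8 source file holds these bytes.
raw_bytes = keyword.encode("utf-8")


def py2_style_encode(byte_string, encoding="utf-8"):
    """Decode as ASCII first, then encode, as Python 2's str.encode does."""
    return byte_string.decode("ascii").encode(encoding)


try:
    py2_style_encode(raw_bytes)
    error_message = None
except UnicodeDecodeError as exc:
    # The first UTF-8 byte of the keyword is not valid ASCII.
    error_message = str(exc)

print(error_message)

# Encoding a real Unicode string needs no decode step, so it succeeds.
encoded_ok = keyword.encode("utf-8")
print(encoded_ok == raw_bytes)
```

The captured message names the same byte and position as the traceback in the question, which suggests the failure comes from the string type passed in rather than from the Twitter API itself.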

To fix this, we need to use a Unicode string.

t = u"سوريا"
stream.filter(track=[t])

(For clarity, I just put your string into the variable t.)
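If the keywords come from a file or user input instead of a literal, the same idea can be applied before calling filter. This is a minimal sketch, assuming the incoming byte strings are UTF-8; `ensure_text` is a hypothetical helper name, not a tweepy function.

```python
# -*- coding: utf-8 -*-
# Sketch: normalise track keywords to Unicode strings before passing them on.
# ensure_text is a made-up helper; UTF-8 input is an assumption.


def ensure_text(term, encoding="utf-8"):
    """Return term as a Unicode string, decoding byte strings with `encoding`."""
    if isinstance(term, bytes):
        return term.decode(encoding)
    return term


# One keyword as raw UTF-8 bytes, one already as a Unicode string.
track_terms = [u"سوريا".encode("utf-8"), u"سوريا"]
normalized = [ensure_text(term) for term in track_terms]

print(normalized[0] == normalized[1])
```

The normalized list could then be passed as the track argument, as in stream.filter(track=normalized).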

Regarding python - Unicode decode error when retrieving Twitter data with Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26621993/
