
python - Extracting a week of tweets with Tweepy

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:03:11

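I want to store tweets in a CSV file. I am using tweepy and have managed to write them to a CSV, but it only pulls one day of data. I want to extract and store a week of data without having to run the extraction every day.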

This is what I did:

import numpy as np
import pandas as pd

# `api` is assumed to be an authenticated tweepy.API instance (setup not shown).
def tweets_to_data_frame(public_tweets):
    # One row per tweet, one column per attribute of interest.
    df = pd.DataFrame(data=[tweet.text for tweet in public_tweets], columns=['Tweets'])
    df['len'] = np.array([len(tweet.text) for tweet in public_tweets])
    df['date'] = np.array([tweet.created_at for tweet in public_tweets])
    df['retweets'] = np.array([tweet.retweet_count for tweet in public_tweets])
    df['lang'] = np.array([tweet.lang for tweet in public_tweets])
    return df

public_tweet = api.search('donald trump')
df = tweets_to_data_frame(public_tweet)
df.to_csv('donaldtrump.csv')
df.head(15)
Tweets len date retweets lang
0 RT @mehdirhasan: Stephen Miller’s Jewish uncle... 140 2019-04-09 11:08:23 67 en
1 RT @errollouis: "If the House ever gets his re... 140 2019-04-09 11:08:23 7927 en
2 RT @BillKristol: "This is what Kirstjen Nielse... 140 2019-04-09 11:08:22 73 en
3 RT @Newsweek: Trump claimed he wouldn't have t... 140 2019-04-09 11:08:21 7 en
4 RT @mehdirhasan: Stephen Miller’s Jewish uncle... 140 2019-04-09 11:08:20 67 en
5 The real reason Donald Trump just fired the he... 112 2019-04-09 11:08:19 0 en
6 RT @BillKristol: "This is what Kirstjen Nielse... 140 2019-04-09 11:08:19 73 en
7 RT @BobbyEberle13: Ilhan Omar is now praying f... 140 2019-04-09 11:08:18 457 en
8 The guy met the queen last time out and lots o... 140 2019-04-09 11:08:17 0 en
9 RT @PalmerReport: Donald Trump’s deconstructio... 135 2019-04-09 11:08:17 107 en
10 RT @ByronYork: Donald Trump has been paying ta... 139 2019-04-09 11:08:16 1232 en
11 RT @mehdirhasan: Stephen Miller’s Jewish uncle... 140 2019-04-09 11:08:16 67 en
12 RT @SayWhenLA: 🚨 YUGE !!\n\nPresident Donald J... 140 2019-04-09 11:08:15 1316 en
13 "As long as you're going to be thinking anyway... 100 2019-04-09 11:08:15 0 en
14 RT @TheLastRefuge2: Diana West Discusses The R... 140 2019-04-09 11:08:15 113 en

What I want is a week of data.

My idea was:

def tweets_to_data_frame1(public_tweets):
    for tweets in tweepy.Cursor(api.search,q = (public_tweets),count=100,
                                since = "2019-04-04",
                                until = "2019-04-07").items():
        df = pd.DataFrame(data=[tweets.text for tweet in tweets], columns=['Tweets'])
        df['len'] = np.array([len(tweets.text) for tweet in tweets])
        df['date'] = np.array([tweets.created_at for tweet in tweets])
        df['retweets'] = np.array([tweets.retweet_count for tweet in tweets])
        df['lang'] = np.array([tweets.lang for tweet in tweets])

    return df

df1 = tweets_to_data_frame1('donald trump')

Error:

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-24-96745c16c99c> in <module>
----> 1 df1 = tweets_to_data_frame1('donald trump')

<ipython-input-23-e5866a4adb3f> in tweets_to_data_frame1(public_tweets)
3 since = "2019-04-04",
4 until = "2019-04-07").items():
----> 5 df = pd.DataFrame(data=[tweets.text for tweet in tweets], columns=['Tweets'])
6
7 #df['id'] = np.array([tweet.id for tweet in tweets])

TypeError: 'Status' object is not iterable

Expected result:

Tweets  len date    retweets    lang
0 RT @mehdirhasan: Stephen Miller’s Jewish uncle... 140 2019-04-09 11:08:23 67 en
1 RT @errollouis: "If the House ever gets his re... 140 2019-04-09 11:08:23 7927 en
2 RT @BillKristol: "This is what Kirstjen Nielse... 140 2019-04-09 11:08:22 73 en
3 RT @Newsweek: Trump claimed he wouldn't have t... 140 2019-04-09 11:08:21 7 en
4 RT @mehdirhasan: Stephen Miller’s Jewish uncle... 140 2019-04-09 11:08:20 67 en
5 The real reason Donald Trump just fired the he... 112 2019-04-09 11:08:19 0 en
6 RT @BillKristol: "This is what Kirstjen Nielse... 140 2019-04-09 11:08:19 73 en
7 RT @BobbyEberle13: Ilhan Omar is now praying f... 140 2019-04-09 11:08:18 457 en
8 The guy met the queen last time out and lots o... 140 2019-04-09 11:08:17 0 en
9 RT @PalmerReport: Donald Trump’s deconstructio... 135 2019-04-09 11:08:17 107 en
10 RT @ByronYork: Donald Trump has been paying ta... 139 2019-04-09 11:08:16 1232 en
11 RT @mehdirhasan: Stephen Miller’s Jewish uncle... 140 2019-04-09 11:08:16 67 en
12 RT @SayWhenLA: 🚨 YUGE !!\n\nPresident Donald J... 140 2019-04-09 11:08:15 1316 en
13 "As long as you're going to be thinking anyway... 100 2019-04-09 11:08:15 0 en
14 RT @TheLastRefuge2: Diana West Discusses The R... 140 2019-04-09 11:08:15 113 en

but covering a whole week.

Best answer

So I guess the problem is here:

for tweets in tweepy.Cursor(api.search,q = (public_tweets),count=100,since = "2019-04-04",until = "2019-04-07").items():

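tweepy.Cursor(...).items() gives you an iterable of individual tweets (strictly speaking an iterator rather than a list). So on each pass through the for loop, the tweets variable holds a single tweet, a Status object. Inside the loop you then run a list comprehension over tweets, which means you are trying to iterate over that one tweet. That is exactly what the error message is telling you.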
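
The failure can be reproduced without any Twitter access. In the minimal sketch below, FakeStatus is a hypothetical stand-in for Tweepy's Status object (it is not part of Tweepy); like a Status, it has a text attribute and does not support iteration.

```python
class FakeStatus:
    """Hypothetical stand-in for a single tweet (not a Tweepy class)."""

    def __init__(self, text):
        self.text = text


def texts_from(tweets):
    # Same shape as the comprehension in the question: it iterates over `tweets`.
    return [tweet.text for tweet in tweets]


single_tweet = FakeStatus("hello")

# Passing one tweet fails, because a single object is not iterable.
try:
    texts_from(single_tweet)
except TypeError as exc:
    error_message = str(exc)

# Passing a collection of tweets works.
texts = texts_from([FakeStatus("hello"), FakeStatus("world")])
```

The first call raises the same kind of TypeError as in the traceback; the second returns one text per tweet.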

You could do this instead:

tweets = tweepy.Cursor(...).items()
df = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])
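
One caveat when extending this to the other columns: items() appears to hand back an iterator that can be walked only once, so a second comprehension over the same tweets variable would find nothing left. The sketch below illustrates the effect with a plain Python generator standing in for the cursor (an assumption, made so that it runs without Twitter access) and shows that wrapping the result in list() first avoids the problem.

```python
def fake_items():
    # Stand-in for tweepy.Cursor(...).items(): yields values lazily, one pass only.
    yield from ["first tweet", "second tweet", "third tweet"]


lazy = fake_items()
first_pass = [text for text in lazy]
second_pass = [text for text in lazy]  # the generator is already exhausted

# Materializing the items once lets every column be built from the same data.
materialized = list(fake_items())
lengths = [len(text) for text in materialized]
texts = [text for text in materialized]
```

The trade-off is memory: list() holds every tweet at once, which is usually acceptable for a week of search results but worth keeping in mind for larger pulls.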

By the way, I would also rename the public_tweets parameter of def tweets_to_data_frame1(public_tweets):

In this case the public_tweets parameter is just a search query string, so the name is misleading.
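
Putting these suggestions together, a fuller version might look like the sketch below. It is a sketch under assumptions, not a tested client. The names week_window, tweets_to_rows, and search_week are hypothetical. search_week expects the search method of an already authenticated api object, as in the question (api.search there; newer Tweepy releases call it api.search_tweets). The since and until keyword arguments are copied from the question and may need adjusting for the Tweepy version in use. Also note that the standard search endpoint only reaches back about a week, so the window has to fall within that range.

```python
from datetime import date, timedelta


def week_window(end_day):
    """Return (since, until) as YYYY-MM-DD strings for the 7 days before end_day."""
    start_day = end_day - timedelta(days=7)
    return start_day.isoformat(), end_day.isoformat()


def tweets_to_rows(tweets):
    """Turn any iterable of tweet-like objects into a list of dicts, in one pass."""
    return [
        {
            "Tweets": tweet.text,
            "len": len(tweet.text),
            "date": tweet.created_at,
            "retweets": tweet.retweet_count,
            "lang": tweet.lang,
        }
        for tweet in tweets
    ]


def search_week(search_method, query, end_day, csv_path):
    """Search for `query` over the week before end_day and save the result as CSV.

    search_method is assumed to be api.search from the question
    (api.search_tweets in newer Tweepy versions).
    """
    # Imported here so the helpers above can be used without these packages.
    import pandas as pd
    import tweepy

    since, until = week_window(end_day)
    cursor = tweepy.Cursor(search_method, q=query, count=100,
                           since=since, until=until)
    # The cursor is consumed exactly once, while building the rows.
    df = pd.DataFrame(tweets_to_rows(cursor.items()))
    df.to_csv(csv_path, index=False)
    return df
```

A call would then look like search_week(api.search, 'donald trump', date.today(), 'donaldtrump.csv'). Building each row in a single pass over the tweets sidesteps both the "not iterable" error and the question of reusing the cursor for several columns.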

For more on python - extracting a week of tweets with Tweepy, see the similar question on Stack Overflow: https://stackoverflow.com/questions/55591831/
