python - Saving as a DataFrame after looping in Tweepy: works without the loop, saves as a list once the loop is added

Reposted by: 行者123 · Updated: 2023-12-01 08:27:03

Problem: pulling the timelines of multiple Twitter users and saving them as a DataFrame.

Here is a solution that works perfectly for one user at a time:

import tweepy
import pandas as pd
import numpy as np

ACCESS_TOKEN = ""
ACCESS_TOKEN_SECRET = ""
CONSUMER_KEY = ""
CONSUMER_SECRET = ""

# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

# Creation of the actual interface, using authentication
api = tweepy.API(auth, wait_on_rate_limit=True)


# Running on a single handle returns a DataFrame
tweets = api.user_timeline(screen_name='pycon', count=10)
print("Number of tweets extracted: {}.\n".format(len(tweets)))
data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns= ['Tweets'])
data['len']  = np.array([len(tweet.text) for tweet in tweets])
data['ID']   = np.array([tweet.id for tweet in tweets])
data['Date'] = np.array([tweet.created_at for tweet in tweets])
data['Source'] = np.array([tweet.source for tweet in tweets])
data['Likes']  = np.array([tweet.favorite_count for tweet in tweets])
data['RTs']    = np.array([tweet.retweet_count for tweet in tweets])

print(data)

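The code above works well and returns the 10 most recent tweets from the user pycon in a DataFrame. The next step is to add multiple handles to query. Here is the code that does the same thing with multiple handles: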

#Added list of handles
handles = ['pycon', 'gvanrossum']
#Added Empty DF to fill
test = []
#Added loop
for handle in handles:
    tweets = api.user_timeline(screen_name=handle, count=10)
    print("Number of tweets extracted: {}.\n".format(len(tweets)))
    data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])
    data['len']  = np.array([len(tweet.text) for tweet in tweets])
    data['ID']   = np.array([tweet.id for tweet in tweets])
    data['Date'] = np.array([tweet.created_at for tweet in tweets])
    data['Source'] = np.array([tweet.source for tweet in tweets])
    data['Likes']  = np.array([tweet.favorite_count for tweet in tweets])
    data['RTs']    = np.array([tweet.retweet_count for tweet in tweets])
    test.append(data)

print(test)

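Running this produces two outputs. data is a DataFrame containing the 10 most recent tweets from gvanrossum (which makes sense, since it is the second handle in the list). The second output is test, which is a list. Interestingly, test contains all 20 tweets from pycon and gvanrossum, but in list form. The loop is working, but the result is not saved as a DataFrame.

Question: how can the results of looping over multiple handles be saved as a DataFrame?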
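The behaviour described here can be reproduced without Twitter credentials. The sketch below is only an illustration: it replaces the `api.user_timeline` call with made-up tweet text, so the values are placeholders and not real API output. It shows that `data` is rebound on each pass and that `test` ends up as a plain list holding one DataFrame per handle.

```python
import pandas as pd

handles = ["pycon", "gvanrossum"]
test = []

for handle in handles:
    # Stand-in for the frame built from api.user_timeline; the text is made up.
    data = pd.DataFrame({"Tweets": [f"{handle} tweet {i}" for i in range(10)]})
    test.append(data)

# `data` was rebound on every pass, so it holds only the last handle's rows.
print(data["Tweets"].iloc[0])          # gvanrossum tweet 0

# `test` is a list with one DataFrame per handle.
print(type(test).__name__, len(test))  # list 2

# Across both frames there are 20 rows in total.
print(sum(len(frame) for frame in test))  # 20
```

Appending to a list never merges the frames; it only stores references to them, which is why printing test shows two separate tables.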

Best answer

If you want to store the data in a single DataFrame:

merged=pd.DataFrame()
#Added loop
for handle in handles:
    tweets = api.user_timeline(screen_name=handle, count=10)
    print("Number of tweets extracted: {}.\n".format(len(tweets)))
    data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])
    data['len']  = np.array([len(tweet.text) for tweet in tweets])
    data['ID']   = np.array([tweet.id for tweet in tweets])
    data['Date'] = np.array([tweet.created_at for tweet in tweets])
    data['Source'] = np.array([tweet.source for tweet in tweets])
    data['Likes']  = np.array([tweet.favorite_count for tweet in tweets])
    data['RTs']    = np.array([tweet.retweet_count for tweet in tweets])
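    # Create a new column 'Handle' to identify the source of each tweet. Comment this out if you do not need it.
    # (Assigning via data.loc['Handle', :] would add a row labelled 'Handle', not a column.)
    data['Handle'] = handle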
    #merging the data frames
    merged=pd.concat([merged,data])
print(merged)
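A common variation is to collect the per-handle frames in a list and call `pd.concat` once after the loop, which avoids re-copying the growing frame on every pass. The following is a minimal sketch under stated assumptions, not the answer's own code: `fake_timeline` and `timeline_to_frame` are hypothetical helpers, and the tweet objects are stand-ins built with `SimpleNamespace` so the example runs without API credentials. With a real client, the list returned by `api.user_timeline(screen_name=handle, count=10)` would take the place of `fake_timeline(handle)`.

```python
from types import SimpleNamespace

import pandas as pd


def fake_timeline(handle, count=3):
    """Hypothetical stand-in for api.user_timeline; returns made-up tweets."""
    return [
        SimpleNamespace(text=f"{handle} tweet {i}", id=i,
                        favorite_count=i, retweet_count=0)
        for i in range(count)
    ]


def timeline_to_frame(handle, tweets):
    """Build one frame for a handle and tag every row with that handle."""
    frame = pd.DataFrame({
        "Tweets": [t.text for t in tweets],
        "len": [len(t.text) for t in tweets],
        "ID": [t.id for t in tweets],
        "Likes": [t.favorite_count for t in tweets],
        "RTs": [t.retweet_count for t in tweets],
    })
    frame["Handle"] = handle  # a scalar is broadcast to every row
    return frame


handles = ["pycon", "gvanrossum"]
frames = [timeline_to_frame(h, fake_timeline(h)) for h in handles]

# Concatenate once; ignore_index=True renumbers the rows 0..n-1.
merged = pd.concat(frames, ignore_index=True)
print(merged)
```

Passing `ignore_index=True` gives the merged frame a continuous index; without it, each handle's rows keep their own 0-based labels and the index contains duplicates.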

Regarding "python - Saving as a DataFrame after looping in Tweepy: works without the loop, saves as a list once the loop is added", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54171195/
