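I have connected R to Twitter and scraped tweets with the searchTwitter function, then cleaned the resulting data (removed punctuation, converted to lower case, and so on). Now I am trying to plot the number of tweets containing a given word per month (x-axis: month; y-axis: number of tweets). I would like to do this for retweets, mentions, replies and favorites.
This is what I have tried so far: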
# Load the packages into R
library(twitteR)
library(plyr)
library(ggplot2)
# Register an application (API) at https://apps.twitter.com/
# Look up the API key and create a token; you need both the key and the secret
# Assign the keys to variables and use the authorization
api_key <- "your API key from twitter"
api_secret <- "your Secret key from twitter"
access_token <- "your Access Token from twitter"
access_token_secret <- "your Access Token Secret key from twitter"
setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)
[1] "Using direct authentication"
Use a local file to cache OAuth access credentials between R sessions?
1: Yes
2: No
# Type 1 and press Enter
Selection: 1
auctiontweets <- searchTwitter("auction", since = "2015-01-01", until = "2015-08-03", n=1000)
However, I run into a problem when creating the data frame and get the following error:
tweet.dataframe <- data.frame(searchTwitter("action", since = "2015-01-01", until = "2015-08-03", n=3000))
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class "structure("status", package = "twitteR")" to a data.frame
I found code that plots users by hour, but I could not adapt it to count tweets containing a specific word (i.e. "auction") per month:
yultweets <- searchTwitter("#accessyul", n=1500)
y <- twListToDF(yultweets)
y$created <- as.POSIXct(format(y$created, tz="America/Montreal"))
yply <- ddply(y, .var = "screenName", .fun = function(x) {return(subset(x,
created %in% min(created), select = c(screenName,created)))})
yplytime <- arrange(yply,-desc(created))
y$screenName=factor(y$screenName, levels = yplytime$screenName)
ggplot(y) + geom_point(aes(x=created,y=screenName)) + ylab("Twitter username") + xlab("Time")
The source can be found here.
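For context, the error message shows that searchTwitter returns a list of "status" objects, which data.frame() cannot coerce; twListToDF, already used in the snippet above, converts such a list into a data frame. The following is a minimal sketch of monthly counting on a data frame of that shape. It assumes the text, created and isRetweet columns that twListToDF produces, and the rows are hypothetical stand-ins for a live API call.

```r
# Mock stand-in for twListToDF(auctiontweets); all rows are hypothetical.
# With live data this would be: tw <- twListToDF(auctiontweets)
tw <- data.frame(
  text = c("Auction tonight", "RT big auction", "nothing here", "auction again"),
  created = as.POSIXct(c("2015-05-03 10:00:00", "2015-05-20 12:30:00",
                         "2015-06-01 08:15:00", "2015-07-09 21:45:00"),
                       tz = "UTC"),
  isRetweet = c(FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only tweets that mention the term, ignoring case
hits <- tw[grepl("auction", tw$text, ignore.case = TRUE), ]

# Year-month key derived from the creation timestamp
hits$month <- format(hits$created, "%Y-%m")

# Tweets per month, and the same counts split by retweet status
monthly <- table(hits$month)
by_type <- table(hits$month, hits$isRetweet)
```

The same split could presumably be repeated for replies or favorites by swapping isRetweet for another column of the data frame.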
Best answer
Since you did not provide even a small sample of data to work with, my answer may be somewhat superficial.
library(stringi); library(dplyr); library(SciencesPo)
df <- data.frame(tweets = c("blah, blah, Blah, auction", "blah, auction", "blah, blah",
                            "this auction, blah", "today"),
                 date = c('2015-07-01', '2015-06-01', '2015-05-01', '2015-07-31', '2015-05-01'))
> df
                     tweets       date
1 blah, blah, Blah, auction 2015-07-01
2             blah, auction 2015-06-01
3                blah, blah 2015-05-01
4        this auction, blah 2015-07-31
5                     today 2015-05-01
filter = "auction"
df$n <- vapply(df$tweets, function(x) sum(stri_count_fixed(x, filter)), 1L)
> df
                     tweets       date n
1 blah, blah, Blah, auction 2015-07-01 1
2             blah, auction 2015-06-01 1
3                blah, blah 2015-05-01 0
4        this auction, blah 2015-07-31 1
5                     today 2015-05-01 0
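The column n above counts occurrences of the term inside each tweet. If the goal is the number of tweets that mention the word at all, regardless of case, a logical flag may be enough. The following is a base R sketch under that assumption; the names term, has_term and n_base are illustrative and not part of the original answer.

```r
# Sample data copied from the answer; stringsAsFactors = FALSE keeps text as character
df <- data.frame(
  tweets = c("blah, blah, Blah, auction", "blah, auction", "blah, blah",
             "this auction, blah", "today"),
  date = c("2015-07-01", "2015-06-01", "2015-05-01", "2015-07-31", "2015-05-01"),
  stringsAsFactors = FALSE
)

term <- "auction"

# TRUE when the tweet mentions the term at least once, ignoring case
df$has_term <- grepl(term, df$tweets, ignore.case = TRUE)

# Occurrences per tweet in base R; gregexpr returns -1 when there is no match,
# so only positive match positions are counted
matches <- gregexpr(term, df$tweets, fixed = TRUE)
df$n_base <- vapply(matches, function(pos) sum(pos > 0), integer(1))
```

Summing has_term per month gives the number of matching tweets, while summing n_base gives the number of occurrences; the two differ only when a tweet repeats the word.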
After that, all that remains is to summarize:
df %>%
  group_by(month = format(as.Date(date), format = "%m")) %>%
  summarize(freq = sum(n)) %>%
  ungroup() -> df2
> df2
Source: local data frame [3 x 2]
  month freq
1    05    0
2    06    1
3    07    2
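Voilà! As a bonus, plot it with ggplot(df2, aes(x=month, y=freq, group=1)) + geom_line() + theme_pub(). The group=1 aesthetic is needed because month is a character column, so geom_line would otherwise treat each month as a separate group and draw no line.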
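One caveat: a "%m" key merges the same month from different years. That is harmless for the 2015 date range in the question, but a year-month key is safer for longer periods. The following sketch does the same summary in base R with such a key; the column n is entered by hand to match the counts shown in the answer.

```r
# Sample data from the answer, with the per-tweet count column n filled in
df <- data.frame(
  tweets = c("blah, blah, Blah, auction", "blah, auction", "blah, blah",
             "this auction, blah", "today"),
  date = c("2015-07-01", "2015-06-01", "2015-05-01", "2015-07-31", "2015-05-01"),
  n = c(1L, 1L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Year-month key keeps the same month in different years apart
df$month <- format(as.Date(df$date), "%Y-%m")

# Sum the per-tweet counts within each month
monthly <- aggregate(n ~ month, data = df, FUN = sum)
print(monthly)
```

The resulting monthly data frame can be passed to ggplot in the same way as df2, with n in place of freq.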
Regarding r - plotting tweet count/frequency of a word by month, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/31796744/