
r - Sorting and ranking a data frame by date and time in R

Reposted. Author: 行者123. Updated: 2023-12-04 12:27:08

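I have a data frame like the one below. Originally it had only two columns/variables: "Timestamp" (containing both date and time) and "Actor". I split the "Timestamp" variable into "date" and "time", and then split "time" further into "hours" and "mins". That gives the following structure: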

dataf<-structure(list(hours = structure(c(3L, 4L, 4L, 3L, 3L, 3L, 6L, 
6L, 6L, 6L, 6L, 2L, 2L, 2L, 2L, 5L, 5L, 5L, 1L, 1L, 2L, 2L), .Label = c("9",
"12", "14", "15", "16", "17"), class = "factor"), mins = structure(c(17L,
1L, 2L, 14L, 15L, 16L, 3L, 4L, 6L, 6L, 7L, 9L, 9L, 13L, 13L,
10L, 11L, 12L, 2L, 5L, 8L, 8L), .Label = c("00", "04", "08",
"09", "10", "12", "13", "18", "19", "20", "21", "22", "27", "39",
"51", "52", "59"), class = "factor"), date = structure(c(3L,
3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 4L, 4L,
4L, 1L, 1L, 1L, 1L), .Label = c("4/28/2014", "5/18/2014", "5/2/2014",
"5/6/2014"), class = "factor"), time = structure(c(7L, 8L, 9L,
4L, 5L, 6L, 13L, 14L, 15L, 15L, 16L, 2L, 2L, 3L, 3L, 10L, 11L,
12L, 17L, 18L, 1L, 1L), .Label = c("12:18", "12:19", "12:27",
"14:39", "14:51", "14:52", "14:59", "15:00", "15:04", "16:20",
"16:21", "16:22", "17:08", "17:09", "17:12", "17:13", "9:04",
"9:10"), class = "factor"), Timestamp = structure(c(13L, 14L,
15L, 10L, 11L, 12L, 6L, 7L, 8L, 8L, 9L, 2L, 2L, 3L, 3L, 16L,
17L, 18L, 4L, 5L, 1L, 1L), .Label = c("4/28/2014 12:18", "4/28/2014 12:19",
"4/28/2014 12:27", "4/28/2014 9:04", "4/28/2014 9:10", "5/18/2014 17:08",
"5/18/2014 17:09", "5/18/2014 17:12", "5/18/2014 17:13", "5/2/2014 14:39",
"5/2/2014 14:51", "5/2/2014 14:52", "5/2/2014 14:59", "5/2/2014 15:00",
"5/2/2014 15:04", "5/6/2014 16:20", "5/6/2014 16:21", "5/6/2014 16:22"
), class = "factor"), Actor = c(7L, 7L, 7L, 7L, 7L, 7L, 5L, 5L,
2L, 12L, 2L, 7L, 7L, 7L, 7L, 10L, 10L, 10L, 7L, 10L, 7L, 7L)), .Names = c("hours",
"mins", "date", "time", "Timestamp", "Actor"), row.names = c(NA,
-22L), class = "data.frame")

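The reason for splitting the timestamp and time variables into separate variables is that, in my real data, I ran into a lot of problems when sorting by date and/or time. Breaking these variables into smaller pieces made sorting much easier.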
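One likely source of this kind of sorting trouble, assuming the timestamps were read in as text or factors, is that such values are compared as text rather than chronologically. The sketch below uses a made-up vector `stamps` in the same m/d/Y H:M layout; the `tz = "UTC"` setting is also an assumption chosen only to keep the example reproducible.

```r
# Hypothetical sample values, not taken from the real data
stamps <- c("5/18/2014 17:08", "4/28/2014 9:04",
            "5/2/2014 14:39", "4/28/2014 12:18")

# Sorting the raw strings compares them as text,
# so the result is generally not chronological
sort(stamps)

# Parsing into POSIXct first lets order() work on real points in time
parsed <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M", tz = "UTC")
chronological <- stamps[order(parsed)]
chronological
```

With the parsed values, `order()` sorts by actual time, so there is no need to split the timestamp into pieces just for sorting.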

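What I want to do now is create a new variable called "Rank" that returns a "1" for the earliest event in the data frame (the observation at 9:04 am on April 28, 2014), then a "2" for the next observation in date/time order, and so on.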

Sorting the data frame seems relatively straightforward:
dataf<-dataf[order(as.Date(dataf$date, format="%m/%d/%Y"), dataf$hours, dataf$mins),]

This does the job. What I am struggling with now is assigning the ranks.

I tried the following, because I have used 'ave' with FUN=rank before to rank integers, but the result it produces is laughably wrong:
dataf$rank <- ave((dataf[order(as.Date(dataf$date, format="%m/%d/%Y"), dataf$hours, dataf$mins),]),FUN=rank )
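A note on the call above: `ave()` expects a vector as its first argument, whereas here it receives a whole data frame. One way to stay with the split columns, sketched below on a small made-up data frame `df` with the same column layout, is to combine them into a single numeric key and rank that key. The names `df` and `key` are illustrative only.

```r
# Hypothetical data frame mirroring the date/hours/mins factor columns of dataf
df <- data.frame(
  date  = factor(c("5/2/2014", "4/28/2014", "4/28/2014", "4/28/2014")),
  hours = factor(c("14", "9", "12", "12")),
  mins  = factor(c("39", "04", "18", "18"))
)

# Go through as.character() first: as.numeric() on a factor
# returns the internal level codes, not the printed values
day    <- as.numeric(as.Date(as.character(df$date), format = "%m/%d/%Y"))
hour   <- as.numeric(as.character(df$hours))
minute <- as.numeric(as.character(df$mins))

# One number per row: minutes elapsed since 1970-01-01
key <- day * 24 * 60 + hour * 60 + minute

# Tied rows share the lowest rank
df$rank <- rank(key, ties.method = "min")
df
```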

Any help is appreciated.

Best answer

I disagree with your aversion to datetime objects, which make all of this simpler:

dataf$ts <- strptime(as.character(dataf$Timestamp),'%m/%d/%Y %H:%M')
dataf <- dataf[order(dataf$ts),]
dataf$ts_rank <- rank(dataf$ts,ties.method = "min")
dataf
## hours mins date time Timestamp Actor ts ts_rank
## 19 9 04 4/28/2014 9:04 4/28/2014 9:04 7 2014-04-28 09:04:00 1
## 20 9 10 4/28/2014 9:10 4/28/2014 9:10 10 2014-04-28 09:10:00 2
## 21 12 18 4/28/2014 12:18 4/28/2014 12:18 7 2014-04-28 12:18:00 3
## 22 12 18 4/28/2014 12:18 4/28/2014 12:18 7 2014-04-28 12:18:00 3
## 12 12 19 4/28/2014 12:19 4/28/2014 12:19 7 2014-04-28 12:19:00 5
## 13 12 19 4/28/2014 12:19 4/28/2014 12:19 7 2014-04-28 12:19:00 5
## 14 12 27 4/28/2014 12:27 4/28/2014 12:27 7 2014-04-28 12:27:00 7
## 15 12 27 4/28/2014 12:27 4/28/2014 12:27 7 2014-04-28 12:27:00 7
## 4 14 39 5/2/2014 14:39 5/2/2014 14:39 7 2014-05-02 14:39:00 9
## 5 14 51 5/2/2014 14:51 5/2/2014 14:51 7 2014-05-02 14:51:00 10
## 6 14 52 5/2/2014 14:52 5/2/2014 14:52 7 2014-05-02 14:52:00 11
## 1 14 59 5/2/2014 14:59 5/2/2014 14:59 7 2014-05-02 14:59:00 12
## 2 15 00 5/2/2014 15:00 5/2/2014 15:00 7 2014-05-02 15:00:00 13
## 3 15 04 5/2/2014 15:04 5/2/2014 15:04 7 2014-05-02 15:04:00 14
## 16 16 20 5/6/2014 16:20 5/6/2014 16:20 10 2014-05-06 16:20:00 15
## 17 16 21 5/6/2014 16:21 5/6/2014 16:21 10 2014-05-06 16:21:00 16
## 18 16 22 5/6/2014 16:22 5/6/2014 16:22 10 2014-05-06 16:22:00 17
## 7 17 08 5/18/2014 17:08 5/18/2014 17:08 5 2014-05-18 17:08:00 18
## 8 17 09 5/18/2014 17:09 5/18/2014 17:09 5 2014-05-18 17:09:00 19
## 9 17 12 5/18/2014 17:12 5/18/2014 17:12 2 2014-05-18 17:12:00 20
## 10 17 12 5/18/2014 17:12 5/18/2014 17:12 12 2014-05-18 17:12:00 20
## 11 17 13 5/18/2014 17:13 5/18/2014 17:13 2 2014-05-18 17:13:00 22
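Note that `ties.method = "min"` gives tied timestamps the same rank and then skips ahead (3, 3, 5 in the output above). If consecutive ranks without gaps are wanted instead, one possible variant is sketched below on made-up values. It stores the timestamp as `POSIXct`, which is generally easier to keep in a data frame column than the `POSIXlt` returned by `strptime`, and derives dense ranks with `match()`. The names `df`, `secs` and `dense_rank`, and the `tz = "UTC"` setting, are assumptions for illustration.

```r
# Hypothetical sample rows in the same Timestamp layout as the question
df <- data.frame(
  Timestamp = c("4/28/2014 12:18", "4/28/2014 9:04",
                "5/2/2014 14:39", "4/28/2014 12:18"),
  stringsAsFactors = FALSE
)

# Parse once into POSIXct and sort chronologically
df$ts <- as.POSIXct(df$Timestamp, format = "%m/%d/%Y %H:%M", tz = "UTC")
df <- df[order(df$ts), ]

# "min" ranking: ties share the lowest rank, the following rank is skipped
df$ts_rank <- rank(df$ts, ties.method = "min")

# Dense ranking: position of each timestamp among the sorted unique values
secs <- as.numeric(df$ts)
df$dense_rank <- match(secs, sort(unique(secs)))
df
```

Which of the two is appropriate depends on whether simultaneous events should count as one step or as several in the ranking.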

Regarding "r - Sorting and ranking a data frame by date and time in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23792890/
