df <- data.frame(
  old = c(1, 1, 1, 5, 7, 7, 7, 11, 13, 13, 16, 18, 20, 20, 20, 20, 25, 25, 25, 29),
  new = c(1, 1, 1, 2, 3, 3, 3, 4, 5, 5, 6, 7, 8, 8, 8, 8, 9, 9, 9, 10)
)
old new
1 1 1
2 1 1
3 1 1
4 5 2
5 7 3
6 7 3
7 7 3
8 11 4
9 13 5
10 13 5
11 16 6
12 18 7
13 20 8
14 20 8
15 20 8
16 20 8
17 25 9
18 25 9
19 25 9
20 29 10
How do I turn old into new easily? Basically, new is the position of each distinct value in old, repeated as many times as that value occurs. The values are always increasing.
Thanks in advance. I don't even know how to Google something this simple.
You could use dplyr::consecutive_id or data.table::rleid to get an identifier for each run of identical values:
df <- data.frame(
old = c(1, 1, 1, 5, 7, 7, 7, 11, 13, 13, 16, 18, 20, 20, 20, 20, 25, 25, 25, 29)
)
library(dplyr, warn.conflicts = FALSE)
library(data.table)
df |>
  mutate(new = consecutive_id(old))
#> old new
#> 1 1 1
#> 2 1 1
#> 3 1 1
#> 4 5 2
#> 5 7 3
#> 6 7 3
#> 7 7 3
#> 8 11 4
#> 9 13 5
#> 10 13 5
#> 11 16 6
#> 12 18 7
#> 13 20 8
#> 14 20 8
#> 15 20 8
#> 16 20 8
#> 17 25 9
#> 18 25 9
#> 19 25 9
#> 20 29 10
df |>
  mutate(new = rleid(old))
#> old new
#> 1 1 1
#> 2 1 1
#> 3 1 1
#> 4 5 2
#> 5 7 3
#> 6 7 3
#> 7 7 3
#> 8 11 4
#> 9 13 5
#> 10 13 5
#> 11 16 6
#> 12 18 7
#> 13 20 8
#> 14 20 8
#> 15 20 8
#> 16 20 8
#> 17 25 9
#> 18 25 9
#> 19 25 9
#> 20 29 10
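If you would rather not load a package, the same run identifier can probably be built from base R's rle(). This is only a minimal sketch of that idea, not part of the answer above; the variable names runs and new are chosen here for illustration.

```r
# Base R sketch: number each run of identical values with rle().
old <- c(1, 1, 1, 5, 7, 7, 7, 11, 13, 13, 16, 18, 20, 20, 20, 20, 25, 25, 25, 29)

runs <- rle(old)                      # run lengths and the value of each run
new <- rep(seq_along(runs$lengths),   # run index 1, 2, 3, ...
           times = runs$lengths)      # repeated once per element of that run
new
```

Like consecutive_id and rleid, this numbers runs rather than distinct values, so a value that reappears after a gap gets a new identifier.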
You're essentially just checking where the old column changes. You can do this in a variety of ways, but the easiest is to check whether each value in df$old differs from the previous one, then take the cumulative sum of that:
df$new <- cumsum(df$old != dplyr::lag(df$old, default = 0))
or
df$new <- cumsum(df$old != c(0, df$old[-nrow(df)]))
or
library(dplyr)
df |> mutate(new = cumsum(old != lag(old, default = 0)))
or
df$new <- cumsum(diff(c(0, df$old)) > 0)
or
df |> mutate(new = cumsum(diff(c(0, old)) > 0))
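One thing to watch with the versions above: they all compare the first value against a placeholder of 0, so if old could itself start at 0 the numbering would begin at 0 instead of 1. The sketch below shows that case and one way around it, which is to always flag the first element as the start of a run. The data and the names with_sentinel, change and no_sentinel are made up for illustration.

```r
# Hypothetical data whose first value equals the placeholder 0.
old <- c(0, 0, 3, 3, 3, 8)

# Placeholder approach: the first comparison is 0 != 0, so the count starts at 0.
with_sentinel <- cumsum(old != c(0, old[-length(old)]))
with_sentinel
#> [1] 0 0 1 1 1 2

# Placeholder-free variant: the first element always starts a new run,
# every later element starts one only if it differs from its predecessor.
change <- c(TRUE, old[-1] != old[-length(old)])
no_sentinel <- cumsum(change)
no_sentinel
#> [1] 1 1 2 2 2 3
```

For the data in the question, which starts at 1, both forms should give the same result.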
Thank you! I should have known it was in dplyr but had no idea how to phrase a search for it.
Thanks. Yeah, these all work. I had a clunky ifelse method along the same lines, but I figured there must be a specific function that does this, and of course there is one in dplyr and also in data.table. See the answer from @stefan.