
r - Extracting a duration from character strings in R

Reposted. Author: 行者123. Updated: 2023-12-04 01:28:39

I am currently stuck on a problem with a dataset I need to analyse. Here is a sample of the data:

        session_id  individ_id    colony         species year_tracked
1 12141_2009-07-01 GBT_FP96194 Eynhallow Northern fulmar      2009_10
2 12141_2010-07-18 GBT_FP96235 Eynhallow Northern fulmar      2010_11
3 12143_2009-07-01 GBT_FC14766 Eynhallow Northern fulmar      2009_10
4 12143_2010-07-18 GBT_FR77883 Eynhallow Northern fulmar      2010_12
5 12144_2009-07-01 GBT_FP05030 Eynhallow Northern fulmar      2009_10
6 12145_2009-07-01 GBT_FA82356 Eynhallow Northern fulmar      2009_10

I need to create a new column containing the number of years tracked, which in this case would be:

2010-2009 --> 1
2011-2010 --> 1
2010-2009 --> 1
2012-2010 --> 2
2010-2009 --> 1
2010-2009 --> 1

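The year_tracked column is of class character. Perhaps a function that takes the first four characters and the last two characters of each cell and converts them to dates would work, but I don't know how to do that.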

Best answer

An option with separate:

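library(dplyr)
library(tidyr)
library(stringr)

df1 %>%
  # Expand the two-digit end year to four digits: "2009_10" -> "2009_2010"
  mutate(year_tracked2 = str_replace(year_tracked, "_", "_20")) %>%
  # Split on "_" into two columns; convert = TRUE turns them into integers
  separate(year_tracked2, into = c('year1', 'year2'), convert = TRUE) %>%
  # Number of years tracked
  mutate(n = year2 - year1) %>%
  # Drop the helper columns
  select(-year1, -year2)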
#        session_id  individ_id    colony         species year_tracked n
#1 12141_2009-07-01 GBT_FP96194 Eynhallow Northern fulmar      2009_10 1
#2 12141_2010-07-18 GBT_FP96235 Eynhallow Northern fulmar      2010_11 1
#3 12143_2009-07-01 GBT_FC14766 Eynhallow Northern fulmar      2009_10 1
#4 12143_2010-07-18 GBT_FR77883 Eynhallow Northern fulmar      2010_12 2
#5 12144_2009-07-01 GBT_FP05030 Eynhallow Northern fulmar      2009_10 1
#6 12145_2009-07-01 GBT_FA82356 Eynhallow Northern fulmar      2009_10 1
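
To make the intermediate steps of that pipeline easier to follow, here is a minimal base R sketch run on two of the sample values. It is not the answer's code: sub and strsplit stand in for str_replace and separate, and the names expanded, parts, years and n are only illustrative.

```r
# Two sample values copied from the question's year_tracked column
year_tracked <- c("2009_10", "2010_12")

# Step 1: expand the two-digit end year, "2009_10" -> "2009_2010"
expanded <- sub("_", "_20", year_tracked, fixed = TRUE)

# Step 2: split each string on "_" and convert both pieces to integers
parts <- strsplit(expanded, "_", fixed = TRUE)
years <- do.call(rbind, lapply(parts, as.integer))
colnames(years) <- c("year1", "year2")

# Step 3: the duration is the difference between the two years
n <- years[, "year2"] - years[, "year1"]
n
```

Like the answer above, this hard-codes the "20" prefix, so it assumes every end year falls in the 2000s.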

Or, a simpler option is to replace _ with :20 and just evaluate the resulting expression:

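library(purrr)

# Uses dplyr and stringr, loaded above
df1 %>%
  # "2009_10" -> "2009:2010"; evaluating that string yields the sequence
  # 2009, 2010, so its length minus 1 is the number of years tracked
  mutate(n = lengths(map(str_replace(year_tracked, "_", ":20"),
                         ~ eval(parse(text = .x)))) - 1)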
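
The idea raised in the question, taking the first four and the last two characters of each value, can also be sketched in base R without any packages. This is an illustration rather than part of the original answer: the variable names are hypothetical, and it assumes the end year lies in the same century as the start year.

```r
# Sample values copied from the question's year_tracked column
year_tracked <- c("2009_10", "2010_11", "2009_10", "2010_12", "2009_10", "2009_10")

# First four characters: the four-digit start year
start_year <- as.integer(substr(year_tracked, 1, 4))

# Last two characters: the two-digit end year
len <- nchar(year_tracked)
end_suffix <- as.integer(substr(year_tracked, len - 1, len))

# Assumption: the end year is in the same century as the start year
end_year <- (start_year %/% 100) * 100 + end_suffix

# Number of years tracked
n_years <- end_year - start_year
n_years
```

With a data frame such as df1, the same expressions could be applied to df1$year_tracked and the result assigned to a new column. Because the years are treated as plain integers, no date conversion is needed.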

Data

df1 <- structure(list(session_id = c("12141_2009-07-01", "12141_2010-07-18", 
"12143_2009-07-01", "12143_2010-07-18", "12144_2009-07-01", "12145_2009-07-01"
), individ_id = c("GBT_FP96194", "GBT_FP96235", "GBT_FC14766",
"GBT_FR77883", "GBT_FP05030", "GBT_FA82356"), colony = c("Eynhallow",
"Eynhallow", "Eynhallow", "Eynhallow", "Eynhallow", "Eynhallow"
), species = c("Northern fulmar", "Northern fulmar", "Northern fulmar",
"Northern fulmar", "Northern fulmar", "Northern fulmar"), year_tracked = c("2009_10",
"2010_11", "2009_10", "2010_12", "2009_10", "2009_10")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))

Regarding r - extracting a duration from character strings in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61395910/
