r - lubridate mdy function

Reposted. Author: 行者123. Last updated: 2023-12-02 21:02:04

I am trying to convert the following dates, but one of them [1] does not parse correctly: "4/2/10" becomes "0010-04-02".

Is there a way to correct this?

Thanks, Vivek

library(lubridate)

data <- data.frame(initialDiagnose = c("4/2/10", "14.01.2009", "9/22/2005",
    "4/21/2010", "28.01.2010", "09.01.2009", "3/28/2005",
    "04.01.2005", "04.01.2005", "Created on 9/17/2010", "03 01 2010"))

mdy <- mdy(data$initialDiagnose)
dmy <- dmy(data$initialDiagnose)
mdy[is.na(mdy)] <- dmy[is.na(mdy)] # some dates are ambiguous, here we give
data$initialDiagnose <- mdy        # mdy precedence over dmy
data

   initialDiagnose
1       0010-04-02
2       2009-01-14
3       2005-09-22
4       2010-04-21
5       2010-01-28
6       2009-09-01
7       2005-03-28
8       2005-04-01
9       2005-04-01
10      2010-09-17
11      2010-03-01

Best answer

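I think this happens because the mdy() function prefers to match the year with %Y (the full year) over %y (the two-digit year abbreviation, which is expanded to 19XX or 20XX). The two-digit "10" is therefore read as the literal year 10.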
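The difference between the two year formats can be seen without lubridate at all. The snippet below is only a small base R sketch of the %Y versus %y distinction; it does not show how mdy() chooses a format internally, and the variable names `full_year` and `short_year` are made up for illustration.

```r
# Base R only; lubridate is not involved here.
# %Y treats the digits as the complete year, so "10" is the year 10.
full_year <- as.Date("4/2/10", format = "%m/%d/%Y")

# %y treats the digits as a two-digit abbreviation, so "10" is 2010.
short_year <- as.Date("4/2/10", format = "%m/%d/%y")

# Extract the numeric year to compare the two interpretations.
as.POSIXlt(full_year)$year + 1900
# [1] 10
as.POSIXlt(short_year)$year + 1900
# [1] 2010
```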

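There is a workaround, though. I looked at the help file for lubridate::parse_date_time (?parse_date_time), and near the bottom there is an example that adds an argument so that the %y format is preferred over the %Y format when matching the year. The relevant code from the help file: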

## ** how to use `select_formats` argument **
## By default %Y has precedence:
parse_date_time(c("27-09-13", "27-09-2013"), "dmy")
## [1] "13-09-27 UTC" "2013-09-27 UTC"

## to give priority to %y format, define your own select_format function:

my_select <- function(trained){
  n_fmts <- nchar(gsub("[^%]", "", names(trained))) + grepl("%y", names(trained))*1.5
  names(trained[ which.max(n_fmts) ])
}

parse_date_time(c("27-09-13", "27-09-2013"), "dmy", select_formats = my_select)
## [1] "2013-09-27 UTC" "2013-09-27 UTC"

So, for your example, you can adapt this code and replace the line mdy <- mdy(data$initialDiagnose) with the following:

# Define a select function that prefers %y over %Y. This is copied
# directly from the help files
my_select <- function(trained){
  n_fmts <- nchar(gsub("[^%]", "", names(trained))) + grepl("%y", names(trained))*1.5
  names(trained[ which.max(n_fmts) ])
}

# Parse as mdy dates
mdy <- parse_date_time(data$initialDiagnose, "mdy", select_formats = my_select)
# [1] "2010-04-02 UTC" NA "2005-09-22 UTC" "2010-04-21 UTC" NA
# [6] "2009-09-01 UTC" "2005-03-28 UTC" "2005-04-01 UTC" "2005-04-01 UTC" "2010-09-17 UTC"
#[11] "2010-03-01 UTC"

Then run the remaining lines of code from the question, which gives this data frame as the result:

   initialDiagnose
1       2010-04-02
2       2009-01-14
3       2005-09-22
4       2010-04-21
5       2010-01-28
6       2009-09-01
7       2005-03-28
8       2005-04-01
9       2005-04-01
10      2010-09-17
11      2010-03-01
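For convenience, the question's code and the select function can be combined into one script. This is a sketch rather than the exact code from either post: the names `prefer_short_year`, `parsed_mdy`, `parsed_dmy` and `use_dmy` are my own (chosen so the variables do not shadow the mdy() and dmy() functions), `quiet = TRUE` is added only to silence the warnings about strings that fail to parse, and the as.Date() conversion is an assumption so that both results have the same class before they are merged.

```r
library(lubridate)

data <- data.frame(
  initialDiagnose = c("4/2/10", "14.01.2009", "9/22/2005",
                      "4/21/2010", "28.01.2010", "09.01.2009", "3/28/2005",
                      "04.01.2005", "04.01.2005", "Created on 9/17/2010",
                      "03 01 2010"),
  stringsAsFactors = FALSE
)

# Same logic as my_select from ?parse_date_time: give formats that
# contain %y a bonus so they win over formats that contain %Y.
prefer_short_year <- function(trained) {
  n_fmts <- nchar(gsub("[^%]", "", names(trained))) +
    grepl("%y", names(trained)) * 1.5
  names(trained[which.max(n_fmts)])
}

# Try month-day-year first, preferring two-digit years.
parsed_mdy <- parse_date_time(data$initialDiagnose, "mdy",
                              select_formats = prefer_short_year,
                              quiet = TRUE)

# Day-month-year is the fallback for whatever mdy could not parse.
parsed_dmy <- dmy(data$initialDiagnose, quiet = TRUE)

# Convert to Date so both vectors share a class, then fill the gaps.
result <- as.Date(parsed_mdy)
use_dmy <- is.na(result)
result[use_dmy] <- parsed_dmy[use_dmy]

data$initialDiagnose <- result
data
```

As in the question, ambiguous strings such as "04.01.2005" are still read as month-day-year because mdy is tried first.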

A similar question about r - lubridate mdy function can be found on Stack Overflow: https://stackoverflow.com/questions/37736131/
