
r - Why are lubridate functions so slow compared with as.POSIXct?

Reposted. Author: 行者123. Updated: 2023-12-03 01:10:38

As the title says: why are lubridate functions so slow?

library(lubridate)
library(microbenchmark)

Dates <- sample(c(dates = format(seq(ISOdate(2010,1,1), by='day', length=365), format='%d-%m-%Y')), 50000, replace = TRUE)

microbenchmark(as.POSIXct(Dates, format = "%d-%b-%Y %H:%M:%S", tz = "GMT"), times = 100)
microbenchmark(dmy(Dates, tz ="GMT"), times = 100)

Unit: milliseconds
                                                         expr      min       lq   median       uq      max
1 as.POSIXct(Dates, format = "%d-%b-%Y %H:%M:%S", tz = "GMT") 103.1902 104.3247  108.675 109.2632  149.871
2                                      dmy(Dates, tz = "GMT") 184.4871 194.1504 197.8422 214.3771 268.4911
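One thing worth checking before reading too much into these timings: the strings in Dates are built with '%d-%m-%Y', while the as.POSIXct call uses "%d-%b-%Y %H:%M:%S". The base R sketch below (the sample strings and variable names are illustrative, not from the question) suggests that the as.POSIXct call may be timing parses that fail, so the comparison is probably not like for like:

```r
# Illustrative strings in the same '%d-%m-%Y' layout the question generates.
x <- c("01-01-2010", "15-06-2010", "31-12-2010")

# Format from the question: expects an abbreviated month name and a time part.
mismatched <- as.POSIXct(x, format = "%d-%b-%Y %H:%M:%S", tz = "GMT")

# Format that matches how the strings were built.
matched <- as.POSIXct(x, format = "%d-%m-%Y", tz = "GMT")

print(mismatched)  # every element is NA
print(matched)
```

If the goal is a fair benchmark, using the matching format on the as.POSIXct side would be the natural first step.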

Best answer

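For the same reason that cars are slow compared with riding on top of rockets. The added ease of use and safety make a car much slower than a rocket, but you are less likely to get blown up, and a car is easier to start, steer, and brake. In the right situation, however (say, I need to get to the moon), the rocket is the right tool for the job. Now, if someone invented a car with a rocket strapped to the roof, we would have something.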

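Start by looking at what dmy is doing and you will see where the speed difference comes from (by the way, judging from your benchmarks I would not say lubridate is that much slower, since these timings are in milliseconds):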

dmy #type this at the command line and you get:

> dmy
function (..., quiet = FALSE, tz = "UTC")
{
    dates <- unlist(list(...))
    parse_date(num_to_date(dates), make_format("dmy"), quiet = quiet,
        tz = tz)
}
<environment: namespace:lubridate>

Right away I see parse_date, num_to_date, and make_format. That makes one wonder what all these guys are. Let's see:

parse_date

> parse_date
function (x, formats, quiet = FALSE, seps = find_separator(x),
    tz = "UTC")
{
    fmt <- guess_format(head(x, 100), formats, seps, quiet)
    parsed <- as.POSIXct(strptime(x, fmt, tz = tz))
    if (length(x) > 2 & !quiet)
        message("Using date format ", fmt, ".")
    failed <- sum(is.na(parsed)) - sum(is.na(x))
    if (failed > 0) {
        message(failed, " failed to parse.")
    }
    parsed
}
<environment: namespace:lubridate>
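
The listing above shows the pattern dmy relies on: guess a format from the first 100 elements, then hand the whole vector to strptime. A rough base R sketch of that idea follows. guess_then_parse and its candidate list are hypothetical stand-ins for illustration, not lubridate's actual guess_format logic:

```r
# Hypothetical helper: try each candidate format on a sample, keep the first
# one that parses every non-missing sample element, then parse the full vector.
guess_then_parse <- function(x, candidates, tz = "UTC", n = 100) {
  sample_x <- head(x[!is.na(x)], n)
  for (fmt in candidates) {
    trial <- strptime(sample_x, fmt, tz = tz)
    if (!any(is.na(trial))) {
      return(as.POSIXct(strptime(x, fmt, tz = tz)))
    }
  }
  stop("no candidate format matched the sample")
}

dates <- c("01-01-2010", "15-06-2010", "31-12-2010")
parsed <- guess_then_parse(dates, c("%d-%b-%Y", "%d-%m-%Y"), tz = "GMT")
print(parsed)
```

The extra trial parses on the sample are part of the convenience cost: the caller no longer supplies a format, so the function has to work one out.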

num_to_date

> getAnywhere(num_to_date)
A single object matching ‘num_to_date’ was found
It was found in the following places
namespace:lubridate
with value

function (x)
{
    if (is.numeric(x)) {
        x <- as.character(x)
        x <- paste(ifelse(nchar(x)%%2 == 1, "0", ""), x, sep = "")
    }
    x
}
<environment: namespace:lubridate>
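
num_to_date only does any work when dates arrive as numbers. Since it is an internal function, the sketch below copies the body shown above into a local function (pad_numeric_date is a hypothetical name chosen for illustration) to show the zero-padding step:

```r
# Local copy of the logic shown above; the name is made up for illustration.
pad_numeric_date <- function(x) {
  if (is.numeric(x)) {
    x <- as.character(x)
    # Odd-length strings get a leading zero, e.g. a day written without one.
    x <- paste(ifelse(nchar(x) %% 2 == 1, "0", ""), x, sep = "")
  }
  x
}

print(pad_numeric_date(c(1012010, 31122010)))  # "01012010" "31122010"
print(pad_numeric_date("1-1-2010"))            # character input is returned unchanged
```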

make_format

> getAnywhere(make_format)
A single object matching ‘make_format’ was found
It was found in the following places
namespace:lubridate
with value

function (order)
{
    order <- strsplit(order, "")[[1]]
    formats <- list(d = "%d", m = c("%m", "%b"), y = c("%y",
        "%Y"))[order]
    grid <- expand.grid(formats, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
    lapply(1:nrow(grid), function(i) unname(unlist(grid[i, ])))
}
<environment: namespace:lubridate>
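
To see what make_format hands to the guessing step, the body shown above can be copied into a local function and called with "dmy". make_format_sketch is a hypothetical name, and this is only a sketch of the listing as printed here, not necessarily of current lubridate versions:

```r
# Local copy of the logic shown above, renamed to avoid implying it is exported.
make_format_sketch <- function(order) {
  order <- strsplit(order, "")[[1]]
  formats <- list(d = "%d", m = c("%m", "%b"), y = c("%y", "%Y"))[order]
  grid <- expand.grid(formats, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
  lapply(seq_len(nrow(grid)), function(i) unname(unlist(grid[i, ])))
}

candidates <- make_format_sketch("dmy")
print(length(candidates))  # 4: two month codes times two year codes
print(candidates[[1]])
```

Each of the four candidates is a possible day-month-year layout that has to be checked against the input before any parsing happens.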

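Wow, we've got strsplit-ting, expand-ing.grid-s, paste-ing, ifelse-ing, unname-ing, and so on, plus a Whole Lotta Error Checking going on (a play on the Zep song). So what we have here is some nice syntactic sugar. Mmm, tasty, but it comes at a price: speed.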

Compare that with as.POSIXct:

getAnywhere(as.POSIXct)  #tells us to use methods to see the business
methods('as.POSIXct') #tells us all the business
as.POSIXct.date #what I believe your code is using (I don't use dates though)

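as.POSIXct does more of its work in internal code and less error checking, so you have to ask yourself: do I want ease and safety, or speed and power? It depends on the job.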
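
To make the trade-off concrete: when the format is known in advance, base R can be given it directly, which skips the guessing step entirely. A small sketch, assuming the '%d-%m-%Y' layout the question generates (the variable names are illustrative):

```r
set.seed(1)
days <- format(seq(ISOdate(2010, 1, 1), by = "day", length.out = 365), "%d-%m-%Y")
dates <- sample(days, 50000, replace = TRUE)

# Explicit format: no guessing, but a wrong format silently yields NA.
direct <- as.POSIXct(dates, format = "%d-%m-%Y", tz = "GMT")

# The same parse spelled out the way the parse_date listing does it.
via_strptime <- as.POSIXct(strptime(dates, "%d-%m-%Y", tz = "GMT"))

print(all(direct == via_strptime))
print(sum(is.na(direct)))  # the check you have to remember to run yourself
```

The last line is the kind of safety check that the lubridate wrapper performs for you and that the bare as.POSIXct call leaves to the caller.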

Regarding "r - Why are lubridate functions so slow compared with as.POSIXct?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10645815/
