
r - How do I subtract date columns in R when the date format is not recognized?

Reposted. Author: 行者123. Last updated: 2023-12-04 08:38:50

This question already has answers here:

calculating time difference in R

(2 answers)

Closed 1 year ago.




I tried the following beforehand, without success:
Changing date format in R

t1$date <- dmy(t1$date_admission)
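I have been trying to compute the time difference between two columns. Somehow R does not seem to recognize the Y-m-d format of one of them and returns wrong values, as shown below: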
> print(df$data_int_uti)
[1] "2020-06-07" "2020-09-07" "2020-02-08" "2020-08-15" "2020-08-15" "2020-08-18" "2020-08-25" "2020-08-29" "2020-06-30"
[10] "2020-05-07" "2020-07-15" "2020-08-14" "2020-01-09" "2020-09-09" "2020-12-09" "2020-02-07" "2020-09-07" "2020-02-08"
[19] "2020-08-15" "2020-02-09" "2020-06-07" "2020-06-07" "2020-07-29" "2020-08-16" "2020-08-21" "2020-08-22" "2020-01-07"
[28] "2020-04-07" "2020-02-07" "2020-01-09" "2020-06-07" "2020-09-08" "2020-10-08" "2020-08-14" "2020-08-27" "2020-08-30"
[37] "2020-07-16" "2020-07-23" "2020-09-14" "2020-01-07" "2020-04-07" "2020-07-07" "2020-07-07" "2020-10-07" "2020-07-25"
[46] "2020-03-08" "2020-08-31" "2020-02-07" "2020-06-07" "2020-08-13" "2020-08-24" "2020-01-07" "2020-07-18" "2020-09-15"
[55] "2020-01-07" "2020-07-07" "2020-07-17" "2020-07-27" "2020-08-14" "2020-10-09" "2020-09-14" "2020-04-08" "2020-01-07"
[64] "2020-01-07" "2020-12-07" "2020-07-27" "2020-04-08" "2020-08-16" "2020-02-07" "2020-07-07" "2020-07-20" "2020-08-19"
[73] "2020-03-09" "2020-05-09"

> print(df$data_inicio_sint)
[1] "2020-06-27" NA "2020-07-29" NA "2020-07-31" "2020-08-19" "2020-08-22" "2020-08-18" "2020-06-29"
[10] "2020-06-25" "2020-07-14" "2020-05-09" "2020-01-10" "2020-08-31" "2020-08-30" "2020-06-28" "2020-09-08" "2020-07-23"
[19] "2020-12-09" "2020-08-22" "2020-04-08" "2020-06-25" "2020-07-20" "2020-08-16" "2020-12-09" "2020-08-23" "2020-06-30"
[28] "2020-06-26" "2020-03-31" "2020-08-23" "2020-06-21" "2020-07-29" "2020-07-29" "2020-08-01" "2020-08-19" "2020-08-14"
[37] "2020-06-30" "2020-07-22" "2020-09-10" "2020-07-01" "2020-02-08" "2020-06-08" "2020-06-23" "2020-06-27" "2020-07-17"
[46] "2020-07-29" "2020-08-31" "2020-06-20" "2020-03-08" "2020-02-09" "2020-08-24" "2020-01-08" "2020-06-08" "2020-10-10"
[55] "2020-06-23" "2020-05-08" "2020-10-08" "2020-07-24" "2020-07-09" "2020-08-29" "2020-10-10" "2020-02-09" "2020-06-23"
[64] "2020-06-22" "2020-08-08" "2020-07-21" "2020-07-28" "2020-05-09" "2020-06-19" "2020-07-08" "2020-07-14" "2020-10-09"
[73] "2020-01-10" "2020-12-09"

> diff(df$data_int_uti - df$data_inicio_sint)
Time differences in days
[1] NA NA NA NA -16 4 8 -10 -50 50 96 -98 10 92 -243 141 -165 50 -79 255 -78 27 -9
[24] -110 109 -174 95 27 -174 213 55 30 -58 -5 8 0 -15 3 -180 235 -30 -15 88 -94 -151 143
[47] -134 225 95 -186 -1 41 -65 -143 228 -143 86 33 5 -67 85 -227 1 288 -115 -117 210 -232 132
[70] 7 -57 110 -273
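Expected result: the interval between the symptom onset date and the admission date, in days, for example
(2020-06-07) - (2020-06-27) = 20 days
so the output would look like
[1] 20
and so on.
Any pointers would be appreciated.
Here is the dput output: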

dput(t1)
structure(list(data_int_uti = structure(c(18420, 18512, 18300,
18489, 18489, 18492, 18499, 18503, 18443, 18389, 18458, 18488,
18270, 18514, 18605, 18299, 18512, 18300, 18489, 18301, 18420,
18420, 18472, 18490, 18495, 18496, 18268, 18359, 18299, 18270,
18420, 18513, 18543, 18488, 18501, 18504, 18459, 18466, 18519,
18268, 18359, 18450, 18450, 18542, 18468, 18329, 18505, 18299,
18420, 18487, 18498, 18268, 18461, 18520, 18268, 18450, 18460,
18470, 18488, 18544, 18519, 18360, 18268, 18268, 18603, 18470,
18360, 18490, 18299, 18450, 18463, 18493, 18330, 18391), class = "Date"),
data_inicio_sint = structure(c(18440, NA, 18472, NA, 18474,
18493, 18496, 18492, 18442, 18438, 18457, 18391, 18271, 18505,
18504, 18441, 18513, 18466, 18605, 18496, 18360, 18438, 18463,
18490, 18605, 18497, 18443, 18439, 18352, 18497, 18434, 18472,
18472, 18475, 18493, 18488, 18443, 18465, 18515, 18444, 18300,
18421, 18436, 18440, 18460, 18472, 18505, 18433, 18329, 18301,
18498, 18269, 18421, 18545, 18436, 18390, 18543, 18467, 18452,
18503, 18545, 18301, 18436, 18435, 18482, 18464, 18471, 18391,
18432, 18451, 18457, 18544, 18271, 18605), class = "Date")), row.names = c(NA,
-74L), class = c("tbl_df", "tbl", "data.frame"))

Best answer

diff is the wrong function for computing the difference between two date columns. You can subtract the dates directly.

t1$date_admission - t1$date_symptoms
#Time differences in days
# [1] -20 NA -172 NA 15 -1 3 11 1 -49 1 97 -1 9 101
#[16] -142 -1 -166 -116 -195 60 -18 9 0 -110 -1 -175 -80 -53 -227
#[31] -14 41 71 13 8 16 16 1 4 -176 59 29 14 102 8
#[46] -143 0 -134 91 186 0 -1 40 -25 -168 60 -83 3 36 41
#[61] -26 59 -168 -167 121 6 -111 99 -133 -1 6 -51 59 -214
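The subtraction can be checked on a small sample. The sketch below rebuilds the first five rows from the day counts in the question's dput output. The name `sample_dates` is hypothetical, and the column names follow the answer (`date_admission`, `date_symptoms`), which are assumed here to correspond to `data_int_uti` and `data_inicio_sint`. Note the sign: an admission date earlier than the symptom date gives a negative value, so `abs()` is one option if only the magnitude (20 rather than -20) is wanted.

```r
# First five rows of the question's data, rebuilt from the dput day counts
# (days since 1970-01-01). `sample_dates` is a hypothetical name.
sample_dates <- data.frame(
  date_admission = as.Date(c(18420, 18512, 18300, 18489, 18489), origin = "1970-01-01"),
  date_symptoms  = as.Date(c(18440, NA, 18472, NA, 18474), origin = "1970-01-01")
)

# Subtracting two Date vectors gives a difftime in days, element by element
delta <- sample_dates$date_admission - sample_dates$date_symptoms

# Convert to plain numbers for downstream arithmetic
delta_days <- as.numeric(delta, units = "days")
print(delta_days)       # values: -20, NA, -172, NA, 15

# Magnitude only, ignoring which date came first
print(abs(delta_days))  # values: 20, NA, 172, NA, 15
```

Rows with a missing date stay `NA` in the result, which matches the `NA` entries in the answer's output.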
You may have been trying to use difftime:
difftime(t1$date_admission, t1$date_symptoms, units = "days")
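As a small illustration of the `units` argument, the sketch below applies `difftime` to the single pair of dates from the question's expected result. The variable names `admission` and `symptoms` are hypothetical.

```r
# The pair of dates used in the question's example
admission <- as.Date("2020-06-07")
symptoms  <- as.Date("2020-06-27")

# Same interval expressed in two different units
d_days  <- difftime(admission, symptoms, units = "days")
d_weeks <- difftime(admission, symptoms, units = "weeks")

print(as.numeric(d_days))   # -20
print(as.numeric(d_weeks))  # -20 / 7
```

Unlike plain subtraction, `difftime` lets the unit be fixed explicitly, which avoids surprises when the result is later converted to a number.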
The diff function subtracts consecutive values. See for example:
diff(c(5, 9, 4, 5))
#[1] 4 -5 1
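which is computed as (9 - 5 = 4), (4 - 9 = -5), (5 - 4 = 1). In your case, you first subtract the dates and then apply diff to the result, which gives the differences between consecutive numbers rather than the per-row intervals.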
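To make this concrete, the sketch below uses the first eight rows of the question's dput data and applies both steps. The variable names (`admission`, `symptoms`, `per_row`, `consecutive`) are hypothetical. The second result reproduces the start of the output shown in the question.

```r
# First eight rows of each column, as day counts since 1970-01-01
admission <- as.Date(c(18420, 18512, 18300, 18489, 18489, 18492, 18499, 18503),
                     origin = "1970-01-01")
symptoms  <- as.Date(c(18440, NA, 18472, NA, 18474, 18493, 18496, 18492),
                     origin = "1970-01-01")

# Step 1: per-row interval in days (what the question actually wants)
per_row <- as.numeric(admission - symptoms)
print(per_row)      # values: -20, NA, -172, NA, 15, -1, 3, 11

# Step 2: diff() on top of that compares each row with the previous one
consecutive <- diff(per_row)
print(consecutive)  # values: NA, NA, NA, NA, -16, 4, 8
```

The result of `diff` also has one element fewer than its input, which is why the question's output ends at 73 values for 74 rows.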

The original question, "r - How do I subtract date columns in R when the date format is not recognized?", is on Stack Overflow: https://stackoverflow.com/questions/64660219/
