
r - Comparing two dates in R

Reposted. Author: 行者123. Last updated: 2023-12-04 01:09:21

I have a tab-delimited text file that I import into R. I use the following command for the import:

data = read.table(soubor, header = TRUE, sep = "\t", dec = ".", colClasses =c("numeric","numeric","character","Date","numeric","numeric"))

When I run str(data) to check the data types of the columns, I get:
'data.frame':   211931 obs. of  6 variables:
$ DataValue : num 0 0 0 0 0 0 0 0 0 NA ...
$ SiteID : num 1 1 1 1 1 1 1 1 1 1 ...
$ VariableCode: chr "Sucho" "Sucho" "Sucho" "Sucho" ...
$ DateTimeUTC : Date, format: "2012-07-01" "2012-07-02" "2012-07-03" "2012-07-04" ...
$ Latitude : num 50.8 50.8 50.8 50.8 50.8 ...
$ Longitude : num 15.6 15.6 15.6 15.6 15.6 ...

A reproducible sample of the first 20 rows of my data is here:

my_sample = dput(data[1:20,])


structure(list(DataValue = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, 
NA, NA, NA, NA, NA, NA, 0, 0, 0), SiteID = c(1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), VariableCode = c("Sucho",
"Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho",
"Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho",
"Sucho", "Sucho", "Sucho", "Sucho", "Sucho"), DateTimeUTC = structure(c(15522,
15523, 15524, 15525, 15526, 15527, 15528, 15529, 15530, 15531,
15532, 15533, 15534, 15535, 15536, 15537, 15538, 15539, 15540,
15541), class = "Date"), Latitude = c(50.77, 50.77, 50.77, 50.77,
50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77,
50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77), Longitude = c(15.55,
15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55,
15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55,
15.55)), .Names = c("DataValue", "SiteID", "VariableCode", "DateTimeUTC",
"Latitude", "Longitude"), row.names = c(NA, 20L), class = "data.frame")

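Now I want to filter my table by date. Note that I'm running my code inside a for loop. First, I subset my data by 1 July 2012 and do some processing. Then I subset my data by 2 July and do some processing, and so on. For example, I want to get all rows with a date equal to 6 July 2012. I tried the following code: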
startDate = as.Date("2012-07-01")
endDate = as.Date("2012-07-20")
all_dates = seq(startDate, endDate, by = 1)  # one Date per day

# the following code I'm trying to run inside a loop...
for (j in seq_along(all_dates)) {
  filterdate = all_dates[j]
  my_subset = my_sample[my_sample$DateTimeUTC == filterdate, ]
  # now I want to do some processing on my_subset...
}

However, the code above returns an empty data set starting from step 7 of the loop.

So, for example:
subset_one = my_sample[my_sample$DateTimeUTC == all_dates[6],]

returns: 3 obs of 6 variables.

However, for some unknown reason, the example:
subset_two = my_sample[my_sample$DateTimeUTC == all_dates[7],]

returns: 0 obs of 6 variables.

(Note: I edited the code above to make my problem 100% reproducible.)

Any ideas what I'm doing wrong?
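
The question does not establish why the comparison fails. One thing that may be worth ruling out, as an assumption rather than a confirmed diagnosis, is that a Date column holds fractional day counts. A Date is stored as the number of days since 1970-01-01, and == compares those stored numbers, so two values can print as the same day and still compare unequal. The values below are hypothetical and only illustrate the check:

```r
# Hypothetical values: two Dates that print identically but differ internally.
d_whole <- as.Date("2012-07-07")
d_frac  <- structure(15528.5, class = "Date")  # same calendar day plus half a day

format(d_whole) == format(d_frac)  # printed forms agree
d_whole == d_frac                  # stored numbers do not
unclass(d_frac)                    # inspect the underlying day count

# Flooring the day count gives a whole-day Date that compares as expected.
d_clean <- as.Date(floor(unclass(d_frac)), origin = "1970-01-01")
d_clean == d_whole
```

Running unclass(my_sample$DateTimeUTC) on the real column would show whether any stored value is not a whole number.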

Best answer

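The following solution solved my problem:
Instead of using the Date data type, I tried the POSIXct data type.
Here is sample code that reads the tab-delimited text file, after which subsetting works in every step of my for loop: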

data = read.table("data.txt", header = TRUE, sep = "\t", dec = ".",
                  colClasses = c("numeric", "numeric", "character", "POSIXct", "numeric", "numeric"))
startDate = as.POSIXct("2012-07-01")
endDate = as.POSIXct("2012-07-20")
all_dates = seq(startDate, endDate, by = 86400)  # 86400 is the number of seconds in a day

# the following code I'm trying to run inside a loop...
for (j in seq_along(all_dates)) {
  filterdate = all_dates[j]
  my_subset = data[data$DateTimeUTC == filterdate, ]
  # now I want to do some processing on my_subset...
}
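
A small self-contained sketch of the same loop is given below. Two details are assumptions that the answer does not state: the time zone is set explicitly to "UTC" (suggested only by the column name DateTimeUTC; without tz, as.POSIXct uses the session's time zone), and the sequence is built with by = "day" instead of a fixed 86400 seconds. The three-row data frame and the counts vector are hypothetical stand-ins for the imported file and the per-day processing:

```r
# Hypothetical stand-in for the imported table.
data <- data.frame(
  DataValue   = c(0, 0, NA),
  DateTimeUTC = as.POSIXct(c("2012-07-01", "2012-07-02", "2012-07-02"), tz = "UTC")
)

# One POSIXct value per calendar day, all at midnight UTC.
all_dates <- seq(as.POSIXct("2012-07-01", tz = "UTC"),
                 as.POSIXct("2012-07-03", tz = "UTC"),
                 by = "day")

# Count the matching rows for each day in place of real processing.
counts <- integer(length(all_dates))
for (j in seq_along(all_dates)) {
  my_subset <- data[data$DateTimeUTC == all_dates[j], ]
  counts[j] <- nrow(my_subset)
}
counts
```

Keeping both the column and the sequence in the same explicit time zone means the equality test compares the same instants regardless of the machine's locale settings.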

Regarding "r - Comparing two dates in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21571744/
