
r - Distance between simultaneous point locations in R

Reposted. Author: 行者123. Updated: 2023-12-04 10:55:25

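I am calculating the distance (in meters) between UTM locations recorded at the "same" time, but I have run into a problem. As currently written, my script only calculates the distance to the single individual that is closest in time. I would like it to calculate the distances to all individuals that are "close" in time.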

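In my example I have 3 moose and 3 wolves. I want to take moose 1 and calculate the distance to the locations of wolf 1, wolf 2 and wolf 3 recorded at the same time. Right now the script only finds the absolute minimum time difference across all wolves and calculates the distance to that one wolf, rather than to all of the others.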

Here is my test data:

Moose location data:

structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("F07001",
"F07010", "M07012"), class = "factor"), x = c(1482445L, 1481274L,
1481279L, 1481271L, 1480849L, 1480881L, 1480883L, 1480880L, 1482448L,
1482494L, 1482534L, 1482534L, 1482553L, 1482555L, 1482414L, 1482852L,
1476120L, 1476104L, 1476101L), y = c(6621768L, 6619628L, 6619630L,
6619700L, 6620321L, 6620427L, 6620438L, 6620423L, 6616403L, 6616408L,
6616395L, 6616408L, 6616406L, 6616418L, 6616755L, 6616312L, 6623655L,
6623646L, 6623652L), date = structure(c(1173088800, 1173096000,
1173103260, 1173110400, 1173117600, 1173211200, 1173218400, 1173139200,
1173088800, 1173096000, 1173103260, 1173110400, 1173117600, 1173211200,
1173218400, 1173139200, 1173270600, 1173277800, 1173282960), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), .Names = c("id", "x", "y", "date"
), row.names = c(NA, -19L), class = "data.frame")

Wolf location data:

structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L), .Label = c("HF7572",
"Htest", "UM1347"), class = "factor"), x = c(1480610L, 1480640L,
1480613L, 1480613L, 1480555L, 1480567L, 1480627L, 1480532L, 1480593L,
1484394L, 1484394L, 1483940L, 1483933L, 1483935L, 1483930L, 1483855L,
1483793L, 1483802L, 1484392L, 1483855L), y = c(6619853L, 6619739L,
6619759L, 6619862L, 6619838L, 6619772L, 6619902L, 6619899L, 6619887L,
6619589L, 6619602L, 6619899L, 6619907L, 6619905L, 6619896L, 6619834L,
6619702L, 6619672L, 6619558L, 6619834L), date = structure(c(1173088800,
1173096060, 1173103440, 1173111600, 1173117780, 1173213600, 1173218400,
1173141120, 1173266100, 1173095940, 1173099600, 1173103200, 1173106920,
1173110400, 1173208800, 1173211200, 1173222000, 1173266100, 1173362100,
1173211200), class = c("POSIXct", "POSIXt"), tzone = "UTC")), .Names = c("id",
"x", "y", "date"), row.names = c(NA, -20L), class = "data.frame")

Here is my script so far:

# Read the location files and parse the timestamps as UTC
mloc <- read.csv("moose.csv", header = TRUE)
wloc <- read.csv("wolf.csv", header = TRUE)
mloc$date <- as.POSIXct(mloc$date, format = "%Y-%m-%d %H:%M", tz = "UTC")
wloc$date <- as.POSIXct(wloc$date, format = "%Y-%m-%d %H:%M", tz = "UTC")

#sort the data sequentially by date time then convert to number
Sortmoose = mloc[order(mloc$date),]
Sortwolf = wloc[order(wloc$date),]
m <- as.numeric(Sortmoose$date)
w <- as.numeric(Sortwolf$date)

#Creates index of the time intervals
id <- findInterval(m, w, all.inside=TRUE)
id_min <- ifelse(abs(m-w[id])<abs(m-w[id+1]), id, id+1)
Sortmoose$wolfID = Sortwolf$id[id_min]
Sortmoose$wolfdate =Sortwolf$date[id_min]
Sortmoose$wolfx = Sortwolf$x[id_min]
Sortmoose$wolfy = Sortwolf$y[id_min]
Sortmoose$dist= sqrt((Sortmoose$wolfx-Sortmoose$x)^2+(Sortmoose$wolfy-Sortmoose$y)^2)

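I want to calculate the distance between every moose/wolf pair, as long as their locations were recorded at the "same" time. I would like the output to contain the moose information, the associated wolf information, and the distance (in meters) between the two points. I would also like the time difference, so that I can filter out pairs that are more than 45 minutes apart or similar, but I think that is something I can do later. Basically something like this: mooseID mooseDate mooseX mooseY wolfID wolfDate wolfX wolfY Distance(m) TimeDiff (min)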

Best answer

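New solution. Here is code that does what you want (approximate matching). The key idea is to create a new data table with an additional column date1, such that each row of the original data with date = 05:17:13 yields rows with date1 = 04:00:00, 05:00:00 and 06:00:00 (with all other columns duplicated), and then to merge on this new column. This guarantees that every two events in the original data that are within one hour of each other will be merged.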

After that we just calculate the distances and the time differences.
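To make the bucketing idea concrete, here is a minimal base R sketch. It is an illustration only: `time_buckets` is a hypothetical helper that stands in for the `floor_date()` plus `hours()` combination, and the second pair of timestamps is taken from the test data in the question.

```r
# Hypothetical helper (not part of the answer's code): floor a timestamp to
# the hour, then shift it by each offset in `range` (in hours).
time_buckets <- function(t, range = -1:1) {
  as.POSIXct(trunc(t, units = "hours")) + 3600 * range
}

# The example timestamp from the explanation above.
t <- as.POSIXct("2013-01-15 05:17:13", tz = "UTC")
buckets <- format(time_buckets(t), "%H:%M:%S", tz = "UTC")
print(buckets)

# Two events share at least one bucket whenever their floored hours differ
# by at most two, so a merge on the bucket column pairs them up.
moose_time <- as.POSIXct("2007-03-05 10:00:00", tz = "UTC")
wolf_time  <- as.POSIXct("2007-03-05 11:59:00", tz = "UTC")
shared <- intersect(as.numeric(time_buckets(moose_time)),
                    as.numeric(time_buckets(wolf_time)))
print(length(shared))
```

Because buckets overlap generously, the merge also pairs some events that are more than one hour apart, which is why a final filter on the actual time difference is still needed.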

Note that using data.table is essential for speed, since your data frames are very large; a regular data.frame would be too slow.

library(data.table)
library(lubridate)

mloc <- data.table(mloc)
wloc <- data.table(wloc)

# Returns a new data table with one new column (date1) and length(range)
# rows for each row in the initial data table, duplicating all other fields.
# Example: for row with date = '2013-01-15 05:17:23' and for the default range
# argument it will add rows with date1 = '2013-01-15 04:00:00', '2013-01-15 05:00:00'
# and '2013-01-15 06:00:00'
AddTimeBoundaries <- function(dt, range = -1:1) {
  dt1 <- rbindlist(lapply(range,
                          function(x) data.table(id = dt$id, date = dt$date,
                                                 date1 = floor_date(dt$date, 'hour') +
                                                   hours(x))))
  setkey(dt1, id, date)
  setkey(dt, id, date)
  result <- dt[dt1]
  setkey(result, date1)
  result
}

mloc.1 <- AddTimeBoundaries(mloc)
wloc.1 <- AddTimeBoundaries(wloc)

x <- mloc.1[wloc.1, allow.cartesian = TRUE][!is.na(id)]
result <- unique(x[, list(id, date, x, y, id.1, date.1, x.1, y.1,
distance = sqrt((x-x.1)^2 + (y-y.1)^2),
time.diff = date - date.1)])

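The result contains every pair of events within 1 hour of each other (plus some pairs that are further apart, such as the rows below with time differences of about 2 hours, which you can easily filter out).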

> head(result, 10)
id date x y id.1 date.1 x.1 y.1 distance time.diff
1: F07001 2007-03-05 10:00:00 1482445 6621768 HF7572 2007-03-05 10:00:00 1480610 6619853 2652.2538 0 secs
2: M07012 2007-03-05 10:00:00 1482448 6616403 HF7572 2007-03-05 10:00:00 1480610 6619853 3909.0592 0 secs
3: F07001 2007-03-05 10:00:00 1482445 6621768 UM1347 2007-03-05 11:59:00 1484394 6619589 2923.4640 -7140 secs
4: M07012 2007-03-05 10:00:00 1482448 6616403 UM1347 2007-03-05 11:59:00 1484394 6619589 3733.2977 -7140 secs
5: F07001 2007-03-05 12:00:00 1481274 6619628 HF7572 2007-03-05 10:00:00 1480610 6619853 701.0856 7200 secs
6: M07012 2007-03-05 12:00:00 1482494 6616408 HF7572 2007-03-05 10:00:00 1480610 6619853 3926.5100 7200 secs
7: F07001 2007-03-05 10:00:00 1482445 6621768 HF7572 2007-03-05 12:01:00 1480640 6619739 2715.6705 -7260 secs
8: F07001 2007-03-05 12:00:00 1481274 6619628 HF7572 2007-03-05 12:01:00 1480640 6619739 643.6435 -60 secs
9: M07012 2007-03-05 10:00:00 1482448 6616403 HF7572 2007-03-05 12:01:00 1480640 6619739 3794.4380 -7260 secs
10: M07012 2007-03-05 12:00:00 1482494 6616408 HF7572 2007-03-05 12:01:00 1480640 6619739 3812.2011 -60 secs
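The filtering step mentioned above could look roughly like the following base R sketch. The object name `pairs`, the 45-minute threshold variable `max_gap_min` and the extra column `time.diff.min` are assumptions made for illustration; the four rows are copied from the output above.

```r
# A few rows copied from the output above; `pairs` is a hypothetical name.
pairs <- data.frame(
  id        = c("F07001", "F07001", "F07001", "M07012"),
  id.1      = c("HF7572", "UM1347", "HF7572", "HF7572"),
  distance  = c(2652.2538, 2923.4640, 643.6435, 3812.2011),
  time.diff = as.difftime(c(0, -7140, -60, -60), units = "secs")
)

# Assumed threshold: keep pairs recorded at most 45 minutes apart.
max_gap_min <- 45

# Express the time difference in minutes, then filter on its absolute value.
pairs$time.diff.min <- as.numeric(pairs$time.diff, units = "mins")
close_pairs <- pairs[abs(pairs$time.diff.min) <= max_gap_min, ]
print(close_pairs)
```

The same condition should carry over to the data.table `result` as a row filter on `time.diff`, provided the difference is first converted to a known unit as done here.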

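Old solution. This does not work, because the OP needs approximate matching of dates (within 1 hour) rather than exact matching.

Assuming I have interpreted your question correctly, here is a solution using the data.table package. I will call the first structure from your test data mloc and the second one wloc.

Step 1. Convert both data frames to data.table and set the key on date: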

library(data.table)
mloc <- data.table(mloc)
wloc <- data.table(wloc)
setkey(mloc, date)
setkey(wloc, date)

Step 2. Merge the two tables on the date key, creating a "Cartesian product", and calculate the distances:

x <- mloc[wloc, allow.cartesian = TRUE][!is.na(id)]
x[, distance := sqrt((x-x.1)^2 + (y-y.1)^2)]

> x
date id x y id.1 x.1 y.1 distance
1: 2007-03-05 10:00:00 F07001 1482445 6621768 HF7572 1480610 6619853 2652.2538
2: 2007-03-05 10:00:00 M07012 1482448 6616403 HF7572 1480610 6619853 3909.0592
3: 2007-03-05 16:00:00 F07001 1481271 6619700 UM1347 1483935 6619905 2671.8759
4: 2007-03-05 16:00:00 M07012 1482534 6616408 UM1347 1483935 6619905 3767.2019
5: 2007-03-06 20:00:00 F07001 1480881 6620427 UM1347 1483855 6619834 3032.5443
6: 2007-03-06 20:00:00 M07012 1482555 6616418 UM1347 1483855 6619834 3655.0042
7: 2007-03-06 20:00:00 F07001 1480881 6620427 Htest 1483855 6619834 3032.5443
8: 2007-03-06 20:00:00 M07012 1482555 6616418 Htest 1483855 6619834 3655.0042
9: 2007-03-06 22:00:00 F07001 1480883 6620438 HF7572 1480627 6619902 593.9966
10: 2007-03-06 22:00:00 M07012 1482414 6616755 HF7572 1480627 6619902 3618.9747

Regarding "r - Distance between simultaneous point locations in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15646365/
