r - Finding the nearest cities to a specific location from a data frame

Reposted. Author: 行者123. Updated: 2023-12-04 13:03:34

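The data frame below contains latitude, longitude, state, and city information. I want to find
the three nearest cities for each city in the data frame. For example, in the data frame below,
Oklahoma City and Colorado Springs are the cities nearest to Albuquerque, so the three cities nearest to Albuquerque should be
saved in another data frame named nearest_Al (I don't know how to get this result; I built that data frame by hand to give an idea of what I mean).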

dataframe <- data.frame(
  long = c("-106.61291", "-81.97224", "-84.42770", "-72.68604", "-97.60056", "-104.70261"),
  lat = c("35.04333", "33.37378", "33.64073", "41.93887", "35.39305", "38.80171"),
  state = c("NM", "GA", "GA", "TX", "OK", "CO"),
  city = c("Albuquerque", "Augusta", "Atlanta", "Windsor Locks", "Oklahoma City", "Colarado Springs")
)

nearest_Al <- data.frame(
  long = c("-97.60056", "-104.70261"),
  lat = c("35.39305", "38.80171"),
  state = c("OK", "CO"),
  city = c("Oklahoma City", "Colarado Springs")
)

I have to do the same thing on a data frame with 500k rows and about 100 locations.

Thanks in advance!

Best answer

Here is one idea. dataframe2 is the final output. The Near_City column shows the three nearest cities for each city in the city column.

# rgdal is not called anywhere below, so it is not loaded here
library(dplyr)
library(sp)
library(sf)

# Create example data frame
dataframe <- data.frame(
  long = c("-106.61291", "-81.97224", "-84.42770", "-72.68604", "-97.60056", "-104.70261"),
  lat = c("35.04333", "33.37378", "33.64073", "41.93887", "35.39305", "38.80171"),
  state = c("NM", "GA", "GA", "TX", "OK", "CO"),
  city = c("Albuquerque", "Augusta", "Atlanta", "Windsor Locks", "Oklahoma City", "Colarado Springs"),
  stringsAsFactors = FALSE
)

# Create spatial point data frame object
dataframe_sp <- dataframe %>%
  mutate(long = as.numeric(long), lat = as.numeric(lat))
coordinates(dataframe_sp) <- ~long + lat

# Convert to sf object
dataframe_sf <- st_as_sf(dataframe_sp)

# Set the coordinate reference system (EPSG:4326, i.e. WGS 84 longitude/latitude)
st_crs(dataframe_sf) <- 4326

# Calculate the pairwise distance matrix (in meters for longitude/latitude data)
dist_m <- st_distance(dataframe_sf, dataframe_sf)

# Select the closest three cities
# Column j of index ranks all cities by their distance from city j
# Drop the first row (the city itself, at distance 0), then keep the next three rows
index <- apply(dist_m, 1, order)
index <- index[2:nrow(index), ]
index <- index[1:3, ]

# Repeat each city three times, one row per nearest neighbor
dataframe2 <- dataframe[rep(1:nrow(dataframe), each = 3), ]

# Process the dataframe based on index, store the results in Near_City column
dataframe2$Near_City <- dataframe[as.vector(index), ]$city
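The indexing step is compact, so a toy example may help. The sketch below uses a hand-made 4 x 4 distance matrix to show how `apply(..., 1, order)` places each point's ranked neighbors in one column, with the point itself in the first row. The matrix values and the names `d`, `idx`, `nearest2`, and `nearest_labels` are made up for illustration and are not part of the answer.

```r
# Hypothetical symmetric distance matrix for four points A-D (made-up values)
d <- matrix(c(0, 5, 2, 9,
              5, 0, 4, 1,
              2, 4, 0, 7,
              9, 1, 7, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Column j of idx lists all points ordered by distance from point j;
# row 1 is the point itself (distance 0)
idx <- apply(d, 1, order)

# Drop the self-match, then keep the two nearest neighbors of each point
nearest2 <- idx[2:3, ]

# Translate indices back to labels: one column per point
nearest_labels <- matrix(rownames(d)[as.vector(nearest2)], nrow = 2,
                         dimnames = list(NULL, rownames(d)))
print(nearest_labels)
```

One caveat under this approach: if two rows share identical coordinates, the first row of the ordered index is not guaranteed to be the point itself, so duplicates are probably worth removing beforehand.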

Update

We can go one step further and create the output the OP wants.
dataframe3 <- dataframe[as.vector(index), ]
dataframe3$TargetCity <- dataframe2$city

nearest_city_list <- split(dataframe3, f = dataframe3$TargetCity)

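Now each "target city" is an element of the list nearest_city_list. To access the data, we can look up a list element by the target city's name. Here is an example that extracts the results for Albuquerque: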
nearest_city_list[["Albuquerque"]]
        long      lat state             city  TargetCity
6 -104.70261 38.80171    CO Colarado Springs Albuquerque
5  -97.60056 35.39305    OK    Oklahoma City Albuquerque
3  -84.42770 33.64073    GA          Atlanta Albuquerque
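For the larger case mentioned in the question (500k rows against about 100 locations), a full square distance matrix may not fit in memory. The following is a minimal base-R sketch, not the answer's method: it swaps `st_distance` for the haversine formula on a spherical Earth (assumed mean radius 6371 km), so distances may differ slightly from what sf reports. The helpers `haversine_km` and `nearest_k`, and the `cities` data frame with numeric coordinates, are hypothetical names introduced here. It loops over the target locations and holds only one distance vector at a time.

```r
# Great-circle distance in km via the haversine formula.
# Assumption: spherical Earth with mean radius 6371 km (an approximation).
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# For each row of `targets`, return the k nearest rows of `points`.
# Hypothetical helper: both inputs need numeric `long`/`lat` and a `city` column.
nearest_k <- function(targets, points, k = 3) {
  out <- lapply(seq_len(nrow(targets)), function(i) {
    d <- haversine_km(targets$long[i], targets$lat[i], points$long, points$lat)
    # Skip the target itself (assumes city names are unique)
    d[points$city == targets$city[i]] <- Inf
    keep <- order(d)[seq_len(min(k, length(d)))]
    data.frame(TargetCity = targets$city[i],
               Near_City = points$city[keep],
               dist_km = round(d[keep], 1),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}

# The example cities from the question, with coordinates stored as numbers
cities <- data.frame(
  long = c(-106.61291, -81.97224, -84.42770, -72.68604, -97.60056, -104.70261),
  lat  = c(35.04333, 33.37378, 33.64073, 41.93887, 35.39305, 38.80171),
  city = c("Albuquerque", "Augusta", "Atlanta", "Windsor Locks",
           "Oklahoma City", "Colarado Springs"),
  stringsAsFactors = FALSE
)

# Three nearest cities to Albuquerque
res <- nearest_k(cities[cities$city == "Albuquerque", ], cities, k = 3)
print(res)
```

Passing all ~100 locations as `targets` and the 500k-row table as `points` would return three rows per location. If city names repeat across states, the self-match filter would need a stricter key, such as city plus state.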

Regarding "r - Finding the nearest cities to a specific location from a data frame", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/45576214/
