Hello, I have a dataset and I am trying to assign a group cluster ID based on a 50-mile radius. This is the structure of the dataset:
g_lat<- c(45.52306, 40.26719, 34.05223, 37.38605, 37.77493)
g_long<- c(-122.67648,-86.13490, -118.24368, -122.08385, -122.41942)
df<- data.frame(g_lat, g_long)
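I want to create a group cluster ID that groups together locations within a 50-mile radius of each other. How can I achieve this? Many thanks. The expected output is below.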
g_lat g_long clusterid
45.52306 -122.67648 1
40.26719 -86.13490 2
34.05223 -118.24368 3
37.38605 -122.08385 4
37.77493 -122.41942 4
Best answer
g_lat<- c(45.52306, 40.26719, 34.05223, 37.38605, 37.77493)
g_long<- c(-122.67648,-86.13490, -118.24368, -122.08385, -122.41942)
df<- data.frame(point = c(1:5), longitude = g_long, latitude = g_lat)
library(sf)
my.sf.point <- st_as_sf(x = df,
coords = c("longitude", "latitude"),
crs = "+proj=longlat +datum=WGS84")
# distance matrix in meters (sf reports geographic distances in meters)
st_distance(my.sf.point)
# which points are within 50 miles (~80467.2 meters) of each other
l <- st_is_within_distance(my.sf.point, dist = 80467.2 )
l
# Sparse geometry binary predicate list of length 5, where the predicate was `is_within_distance'
# 1: 1
# 2: 2
# 3: 3
# 4: 4, 5
# 5: 4, 5
df$within_50 <- rowSums(as.matrix(l))-1
df
# point longitude latitude within_50
# 1 1 -122.6765 45.52306 0
# 2 2 -86.1349 40.26719 0
# 3 3 -118.2437 34.05223 0
# 4 4 -122.0838 37.38605 1
# 5 5 -122.4194 37.77493 1
m <- as.matrix(l)
colnames(m) <- c(1:nrow(df))
rownames(m) <- c(1:nrow(df))
df$points_within_50 <- apply( m, 1, function(u) paste( names(which(u)), collapse="," ) )
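# group rows that share the same set of neighbours; grouping first avoids the deprecated `...` form of group_indices()
df$clusterid <- dplyr::group_indices(dplyr::group_by(df, points_within_50))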
# point longitude latitude within_50 points_within_50 clusterid
# 1 1 -122.6765 45.52306 0 1 1
# 2 2 -86.1349 40.26719 0 2 2
# 3 3 -118.2437 34.05223 0 3 3
# 4 4 -122.0838 37.38605 1 4,5 4
# 5 5 -122.4194 37.77493 1 4,5 4
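One caveat with the approach above: it labels points by their exact set of neighbours, so a chain such as A near B and B near C, with A and C more than 50 miles apart, would get three different labels. If transitive grouping is what you want, one option is to treat the "within 50 miles" relation as a graph and label its connected components. The following is only a sketch under that assumption; the igraph package is not part of the original answer, and the names `pts`, `radius_m`, `nb`, `adj` and `g` are made up for illustration.

```r
library(sf)
library(igraph)  # assumption: not used in the original answer

g_lat  <- c(45.52306, 40.26719, 34.05223, 37.38605, 37.77493)
g_long <- c(-122.67648, -86.13490, -118.24368, -122.08385, -122.41942)
df <- data.frame(point = 1:5, longitude = g_long, latitude = g_lat)

pts <- st_as_sf(df, coords = c("longitude", "latitude"),
                crs = "+proj=longlat +datum=WGS84")

# 50 miles expressed in meters (1 mile = 1609.344 m)
radius_m <- 50 * 1609.344

# for each point, the indices of all points within the radius (including itself)
nb <- st_is_within_distance(pts, dist = radius_m)

# logical neighbour matrix -> numeric adjacency matrix
adj <- as.matrix(nb) * 1

# undirected graph without self-loops; each connected component is one cluster
g <- graph_from_adjacency_matrix(adj, mode = "undirected", diag = FALSE)
df$clusterid <- as.integer(components(g)$membership)

df
```

For the five sample points this puts points 4 and 5 in one cluster and leaves the other three on their own, matching the expected output in the question. The difference from the neighbour-set approach only shows up when clusters are chained.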
Regarding r - grouping locations based on a 50-mile radius, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52356653/