r - 从 R 中的 alphahull 中提取多个多边形

标签 r ggplot2 geospatial igraph

我正在使用 alphahull 创建 map 边界,结果有时是离散的船体(这很好)。 。下面示例中的三个漂亮的集群。我可以使用 igraph 获取离散簇的数量,但我想关闭多边形,并且没有看到将点分配给正确簇的简单方法。我缺少什么?最终我想将对象作为多边形传递给 ggplot。

library(alphahull)
library(ggplot2)
library(igraph)
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union
cluster_one.df <- data.frame("lon"=rnorm(20,135,0.2),"lat"=rnorm(20,35,0.2)) 
cluster_two.df <- data.frame("lon"=rnorm(20,130,0.2),"lat"=rnorm(20,30,0.2))
cluster_three.df <- data.frame("lon"=rnorm(20,125,0.2),"lat"=rnorm(20,25,0.2))
points.df <- rbind(cluster_one.df,cluster_two.df,cluster_three.df)
plot(points.df)


alpha_obj <- ashape(points.df, alpha=0.5)
find_no <- alpha_obj$edges
plot(alpha_obj)

Alphahull 有三个清晰的簇。 。 .


network = graph.edgelist(cbind(as.character(alpha_obj$edges[, "ind1"]), as.character(alpha_obj$edges[, "ind2"])), directed = FALSE)
plot(network)

clusters(network)$no
#> [1] 3
is.connected(network)
#> [1] FALSE

ggplot(points.df, aes(x = lon, y = lat)) +
  geom_point() +
  geom_segment(data=data.frame(alpha_obj$edges), aes(x = x1, y = y1, xend = x2, yend = y2), color="red") 

igraph 报告了正确的簇数,并且 ggplot 中的线段是正确的,但是如何将线段连接成三个多边形?

reprex package (v0.3.0) 于 2020 年 5 月 23 日创建


更新:继续使用 igraph,我能够按组对数据进行子集

alpha.df <- data.frame(alpha_obj$edges)
V(network)$comp <- components(network, "strong")$membership
group_1 <- induced_subgraph(network, V(network)$comp==1)

借用 https://rpubs.com/geospacedman/alphasimple ,这得到了点的顺序

cutg <- group_1 - E(group_1)[1] ## break loop
ends <- names(which(degree(cutg) == 1))
path <-  get.shortest.paths(cutg, ends[1], ends[2])[[1]]

names(path[[1]]) # this gets the sequence for the first group

但是,我仍然需要为正确的多边形排序线段......这似乎会导致循环。

下面 Allan Cameron 的解决方案效果很好,但当 alpha 很小(例如 0.1)时,会在这个现实世界数据集上引发此错误。然后两个 alpha 形状似乎共享一条线!

enter image description here

Error in m[, j] == m[i, j] : non-conformable arrays In addition: Warning messages: 1: In if (j == 1) xs[i] <- edge.df$x1[i] else xs[i] <- edge.df$x2[i] : the condition has length > 1 and only the first element will be used 2: In if (j == 1) ys[i] <- edge.df$y1[i] else ys[i] <- edge.df$y2[i] : the condition has length > 1 and only the first element will be used

points.df <- 
structure(list(lon = c(132.6654637, 132.628814738644, 132.539254, 
132.5553958, 132.539302181878, 132.7950567, 132.641769, 132.7684343, 
132.7888039, 132.6647853, 132.8347288, 132.8278186, 132.8982418, 
132.909312197459, 132.8625508, 132.82386, 132.809095, 132.7759964, 
132.7756813, 132.7870982, 132.7851847, 132.7681695, 132.8032465, 
132.8160503, 132.6358063, 132.635823981036, 132.5445085, 132.6403534, 
132.5769612, 132.609291, 132.6340415, 132.6381995, 132.6648159, 
132.7101352, 132.7194943, 132.7876312, 132.7825205, 132.7585091, 
132.7559927, 132.7685386, 132.7875584, 132.7872259, 132.7874004, 
132.7902362, 132.8010577, 132.7922278, 132.7861715, 132.768931, 
132.7619977, 132.7513127, 132.7391948, 132.790737970094, 132.699771, 
132.6791708, 132.6854316, 132.6888423, 132.6958107, 132.7075791, 
132.7359368, 132.7008823, 132.7094064, 132.7131504, 132.708409, 
132.7251742, 132.7357202, 132.742711, 132.7600732, 132.7732886, 
132.727009, 132.5486109, 132.5470441, 132.5480779, 132.5586113, 
132.541613, 132.516896, 132.5240977, 132.529158, 132.576234, 
132.5889441, 132.6164585, 132.6411233, 132.6148395, 132.7246135, 
132.6422625, 132.7226471, 132.7363183, 132.7421675, 132.691419, 
132.6641904, 132.665175, 132.661818, 132.6559487, 132.6567813, 
132.6579793, 132.7164382, 132.6956271, 132.6983378, 132.6573281, 
132.6543369, 132.6392342, 132.648159, 132.6758976, 132.6813358, 
132.6640595, 132.7096546, 132.587736, 132.5992522, 132.5865522, 
132.5473471, 132.5728439, 132.5487525, 132.5445001, 132.547059761893, 
132.5538985, 132.556837, 132.5422245, 132.5413181, 132.5451487, 
132.5387885, 132.5249464, 132.513415, 132.4995801, 132.4930108, 
132.5204733, 132.4952279, 132.4331523, 132.4692833, 132.4746544, 
132.5247713, 132.5374035, 132.5444085, 132.5947945, 132.6278865, 
132.7003073, 132.724446383125, 132.7509583, 132.5697286, 132.5031789, 
132.5261544, 132.433172666015, 132.7995, 132.6734, 132.9806, 
132.5196, 132.5454, 132.7716, 132.7272, 132.6548, 132.654852573297, 
132.7008, 132.738540411042), lat = c(33.9564431, 33.9754360356657, 
33.933059, 33.9809751, 33.9329099108545, 33.5110363, 33.378098, 
33.5268887, 33.5509639, 33.5365971, 33.575854, 33.5742345, 33.6157401, 
33.6432153491279, 33.6330822, 33.622899, 33.6338496, 33.7256174, 
33.6549261, 33.6135064, 33.5958084, 33.5802348, 33.5830577, 33.7204944, 
33.6225688, 33.6224810289754, 33.5061936, 33.6027543, 33.651187, 
33.667152, 33.6850872, 33.6879397, 33.70622, 33.6404735, 33.6829081, 
33.7565837, 33.750887, 33.7458008, 33.7331328, 33.7286281, 33.7365333, 
33.7071535, 33.7465765, 33.7467099, 33.7060568, 33.7492625, 33.6732325, 
33.692618, 33.7126021, 33.7205759, 33.7217591, 33.7759301629704, 
33.774027, 33.7372899, 33.7263687, 33.7433889, 33.7534323, 33.7168509, 
33.7228451, 33.754549, 33.7582016, 33.7683532, 33.7761549, 33.7663288, 
33.7690062, 33.7768026, 33.7815469, 33.7863218, 33.801855, 33.5033857, 
33.5442034, 33.4938868, 33.4587104, 33.4522772, 33.4445049, 33.4605537, 
33.4735192, 33.4983482, 33.5126441, 33.5051541, 33.484071, 33.4607175, 
33.5217105, 33.461865, 33.4978878, 33.4847207, 33.5018607, 33.502981, 
33.4882015, 33.521964, 33.511535, 33.5266055, 33.5497139, 33.5328971, 
33.5744565, 33.5492812, 33.5675655, 33.561057, 33.5538189, 33.5993602, 
33.5783269, 33.5976171, 33.6069939, 33.6064045, 33.6441148, 33.6010336, 
33.5800368, 33.523926, 33.516037, 33.533482, 33.5201949, 33.5251998, 
33.5440333757024, 33.5622518, 33.5667931, 33.5634791, 33.593845, 
33.5816383, 33.5713532, 33.579981, 33.5975215, 33.5963463, 33.6126805, 
33.5777769, 33.5926493, 33.5579864, 33.5623084, 33.5889524, 33.5128369, 
33.5181171, 33.5069546, 33.573771, 33.5544736, 33.5249674, 33.5215648578441, 
33.5097935, 33.5181873, 33.5354703, 33.6342814, 33.5579579304105, 
33.64085, 33.53592, 33.62881, 33.44105, 33.46809, 33.49512, 33.57256, 
33.53509, 33.5351158000868, 33.52385, 33.48571)), row.names = c("36855", 
"36856", "36857", "36858", "36859", "36996", "36997", "36998", 
"36999", "37000", "37001", "37002", "37003", "37004", "37005", 
"37006", "37007", "37008", "37009", "37010", "37011", "37012", 
"37013", "37014", "37015", "37016", "37017", "37018", "37019", 
"37020", "37021", "37022", "37023", "37024", "37025", "37026", 
"37027", "37028", "37029", "37030", "37031", "37032", "37033", 
"37034", "37035", "37036", "37037", "37038", "37039", "37040", 
"37041", "37042", "37043", "37044", "37045", "37046", "37047", 
"37048", "37049", "37050", "37051", "37052", "37053", "37054", 
"37055", "37056", "37057", "37058", "37059", "37077", "37078", 
"37079", "37080", "37081", "37082", "37083", "37084", "37085", 
"37086", "37087", "37088", "37089", "37090", "37091", "37092", 
"37093", "37094", "37095", "37096", "37097", "37098", "37099", 
"37100", "37101", "37102", "37103", "37104", "37105", "37106", 
"37107", "37108", "37109", "37110", "37111", "37112", "37113", 
"37114", "37115", "37116", "37117", "37118", "37119", "37120", 
"37121", "37122", "37123", "37124", "37125", "37126", "37127", 
"37128", "37129", "37130", "37131", "37132", "37133", "37134", 
"37135", "37136", "37137", "37138", "37142", "37143", "37144", 
"37145", "37146", "37147", "37148", "37149", "37150", "88530", 
"88537", "94189", "94192", "94193", "94194", "94195", "94196", 
"94197", "94198", "94199"), class = "data.frame")

最佳答案

如果有一种简单的方法可以做到这一点,我没有找到。但有一个困难的方法:find_no 数据帧的每一行都包含边缘的起始和结束坐标。因此,您从第一行开始,找到与它共享坐标的其他行。将第一行标记为多边形 1 的第一个点,然后移动到与其共享一个点的行。您重复此过程,直到回到开始的边缘,并且您已经标记了第一个多边形。现在从第一个可用的未标记行开始重复。继续此过程,直到没有未标记的行为止。

这是一个可以完成所有这些操作的函数:

extract_polygons <- function(alpha_obj)
{
  if(class(alpha_obj) != "ashape") stop("extract_polygons requires an ashape")
  edge.df <- as.data.frame(alpha_obj$edges)
  groups <- ns <- xs <- ys <- numeric(nrow(edge.df))
  m <- cbind(edge.df[[1]], edge.df[[2]])
  group <- 1
  repeat {
    i <- which(groups == 0)[1]
    if (length(i) == 0 | is.na(i)) break()
    j <- n <- 1
    repeat {
      groups[i] <- group
      ns[i] <- n
      if(j == 1) xs[i] <- edge.df$x1[i] else xs[i] <- edge.df$x2[i]
      if(j == 1) ys[i] <- edge.df$y1[i] else ys[i] <- edge.df$y2[i]
      next_ind <- which((m[, j] == m[i, j] | m[, (j %% 2 + 1)] == m[i, j]) & groups == 0)
      if (length(next_ind) == 0) break()
      j <- which(m[next_ind,] == m[i, j]) %% 2 + 1
      i <- next_ind
      n <- n + 1
    }
    group <- group + 1
  }
  data.frame(x = xs, y = ys, group = as.factor(groups))[order(groups, ns), ]
}

因此我们可以从“ashape”对象中提取多边形并在绘图中使用它们。

polygon.df <- extract_polygons(alpha_obj)

ggplot(points.df, aes(lon, lat)) + 
  geom_point() + 
  geom_polygon(data = polygon.df, aes(x, y, fill = group), colour = "black", alpha = 0.5)

enter image description here


更新

在OP给出的数据集上,我使用完全相同的代码得到以下结果,没有错误:

enter image description here

关于r - 从 R 中的 alphahull 中提取多个多边形,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61973754/

相关文章:

r - 计算自上次事件以来耗时

根据 ggplot 中的变量重新排列 x 轴

r - 如何反转ggplot2中y轴值的顺序

r - 使 ggplot 的 strip 文本延伸到轴文本上

引用ggplot函数的 'data'参数中传入的数据框的一个变量

solr - 无法对 Solr 地理空间搜索结果进行排序

MongoDB/PyMongo 地理空间查询 : distance of documents from a point

r - 加快合并 R 中的许多数据帧

使用 dplyr 的 transmute_all 将列值替换为列名称

geometry - 机器学习路径的良好特征