R: K-means clustering vs. community detection algorithms (weighted correlation network) - am I overcomplicating this problem?

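I have data that looks like this: https://imgur.com/a/1hOsFpF
The first data set is in a standard tabular format: a list of people and their financial attributes.
The second data set contains the "relationships" between these people: how much they have paid each other and how much they owe each other.
I'm interested in learning more about networks and graph-based clustering, but I'm trying to better understand which kinds of situations call for network-based clustering. In other words, I don't want to use graph clustering where it isn't needed (avoiding a "square peg, round hole" situation).
Using R, I first created some fake data: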

library(corrr)
library(dplyr)
library(igraph)
library(visNetwork)
library(stats)

# create first data set

Personal_Information <- data.frame(
  "name" = c("John", "Jack", "Jason", "Jim", "Julian", "Jack", "Jake", "Joseph"),
  "age" = c("41", "33", "24", "66", "21", "66", "29", "50"),
  "salary" = c("50000", "20000", "18000", "66000", "77000", "0", "55000", "40000"),
  "debt" = c("10000", "5000", "4000", "0", "20000", "5000", "0", "1000")
)

# the numeric columns were entered as strings, so convert them below


Personal_Information$age = as.numeric(Personal_Information$age)
Personal_Information$salary = as.numeric(Personal_Information$salary)
Personal_Information$debt = as.numeric(Personal_Information$debt)

# create second data set
Relationship_Information <-data.frame(

"name_a" = c("John","John","John","Jack","Jack","Jack","Jason","Jason","Jim","Jim","Jim","Julian","Jake","Joseph","Joseph"),
"name_b" = c("Jack", "Jason", "Joseph", "John", "Julian","Jim","Jim", "Joseph", "Jack", "Julian", "John", "Joseph", "John", "Jim", "John"),
"how_much_they_owe_each_other" = c("10000","20000","60000","10000","40000","8000","0","50000","6000","2000","10000","10000","50000","12000","0"),
"how_much_they_paid_each_other" = c("5000","40000","120000","20000","20000","8000","0","20000","12000","0","0","0","50000","0","0")
)

Relationship_Information$how_much_they_owe_each_other = as.numeric(Relationship_Information$how_much_they_owe_each_other)
Relationship_Information$how_much_they_paid_each_other = as.numeric(Relationship_Information$how_much_they_paid_each_other)

Then I ran a standard k-means clustering algorithm (on the first data set) and plotted the results:

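# Method 1: simple k-means analysis with 2 clusters on the Personal_Information data set
set.seed(123)  # k-means starts from random centers, so fix the seed for repeatable results
cl <- kmeans(Personal_Information[, c(2:4)], 2)
# plot two of the clustered variables so the cluster centers land on the same axes
plot(Personal_Information[, c("salary", "debt")], col = cl$cluster)
points(cl$centers[, c("salary", "debt")], col = 1:2, pch = 8, cex = 2)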
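
A side note on the k-means step: k-means works on Euclidean distances, so columns with large values (salary, debt) dominate columns with small values (age). The following is only a sketch of one common way to handle that, standardizing the columns first. The names `people`, `people_scaled` and `cl_scaled` are hypothetical, and the values are copied from the fake data above so the snippet runs on its own.

```r
# Same numeric values as Personal_Information, re-created so the sketch is self-contained
people <- data.frame(
  age    = c(41, 33, 24, 66, 21, 66, 29, 50),
  salary = c(50000, 20000, 18000, 66000, 77000, 0, 55000, 40000),
  debt   = c(10000, 5000, 4000, 0, 20000, 5000, 0, 1000)
)

# Standardize each column to mean 0 and standard deviation 1
people_scaled <- scale(people)

set.seed(42)  # k-means uses random starting centers
# nstart = 25 tries 25 random starts and keeps the best solution
cl_scaled <- kmeans(people_scaled, centers = 2, nstart = 25)

# Cluster label for each person, and the size of each cluster
cl_scaled$cluster
table(cl_scaled$cluster)
```

Whether scaling is appropriate depends on whether age should count as much as salary and debt; that is a modelling choice, not something the data decides.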

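This is how I would normally approach the problem. Now I want to see whether I can use graph clustering for this kind of problem.
First, I created a weighted correlation network from the first data set, following this tutorial: http://www.sthda.com/english/articles/33-social-network-analysis/136-network-analysis-and-manipulation-using-r/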

res.cor <- Personal_Information[, c(2:4)] %>%
  t() %>% correlate() %>%
  shave(upper = TRUE) %>%
  stretch(na.rm = TRUE) %>%
  filter(r >= 0.8)

graph <- graph.data.frame(res.cor, directed=F)
graph <- simplify(graph)
plot(graph)

Then I ran the graph clustering algorithm:

# run graph clustering (also called community detection) on the correlation network
fc <- fastgreedy.community(graph)
V(graph)$community <- fc$membership
nodes <- data.frame(id = V(graph)$name, title = V(graph)$name, group = V(graph)$community)
nodes <- nodes[order(nodes$id, decreasing = FALSE), ]
edges <- get.data.frame(graph, what = "edges")[1:2]

visNetwork(nodes, edges) %>%
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)
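
One detail worth checking in the graph code: the correlation column in `res.cor` is called `r`, and `graph.data.frame` stores extra columns as edge attributes under their own names. By default igraph's community functions look for an edge attribute called `weight`, so the clustering above most likely treats every edge as equal. Below is a minimal sketch of passing the correlation in as a weight. The edge list `edges_cor` is made-up illustration data in the same shape as `res.cor`, and `cluster_fast_greedy` is the current name of `fastgreedy.community`.

```r
library(igraph)

# Hypothetical edge list with the same columns as res.cor (x, y, r)
edges_cor <- data.frame(
  x = c("1", "1", "2", "3", "4"),
  y = c("2", "3", "3", "4", "5"),
  r = c(0.95, 0.90, 0.85, 0.99, 0.81)
)

g <- graph_from_data_frame(edges_cor, directed = FALSE)

# Copy the correlation into the attribute igraph treats as the edge weight
E(g)$weight <- E(g)$r

# With a "weight" attribute present, it is used automatically
fc_w <- cluster_fast_greedy(g)
membership(fc_w)   # community label per node
modularity(fc_w)   # quality score of the partition
```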

This seems to work, but I'm not sure whether it is the best way to approach the problem.
Can anyone offer some advice? Am I overcomplicating this?
Thanks

Best answer

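You may be interested in reading "fusion-based methods for community detection" (https://link.springer.com/chapter/10.1007/978-3-030-44584-3_24). These fusion-based methods appear to be designed specifically to take node attributes into account.
This may also help: https://www.nature.com/articles/srep30750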
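
As a simpler baseline before trying attribute-aware methods, the relationship table from the question can be read as a weighted edge list and clustered directly. This is only a sketch under several assumptions: payments are treated as undirected ties, the amount paid is used as the edge weight, pairs with no payment are dropped, and names are assumed to identify people uniquely (which does not quite hold, since "Jack" appears twice in the first data set). The names `payments`, `g_pay` and `comm` are hypothetical.

```r
library(igraph)

# Amounts paid, copied from Relationship_Information in the question
payments <- data.frame(
  from = c("John", "John", "John", "Jack", "Jack", "Jack", "Jason", "Jason",
           "Jim", "Jim", "Jim", "Julian", "Jake", "Joseph", "Joseph"),
  to   = c("Jack", "Jason", "Joseph", "John", "Julian", "Jim", "Jim", "Joseph",
           "Jack", "Julian", "John", "Joseph", "John", "Jim", "John"),
  weight = c(5000, 40000, 120000, 20000, 20000, 8000, 0, 20000,
             12000, 0, 0, 0, 50000, 0, 0)
)

# Keep only pairs where money actually changed hands
payments <- payments[payments$weight > 0, ]

g_pay <- graph_from_data_frame(payments, directed = FALSE)

# John-Jack and Jack-John become parallel edges; merge them and sum the amounts
g_pay <- simplify(g_pay, edge.attr.comb = list(weight = "sum"))

comm <- cluster_fast_greedy(g_pay)
membership(comm)
```

If the resulting communities add little beyond what k-means on the attribute table already shows, that is a hint that the graph approach may not be needed for this data.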

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/64849921/
