I have a distance matrix:
> mat
hydrogen helium lithium beryllium boron
hydrogen 0.000000 2.065564 3.940308 2.647510 2.671674
helium 2.065564 0.000000 2.365661 1.697749 1.319400
lithium 3.940308 2.365661 0.000000 3.188148 2.411567
beryllium 2.647510 1.697749 3.188148 0.000000 2.499369
boron 2.671674 1.319400 2.411567 2.499369 0.000000
and a data frame:
> results
El1 El2 Score
Helium Hydrogen 92
Boron Helium 61
Boron Lithium 88
For each row, I want to compute the pairwise distance between the words in results$El1 and results$El2, to get the following:
> results
El1 El2 Score Dist
Helium Hydrogen 92 2.065564
Boron Helium 61 1.319400
Boron Lithium 88 2.411567
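I did this with a for loop, but it looks clunky. Is there a more elegant way to look up and extract the distances in fewer lines of code?
Here is my current code: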
names = row.names(mat)
num.results <- dim(results)[1]
El1 = match(results$El1, names)
El2 = match(results$El2, names)
el.dist <- matrix(0, num.results, 1)
for (i1 in c(1:num.results)) {
el.dist[i1, 1] <- mat[El1[i1], El2[i1]]
}
results$Dist = el.dist[,1]
Best answer
cols <- match(tolower(results$El1), colnames(mat))
rows <- match(tolower(results$El2), colnames(mat))
results$Dist <- mat[cbind(rows, cols)]
results
El1 El2 Score Dist
1 Helium Hydrogen 92 2.065564
2 Boron Helium 61 1.319400
3 Boron Lithium 88 2.411567
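You will recognize most of the code. The part to focus on is mat[cbind(rows, cols)]. A matrix can be subset by another matrix that has as many columns as the first has dimensions, so each row of cbind(rows, cols) picks out exactly one element. Because the distance matrix is symmetric, it does not matter which of El1 and El2 supplies the row index. From the ?`[` help page:
When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x; the result is then a vector with elements corresponding to the sets of indices in each row of i.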
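To make the indexing rule concrete, here is a minimal self-contained sketch. The toy matrix, its values, and the names elems, idx, dist_num and dist_chr are made up for illustration and are not from the question. It also shows a variant that, as far as the same help page describes, indexes by a character matrix matched against the dimnames, which avoids the explicit match() calls.

```r
# Toy symmetric distance matrix with lowercase dimnames (values are made up).
elems <- c("hydrogen", "helium", "lithium")
mat <- matrix(c(0, 2, 4,
                2, 0, 3,
                4, 3, 0),
              nrow = 3, dimnames = list(elems, elems))

# Pairs to look up; the names are capitalized, unlike the dimnames.
results <- data.frame(El1 = c("Helium", "Lithium"),
                      El2 = c("Hydrogen", "Helium"),
                      stringsAsFactors = FALSE)

# Two-column numeric index matrix: each row is one (row, column) pair.
idx <- cbind(match(tolower(results$El2), rownames(mat)),
             match(tolower(results$El1), colnames(mat)))
dist_num <- mat[idx]

# Two-column character index matrix: entries are matched against the dimnames.
dist_chr <- mat[cbind(tolower(results$El2), tolower(results$El1))]

results$Dist <- dist_num
```

The two forms should differ in how they treat a name that is missing from the matrix: match() returns NA for it and the numeric form then yields NA for that row, whereas the character form is documented to raise an error for unmatched names. Which behavior is preferable depends on whether a missing element should be tolerated or flagged.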
Regarding r - efficiently accessing pairwise distances, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32062408/