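I am trying to compute a particular sum in R as fast as possible. The object of interest is the quantity computed by the reference loop below:
$$\sum_{l=1}^{L}\left[\sum_{k=1}^{K}\sum_{j=0}^{x_{lk}-1}\log(\alpha_{lk}+j)\;-\;\sum_{j=0}^{N_l-1}\log(A_l+j)\right]$$
The relevant input objects are two L-by-K matrices, x (containing only positive integers) and alpha (containing only positive real values). A is equal to rowSums(alpha), and N is equal to rowSums(x). The subscripts l and k denote the rows and columns, respectively, of alpha or x.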
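At first I thought it would be easy to come up with something super fast, but I could not find an elegant solution. I think a matrix-valued version of seq() would be very helpful here. Does anyone have a creative way to implement this efficiently?
Here is an easy-to-read but clearly inefficient loop-based version for reference: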
# parameters
L = 20
K = 5
# x ... L x K matrix of integers
x = matrix(1 : (L * K), L, K)
# alpha ... L x K matrix of positive real numbers
alpha = matrix(1 : (L * K) / 100, L, K)
# N ... sum over rows of x
N = rowSums(x)
# A ... sum over rows of alpha
A = rowSums(alpha)
# implementation
stacksum = function(x, alpha, N, A){
  # parameters
  K = ncol(x)
  L = nrow(x)
  result = 0
  for(ll in 1:L){
    # first part of sum
    first.sum = 0
    for(kk in 1:K){
      # create sequence
      sequence.k = seq(alpha[ll, kk], (alpha[ll, kk] + x[ll, kk] - 1), 1)
      # take logs and sum
      first.sum = first.sum + sum(log(sequence.k))
    }
    # second part of sum
    second.sum = sum(log(seq(A[ll], (A[ll] + N[ll] - 1), 1)))
    # add to result
    result = result + first.sum - second.sum
  }
  return(result)
}
# test
stacksum(x, alpha, N, A)
Best answer
Updated with an lgamma solution based on @RobertDodier's comment.
The original approach uses sequence and rep.int.
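The lgamma variant in the code below rests on a standard property of the gamma function, written out here as a short derivation (the symbols a and n are placeholders introduced only for this note). For a > 0 and an integer n >= 1, repeated use of Γ(z + 1) = z Γ(z) gives:

```latex
\Gamma(a+n) = (a+n-1)(a+n-2)\cdots(a+1)\,a\,\Gamma(a)
\quad\Longrightarrow\quad
\sum_{j=0}^{n-1}\log(a+j) = \log\Gamma(a+n) - \log\Gamma(a)
```

Taking a = alpha[l, k] with n = x[l, k] for the first part of the sum, and a = A[l] with n = N[l] for the second part, turns every inner sequence sum into a difference of two lgamma calls.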
# parameters
L <- 20
K <- 5
# x ... L x K matrix of integers
x <- matrix(1 : (L * K), L, K)
# alpha ... L x K matrix of positive real numbers
alpha <- matrix(1 : (L * K) / 100, L, K)
# N ... sum over rows of x
N <- rowSums(x)
# A ... sum over rows of alpha
A <- rowSums(alpha)
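# proposed solution: sequence() coerces its `from` argument to integer, so
# rep.int() adds each fractional part (alpha %% 1) back once per term
stacksum2 <- function(x, alpha, N, A) {
  sum(log(sequence(x, alpha) + rep.int(alpha %% 1, x))) -
    sum(log(sequence(N, A) + rep.int(A %% 1, N)))
}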
# solution from Robert Dodier's comments
stacksum3 <- function(x, alpha, N, A) {
  sum(lgamma(alpha + x) - lgamma(alpha)) - sum(lgamma(A + N) - lgamma(A))
}
# OP solution
stacksum1 = function(x, alpha, N, A){
  # parameters
  K = ncol(x)
  L = nrow(x)
  result = 0
  for(ll in 1:L){
    # first part of sum
    first.sum = 0
    for(kk in 1:K){
      # create sequence
      sequence.k = seq(alpha[ll, kk], (alpha[ll, kk] + x[ll, kk] - 1), 1)
      # take logs and sum
      first.sum = first.sum + sum(log(sequence.k))
    }
    # second part of sum
    second.sum = sum(log(seq(A[ll], (A[ll] + N[ll] - 1), 1)))
    # add to result
    result = result + first.sum - second.sum
  }
  result
}
res <- list(
  stacksum1(x, alpha, N, A),
  stacksum2(x, alpha, N, A),
  stacksum3(x, alpha, N, A)
)
all.equal(res[1:2], res[-1])
#> [1] TRUE
microbenchmark::microbenchmark(stacksum1 = stacksum1(x, alpha, N, A),
                               stacksum2 = stacksum2(x, alpha, N, A),
                               stacksum3 = stacksum3(x, alpha, N, A),
                               check = "equal")
#> Unit: microseconds
#>       expr    min      lq     mean  median      uq    max neval
#>  stacksum1 1654.2 1704.60 1899.384 1740.80 1964.75 4234.4   100
#>  stacksum2  238.2  246.45  258.284  252.35  268.40  319.4   100
#>  stacksum3   18.5   19.05   20.981   20.55   21.70   36.4   100
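To see why stacksum2 and stacksum3 agree, it may help to trace both on a tiny input. The sketch below is illustrative only: the toy values and the names int_part, frac_part, terms, direct, and via_lgamma are not from the original answer, and it assumes R >= 4.0.0, where sequence() accepts a from argument.

```r
# Toy input: two entries, chosen only for illustration
alpha <- c(2.7, 0.4)
x     <- c(3L, 2L)

# sequence() coerces `from` to integer, so the runs start at 2 and 0
int_part <- sequence(x, alpha)

# rep.int() repeats each fractional part once per term of its run
frac_part <- rep.int(alpha %% 1, x)

# Together they give alpha[i], alpha[i] + 1, ..., alpha[i] + x[i] - 1
terms <- int_part + frac_part

# Summing the logs directly matches the lgamma difference
direct     <- sum(log(terms))
via_lgamma <- sum(lgamma(alpha + x) - lgamma(alpha))
stopifnot(isTRUE(all.equal(direct, via_lgamma)))
```

The sequence-based version still materialises one vector element per term, so its cost grows with sum(x), whereas the lgamma version does a fixed amount of work per matrix entry. That is consistent with the gap between stacksum2 and stacksum3 in the timings above.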
Source: Stack Overflow question "r - Efficiently compute sum based on sequences in R": https://stackoverflow.com/questions/73226191/