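Can someone explain the strange behaviour of a simple for loop written with Rcpp (code below)? According to the microbenchmark output, for_iteration appears to run in constant time, which is not what its code suggests. For comparison, I tested the function for_double_iteration, whose behaviour is consistent with the complexity of its code. The code was run on Ubuntu 16.04 with an Intel Core i3-6100 CPU, and I got the same results on Windows 7 with an Intel Core i5-2300 CPU.
The code is as follows: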
library(Rcpp)
library(microbenchmark)
sourceCpp(code='
#include <Rcpp.h>
// [[Rcpp::export]]
int for_iteration(const int n) {
  int j = 0;
  for (int i = 0; i < n; i++) {
    j++;
  }
  return (j);
}
// [[Rcpp::export]]
int for_double_iteration(const int n) {
  int j = 0, k = 0;
  for (int i = 0; i < n; i++) {
    for (k = 0; k < i; k++) {
      j++;
    }
  }
  return (j);
}'
)
Here are the benchmarks:
> microbenchmark(for_iteration(10^5),
for_iteration(10^6),
for_iteration(10^7),
for_iteration(10^8),
for_iteration(10^9),
times=1000, unit="us")
Unit: microseconds
expr min lq mean median uq max neval
for_iteration(10^5) 1.254 1.379 1.552229 1.4305 1.5255 9.197 1000
for_iteration(10^6) 1.268 1.379 1.724993 1.4300 1.5410 123.822 1000
for_iteration(10^7) 1.274 1.377 3.687182 1.4240 1.5075 2126.909 1000
for_iteration(10^8) 1.253 1.387 1.546345 1.4360 1.5320 9.527 1000
for_iteration(10^9) 1.265 1.386 1.568382 1.4300 1.5230 20.307 1000
> microbenchmark(for_double_iteration(10^2),
for_double_iteration(10^3),
for_double_iteration(10^4),
for_double_iteration(10^5),
for_double_iteration(10^6),
times=1000, unit="us")
Unit: microseconds
expr min lq mean median uq max neval
for_double_iteration(10^2) 0.921 1.0230 1.516304 1.1100 1.4335 23.308 1000
for_double_iteration(10^3) 1.722 1.7970 2.491999 1.8915 2.2270 49.772 1000
for_double_iteration(10^4) 9.022 9.1165 9.947209 9.2050 9.6925 55.841 1000
for_double_iteration(10^5) 82.170 82.2700 86.240153 82.3590 82.9070 1959.903 1000
for_double_iteration(10^6) 813.723 814.4450 834.870686 826.4625 828.2280 1178.062 1000
Best answer
My guess was correct: the loop is optimized away and replaced with j = n.
Looking at godbolt with the compiler flag -O2, you get this assembly output (gas):
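To make that claim concrete, here is a minimal sketch in plain C++ (no Rcpp wrapper). The name for_iteration_closed_form is hypothetical and not part of the original post; it only illustrates what the optimizer is allowed to compute in place of the loop, namely n for non-negative n and 0 otherwise.

```cpp
#include <cassert>

// Same loop as in the question, without the Rcpp export attribute.
int for_iteration(const int n) {
    int j = 0;
    for (int i = 0; i < n; i++) {
        j++;
    }
    return j;
}

// Hypothetical helper (not from the original post): the value the loop
// produces, written without any iteration. For n < 0 the loop body never
// runs, so the result is 0.
int for_iteration_closed_form(const int n) {
    return n >= 0 ? n : 0;
}
```

Because both functions return the same value for every int input, an optimizing compiler may emit the second form for the first, which would explain the flat timings.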
for_iteration(int):
test edi, edi
mov eax, 0
cmovns eax, edi
ret
Oops... there is no loop at all!!
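If the goal is to benchmark the loop itself, one common workaround is to make the counter volatile so the compiler has to perform every update. This is only a sketch under that assumption: the function name for_iteration_volatile is hypothetical, it is not from the original answer, and the timings it gives measure volatile loads and stores and not a normal optimized loop.

```cpp
#include <cassert>

// Hypothetical variant (not from the original post): accesses to a volatile
// object are observable behaviour, so the compiler cannot collapse the loop
// into a single assignment.
int for_iteration_volatile(const int n) {
    volatile int j = 0;
    for (int i = 0; i < n; i++) {
        j = j + 1;  // one volatile read and one volatile write per iteration
    }
    return j;
}
```

To use it from R, the same body could be placed in the sourceCpp string with the `// [[Rcpp::export]]` attribute, as in the question.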
Regarding "c - Unexpected performance of a single "for" loop in Rcpp", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41409841/