c - Unexpected performance of a single "for" loop in Rcpp

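Can anyone explain the strange behavior of a simple for loop written with Rcpp (code below)? According to the microbenchmark output, for_iteration appears to run in constant time, which is not what its code suggests. For comparison, I tested the function for_double_iteration, whose behavior is consistent with the complexity of its code. This code was run on Ubuntu 16.04 with an Intel Core i3-6100 CPU, and I got the same results on Windows 7 with an Intel Core i5-2300 CPU.

The code is as follows: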

library(Rcpp)
library(microbenchmark)
sourceCpp(code='
  #include <Rcpp.h>

  // [[Rcpp::export]]
  int for_iteration(const int n) {
    int j = 0;
    for (int i = 0; i < n; i++) {
      j++;
    }
    return (j);
  }

  // [[Rcpp::export]]
  int for_double_iteration(const int n) {
    int j = 0, k = 0;
    for (int i = 0; i < n; i++) {
      for (k = 0; k < i; k++) {
        j++;
      }
    }
    return (j);
  }'
)

Here are the benchmarks:

> microbenchmark(for_iteration(10^5),
                 for_iteration(10^6),
                 for_iteration(10^7),
                 for_iteration(10^8),
                 for_iteration(10^9),
                 times=1000, unit="us")
Unit: microseconds
                expr   min    lq     mean median     uq      max neval
 for_iteration(10^5) 1.254 1.379 1.552229 1.4305 1.5255    9.197  1000
 for_iteration(10^6) 1.268 1.379 1.724993 1.4300 1.5410  123.822  1000
 for_iteration(10^7) 1.274 1.377 3.687182 1.4240 1.5075 2126.909  1000
 for_iteration(10^8) 1.253 1.387 1.546345 1.4360 1.5320    9.527  1000
 for_iteration(10^9) 1.265 1.386 1.568382 1.4300 1.5230   20.307  1000

> microbenchmark(for_double_iteration(10^2),
                for_double_iteration(10^3),
                for_double_iteration(10^4),
                for_double_iteration(10^5),
                for_double_iteration(10^6),
                times=1000, unit="us")
Unit: microseconds
                       expr     min       lq       mean   median       uq      max neval
 for_double_iteration(10^2)   0.921   1.0230   1.516304   1.1100   1.4335   23.308  1000
 for_double_iteration(10^3)   1.722   1.7970   2.491999   1.8915   2.2270   49.772  1000
 for_double_iteration(10^4)   9.022   9.1165   9.947209   9.2050   9.6925   55.841  1000
 for_double_iteration(10^5)  82.170  82.2700  86.240153  82.3590  82.9070 1959.903  1000
 for_double_iteration(10^6) 813.723 814.4450 834.870686 826.4625 828.2280 1178.062  1000
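
A note on reading these numbers: the median time grows by roughly a factor of ten for each tenfold increase in n, which is linear growth rather than the quadratic growth a nested loop would suggest. One possible explanation, which is an assumption and is not verified anywhere in this thread, is that the compiler replaced only the inner loop with a single addition. The sketch below shows what that transformation would look like in source form; the name for_double_iteration_inner_collapsed is hypothetical, and the Rcpp export attributes are omitted so the code compiles as plain C++.

```cpp
#include <cassert>

// The nested loop as written in the question: O(n^2) increments.
int for_double_iteration(const int n) {
    int j = 0, k = 0;
    for (int i = 0; i < n; i++) {
        for (k = 0; k < i; k++) {
            j++;
        }
    }
    return j;
}

// Hypothetical equivalent (not from the original post): the inner loop adds
// 1 to j exactly i times, so it can be replaced by "j += i". Only the outer
// loop remains, which is O(n) work.
int for_double_iteration_inner_collapsed(const int n) {
    int j = 0;
    for (int i = 0; i < n; i++) {
        j += i;
    }
    return j;
}

// Small self-check that both versions agree for a few inputs.
void check_double_iteration() {
    for (int n = 0; n < 50; n++) {
        assert(for_double_iteration(n) == for_double_iteration_inner_collapsed(n));
    }
}
```

Both functions return the sum 0 + 1 + ... + (n - 1). For large n that sum no longer fits in an int, so the sketch is only meaningful for small inputs.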

Best answer

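My guess was right: the loop is optimized away and replaced with j = n (or j = 0 when n is negative, since the loop body then never runs).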
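
In source terms, the replacement amounts to something like the following sketch. The name for_iteration_closed_form is hypothetical and only illustrates the idea; the compiler performs this rewrite internally and does not produce source code. The Rcpp export attribute is omitted so the block compiles as plain C++.

```cpp
#include <cassert>

// The loop as written in the question.
int for_iteration(const int n) {
    int j = 0;
    for (int i = 0; i < n; i++) {
        j++;
    }
    return j;
}

// Hypothetical hand-written equivalent (not from the original post).
// The loop increments j once per iteration, and it runs n times when n is
// positive and zero times otherwise, so the result is n or 0.
int for_iteration_closed_form(const int n) {
    return n >= 0 ? n : 0;
}

// Small self-check that both versions agree, including for negative n.
void check_single_iteration() {
    for (int n = -5; n < 1000; n++) {
        assert(for_iteration(n) == for_iteration_closed_form(n));
    }
}
```

Because the closed form does a constant amount of work, the running time no longer depends on n, which matches the flat timings in the benchmark.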

Looking at godbolt with the compiler flag -O2, you get this assembly output (gas):

for_iteration(int):
        test    edi, edi
        mov     eax, 0
        cmovns  eax, edi
        ret

Oops... there is no loop at all!
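
If the goal is to time the loop itself, one common way to keep the compiler from collapsing it is to declare the counter volatile, which obliges the compiler to perform every read and write of that variable. The following is a minimal sketch under that assumption, not something proposed in the original answer; the name for_iteration_volatile is hypothetical, and the Rcpp export attribute is omitted so the block compiles as plain C++.

```cpp
#include <cassert>

// Hypothetical variant (not from the original post): the counter is volatile,
// so each read and write of j has to actually happen and the loop cannot be
// folded into a single assignment.
int for_iteration_volatile(const int n) {
    volatile int j = 0;
    for (int i = 0; i < n; i++) {
        // Written as an explicit read and write; "j++" on a volatile
        // variable is deprecated since C++20.
        j = j + 1;
    }
    return j;
}

// Small self-check: the result is the same as for the original loop.
void check_volatile_iteration() {
    assert(for_iteration_volatile(10) == 10);
    assert(for_iteration_volatile(0) == 0);
}
```

With a variant like this, the benchmark time would be expected to grow with n, at the cost of measuring memory accesses to j that the original loop did not require.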

A similar question on Stack Overflow: https://stackoverflow.com/questions/41409841/
