Consider the following code:
#include <iostream>
#include <string>
#include <chrono>
#include <cstdlib> // srand, rand
using namespace std;

int main()
{
    int iter = 1000000;
    int loops = 10;
    while (loops)
    {
        int a = 0, b = 0, c = 0, f = 0, m = 0, q = 0;
        auto begin = chrono::high_resolution_clock::now();
        auto end = chrono::high_resolution_clock::now();
        auto deltaT = end - begin;
        auto accumT = end - begin;
        accumT = accumT - accumT; // Start the accumulator at zero
        auto controlT = accumT;
        srand(static_cast<unsigned>(chrono::duration_cast<chrono::nanoseconds>(begin.time_since_epoch()).count()));

        // Control loop: time an empty region to establish a baseline
        for (int i = 0; i < iter; i++) {
            begin = chrono::high_resolution_clock::now();
            // No arithmetic operation
            end = chrono::high_resolution_clock::now();
            deltaT = end - begin;
            accumT += deltaT;
        }
        controlT = accumT; // Control duration
        accumT = accumT - accumT; // Reset to zero

        // Measured loop: time a single arithmetic operation per iteration
        for (int i = 0; i < iter; i++) {
            auto n1 = rand() % 100;
            auto n2 = rand() % 100;
            begin = chrono::high_resolution_clock::now();
            c += i*2*n1*n2; // Some arbitrary arithmetic operation
            end = chrono::high_resolution_clock::now();
            deltaT = end - begin;
            accumT += deltaT;
        }

        // Print the difference in time between the loop with no arithmetic operation and the loop with one
        cout << " c = " << c << "\t\t" << " | ";
        cout << "difference between the 1st and 2nd loop: "
             << chrono::duration_cast<chrono::nanoseconds>(accumT - controlT).count()
             << endl;
        loops--;
    }
    return 0;
}
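It attempts to isolate the time taken by a single operation. The first loop is a control that establishes a baseline, and the second loop performs an arbitrary arithmetic operation.
It then prints the result to the console. Here is some sample output: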
c = 2116663282 | difference between 1st and 2nd loop: -8620916
c = 112424882 | difference between 1st and 2nd loop: -1197927
c = -1569775878 | difference between 1st and 2nd loop: -5226990
c = 1670984684 | difference between 1st and 2nd loop: 4394706
c = -1608171014 | difference between 1st and 2nd loop: 676683
c = -1684897180 | difference between 1st and 2nd loop: 2868093
c = 112418158 | difference between 1st and 2nd loop: 5846887
c = 2019014070 | difference between 1st and 2nd loop: -951609
c = 656490372 | difference between 1st and 2nd loop: 997815
c = 263579698 | difference between 1st and 2nd loop: 2371088
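Here is the really interesting part: sometimes the loop with the arithmetic operation finishes faster than the loop without it (a negative difference). That means reading the current time is slower than the arithmetic operation itself, so its cost is not negligible.
Is there a way around this?
PS: Yes, I know you can wrap the whole loop between begin and end.
Setup: Core i7 architecture, 64-bit Windows 10, and Visual Studio 2015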
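For reference, the whole-loop approach mentioned in the PS can be sketched as follows. This is only an illustration under stated assumptions: `time_whole_loop`, `LoopTimings`, and `compare_loops` are hypothetical names that do not appear in the question, `std::chrono::steady_clock` is swapped in for `high_resolution_clock`, and `c` is made unsigned so that wrap-around is well defined. The clock is read twice per loop instead of twice per iteration, so its cost is paid once.

```cpp
#include <chrono>
#include <cstdlib>

// Hypothetical helper (not from the question): runs body(i) for i in [0, iter)
// and reads the clock only twice, once before and once after the whole loop.
template <typename Body>
std::chrono::nanoseconds time_whole_loop(int iter, Body body)
{
    const auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < iter; ++i) {
        body(i);
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
}

struct LoopTimings {
    std::chrono::nanoseconds control;   // loop without the multiplication
    std::chrono::nanoseconds measured;  // loop with the multiplication
    unsigned checksum;                  // final value of c, so the work stays observable
};

// Times a control loop and an arithmetic loop, each wrapped as a whole.
LoopTimings compare_loops(int iter, unsigned seed)
{
    volatile unsigned sink = 0;  // volatile writes keep the optimizer from deleting the loops
    unsigned c = 0;              // unsigned, so overflow wraps instead of being undefined

    std::srand(seed);
    const auto control = time_whole_loop(iter, [&](int) {
        const unsigned n1 = static_cast<unsigned>(std::rand() % 100);
        const unsigned n2 = static_cast<unsigned>(std::rand() % 100);
        sink = n1;  // consume the random numbers without multiplying them
        sink = n2;
    });

    std::srand(seed);  // replay the same random sequence for the second loop
    const auto measured = time_whole_loop(iter, [&](int i) {
        const unsigned n1 = static_cast<unsigned>(std::rand() % 100);
        const unsigned n2 = static_cast<unsigned>(std::rand() % 100);
        c += static_cast<unsigned>(i) * 2u * n1 * n2;  // the operation under test
        sink = c;
    });

    return LoopTimings{control, measured, c};
}
```

Calling `compare_loops(1000000, 42u)` and subtracting `control` from `measured` gives an estimate for the whole loop. Because a handful of integer operations is so cheap, that difference can still come out negative from run-to-run noise.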
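Best answer
Your problem is that you are measuring time, not the number of instructions processed. Time is affected by many things that you do not really expect or want to measure.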
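Scheduler preemption, interrupts, and cache state are examples of such outside influences. One common mitigation, offered here as an assumption rather than something this answer prescribes, is to repeat the measurement and keep the fastest run, since interference mostly adds time rather than removing it. A minimal sketch, where `fastest_of` is a hypothetical helper name:

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper: runs `work` `repeats` times and returns the duration of
// the fastest run. Returns nanoseconds::max() if `repeats` is not positive.
template <typename Work>
std::chrono::nanoseconds fastest_of(int repeats, Work work)
{
    auto best = std::chrono::nanoseconds::max();
    for (int r = 0; r < repeats; ++r) {
        const auto begin = std::chrono::steady_clock::now();
        work();
        const auto end = std::chrono::steady_clock::now();
        // Keep the smallest duration seen so far
        best = std::min(best,
                        std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin));
    }
    return best;
}
```

The callable passed as `work` should contain the entire loop being measured, not a single operation, so that the two clock reads stay small relative to the work.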
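Instead, you should measure the number of clock cycles. A library for this can be found on Agner Fog's website, which also has a lot of useful information about optimization: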
http://www.agner.org/optimize/#manuals
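Even with clock cycles, you may still see peculiar results. This can happen if the processor uses out-of-order execution, which lets the processor optimize the order in which operations are executed.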
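To make this concrete, here is a rough sketch of reading the x86 time-stamp counter with fences around the read. It is not Agner Fog's library; it only uses the `__rdtsc` and `_mm_lfence` compiler intrinsics. The names `read_ticks`, `ticks_for`, and the macro `CYCLE_COUNTER_AVAILABLE` are hypothetical. Two assumptions are worth stating: the fencing behaviour described in the comments is what Intel documents for `LFENCE`, and on many modern CPUs the counter advances at a constant reference rate, so the result is reference ticks rather than actual core clock cycles. On targets other than x86-64 the sketch falls back to `steady_clock` nanoseconds.

```cpp
#include <chrono>
#include <cstdint>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#define CYCLE_COUNTER_AVAILABLE 1
#elif defined(__x86_64__)
#include <x86intrin.h>
#define CYCLE_COUNTER_AVAILABLE 1
#else
#define CYCLE_COUNTER_AVAILABLE 0
#endif

// Reads a tick counter. On x86-64 the read is fenced so that out-of-order
// execution is less likely to move measured work across it.
inline std::uint64_t read_ticks()
{
#if CYCLE_COUNTER_AVAILABLE
    _mm_lfence();                       // wait for earlier instructions to complete
    const std::uint64_t t = __rdtsc();  // read the time-stamp counter
    _mm_lfence();                       // hold back later instructions until the read is done
    return t;
#else
    // Fallback for other targets: nanoseconds from a monotonic clock, not cycles
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now().time_since_epoch()).count());
#endif
}

// Hypothetical helper: returns the number of ticks that elapsed while `work` ran.
template <typename Work>
std::uint64_t ticks_for(Work work)
{
    const std::uint64_t begin = read_ticks();
    work();
    const std::uint64_t end = read_ticks();
    return end - begin;
}
```

The fenced read is itself not free, so it is best placed around a whole loop, just as with the clock-based measurement.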
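If you compile the code with debug symbols, the compiler may inject extra code, which can affect the results. When running tests like this, you should always compile without debug information.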
Regarding "c++ - Negligible timing measurement in C++?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35743569/