c - Profiling C execution

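So, just for fun and out of curiosity, I wanted to see which executes faster when doing a parity (odd/even) check: modulus or a bitwise comparison.

I whipped up the following, but I'm not sure it is behaving correctly, because the difference is so small. I read somewhere online that bitwise should be an order of magnitude faster than a modulus check.

Is it possible that it is getting optimized away? I've only just started tinkering with assembly, otherwise I would try to dissect the executable a bit.

Edit 3: Here is a working test, thanks in large part to @phonetagger: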

#include <stdio.h>
#include <time.h>
#include <stdint.h>

// to reset the global
static const int SEED = 0x2A;

// 5B iterations, each
static const int64_t LOOPS = 5000000000;

int64_t globalVar;

// gotta call something
int64_t doSomething( int64_t input )
{
  return 1 + input;
}

int main(int argc, char *argv[]) 
{
  globalVar = SEED;

  // mod  
  clock_t startMod = clock();

  for( int64_t i=0; i<LOOPS; ++i )
  {    
    if( ( i % globalVar ) == 0 )
    {
      globalVar = doSomething(globalVar);      
    }    
  }

  clock_t endMod = clock();

  double modTime = (double)(endMod - startMod) / CLOCKS_PER_SEC;

  globalVar = SEED;

  // bit
  clock_t startBit = clock();

  for( int64_t j=0; j<LOOPS; ++j )
  {
    if( ( j & globalVar ) == 0 )
    {
      globalVar = doSomething(globalVar);
    }       
  }

  clock_t endBit = clock();

  double bitTime = (double)(endBit - startBit) / CLOCKS_PER_SEC;

  printf("Mod: %lf\n", modTime);
  printf("Bit: %lf\n", bitTime);  
  printf("Dif: %lf\n", ( modTime > bitTime ? modTime-bitTime : bitTime-modTime ));
}

5 billion iterations of each loop, with a global to keep the compiler from optimizing the work away, yields the following:

Mod: 93.099101
Bit: 16.701401
Dif: 76.397700

Best answer

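gcc foo.c -std=c99 -S -O0 (note that I deliberately used -O0) for x86 gave me the same assembly for both loops. Operator strength reduction means that both ifs use an andl to get the job done (which is faster than a modulo on Intel machines):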

First loop:

.L6:
        movl    72(%esp), %eax
        andl    $1, %eax
        testl   %eax, %eax
        jne     .L5
        call    doNothing
.L5:
        addl    $1, 72(%esp)
.L4:
        movl    LOOPS, %eax
        cmpl    %eax, 72(%esp)
        jl      .L6

Second loop:

.L9:
        movl    76(%esp), %eax
        andl    $1, %eax
        testl   %eax, %eax
        jne     .L8
        call    doNothing
.L8:
        addl    $1, 76(%esp)
.L7:
        movl    LOOPS, %eax
        cmpl    %eax, 76(%esp)
        jl      .L9
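
The equivalence that allows this strength reduction can be checked directly. The following is a minimal sketch, not code from the question or the answer: the helper names is_even_mod, is_even_bit and parity_tests_agree are hypothetical, and it assumes two's-complement integers, which is what gcc uses on x86.

```c
#include <stdbool.h>

/* Parity test written with the remainder operator. */
bool is_even_mod(int i)
{
    return (i % 2) == 0;
}

/* Parity test written with a bitwise AND on the lowest bit.
   Assumes two's-complement integers, as on x86. */
bool is_even_bit(int i)
{
    return (i & 1) == 0;
}

/* Returns true if both tests agree for every value in [lo, hi].
   hi must be smaller than INT_MAX so that ++i cannot overflow. */
bool parity_tests_agree(int lo, int hi)
{
    for (int i = lo; i <= hi; ++i) {
        if (is_even_mod(i) != is_even_bit(i)) {
            return false;
        }
    }
    return true;
}
```

Because the two forms give the same answer when the divisor is the constant 2, the compiler is free to emit the same andl for both. When the divisor is only known at run time, as with globalVar in the Edit 3 test, that rewrite is not available, so the two loops are expected to compile to different instructions.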

The tiny difference you are seeing is probably due to the resolution/inaccuracy of clock.
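
The resolution of clock can be estimated empirically. The sketch below is an illustration under stated assumptions rather than part of the original answer: the helper names elapsed_seconds and observed_clock_step are hypothetical, and the code assumes clock() is available and advances while the process is busy.

```c
#include <time.h>

/* Converts a clock() interval to seconds, as the test program does. */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Spins until clock() reports a new value and returns the size of that
   step in seconds. This is a rough estimate of the smallest interval
   clock() can distinguish on this system. Assumes clock() is available;
   if it always returned the same value, this loop would not terminate. */
double observed_clock_step(void)
{
    clock_t first = clock();
    clock_t next = first;
    while (next == first) {
        next = clock();
    }
    return elapsed_seconds(first, next);
}
```

If the difference between two timed loops is of the same order as the value returned by observed_clock_step(), it is probably not meaningful. Running each loop several times and comparing the spread is a simple way to judge that.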

A similar question about profiling C execution can be found on Stack Overflow: https://stackoverflow.com/questions/11565663/
