c++ - When will compilers optimize assembly code in C/C++ source?

Tags: c++ c assembly optimization inline-assembly


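Most compilers (VS2015, gcc) do not optimize inline assembly code, which is what lets us write new instructions that the compiler does not support.

But when should a C/C++ compiler implement optimization of inline assembly?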

Best answer

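Never. That would defeat the purpose of inline asm, which is to get exactly what you asked for.

If you want to use the full power of the target CPU's instruction set in a way that the compiler can understand and optimize, you should use intrinsic functions, not inline asm.

For example, instead of inline asm for popcnt, use int count = __builtin_popcount(x); (in GNU C, compiled with -mpopcnt). Inline asm is compiler-specific too, so if anything intrinsics are more portable, especially Intel's x86 intrinsics, which are supported by all the major compilers that can target x86. With #include <x86intrin.h> you can use int _popcnt32 (int a) to reliably get the x86 popcnt instruction. See Intel's intrinsics finder/guide, and the other links in the x86 tag wiki.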
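As a rough sketch of the intrinsic-based approach, a wrapper might look like the following. The name popcount32 and the fallback loop are assumptions made for illustration, not part of the original answer. On GCC and clang it forwards to __builtin_popcount, which the optimizer can fold for constant arguments; elsewhere it falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper (the name is an assumption for this sketch).
 * It counts the set bits in a 32-bit value. */
static inline int popcount32(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
    /* GNU C builtin: the compiler understands it, so it can constant-fold
     * the call or emit a popcnt instruction when built with -mpopcnt. */
    return __builtin_popcount(x);
#else
    /* Portable fallback: each iteration clears the lowest set bit. */
    int n = 0;
    while (x) {
        x &= x - 1;
        ++n;
    }
    return n;
#endif
}
```

Without -mpopcnt, GCC and clang still accept the builtin but may emit a library call or a bit-twiddling sequence instead of the popcnt instruction.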

    // popc is #defined to the implementation under test
    int count(){
      int total = 0;
      for(int i=0 ; i<4 ; ++i)
        total += popc(i);
      return total;
    }

Compiled with #define popc _popcnt32 by gcc6.3:

    mov     eax, 4
    ret

clang 3.9 with an inline-asm definition of popc, on the Godbolt compiler explorer:

    xor     eax, eax
    popcnt  eax, eax
    mov     ecx, 1
    popcnt  ecx, ecx
    add     ecx, eax
    mov     edx, 2
    popcnt  edx, edx
    add     edx, ecx
    mov     eax, 3
    popcnt  eax, eax
    add     eax, edx
    ret

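This is a classic example of inline asm defeating constant propagation, and of why you shouldn't use it for performance if you can avoid it: https://gcc.gnu.org/wiki/DontUseInlineAsm .

This is the inline-asm definition I used for this test: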

    int popc_asm(int x) {
      // force use of the same register because popcnt has a false dependency on its output, on Intel hardware
      // this is just a toy example, though, and also demonstrates how non-optimal constraints can lead to worse code
      asm("popcnt %0,%0" : "+r"(x));
      return x;
    }
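If inline asm really is unavoidable, one commonly used workaround (not part of the original test) is to give the compiler a pure-C path for compile-time constants via the GNU C builtin __builtin_constant_p, and to use separate input and output constraints so the register allocator is not forced to reuse one register. The sketch below assumes GCC or clang targeting x86 with AT&T syntax, and a CPU that supports popcnt; the name popc_hybrid is hypothetical.

```c
/* Hypothetical wrapper (the name is an assumption for this sketch). */
static inline int popc_hybrid(unsigned x) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* When the optimizer can prove x is a constant (typically only after
     * inlining with optimization enabled), take the builtin path so the
     * result can be constant-folded. */
    if (__builtin_constant_p(x)) {
        return __builtin_popcount(x);
    }
    unsigned result;
    /* AT&T operand order: source first, destination second.
     * "=r" is a write-only output and "r" an input, so the compiler is
     * free to pick different registers; "cc" notes that flags change. */
    __asm__("popcnt %1, %0" : "=r"(result) : "r"(x) : "cc");
    return (int)result;
#else
    /* Fallback for other compilers or targets: clear the lowest set bit
     * on each iteration. */
    int n = 0;
    while (x) {
        x &= x - 1;
        ++n;
    }
    return n;
#endif
}
```

Separate constraints give up the false-dependency workaround from the toy definition above, which is one more detail a compiler handles for you when you use the intrinsic instead.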

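If you didn't know that popcnt has a false dependency on its output register on Intel hardware, that's another reason you should leave it to the compiler whenever possible.

Using special instructions that the compiler doesn't know about is one use case for inline asm, but if the compiler doesn't know about an instruction, it certainly can't optimize it. Before compilers were good at optimizing intrinsics (e.g. for SIMD instructions), inline asm for this kind of thing was more common. But we're many years past that now, and compilers are generally good with intrinsics, even for non-x86 architectures like ARM.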
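To illustrate the SIMD case, here is a minimal sketch that uses SSE2 intrinsics rather than inline asm. The helper name add4_i32 is an assumption made for this example. Because the compiler understands the intrinsics, it can allocate registers, schedule the instructions, and fold constants through them, none of which it can do across an asm statement.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper: adds two arrays of four 32-bit integers. */
static void add4_i32(const int32_t *a, const int32_t *b, int32_t *out) {
#if defined(__SSE2__)
    /* Unaligned loads, one packed 32-bit add, one unaligned store. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Scalar fallback when SSE2 is not enabled for the target. */
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```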

For "c++ - When will compilers optimize assembly code in C/C++ source?", the corresponding question on Stack Overflow is: https://stackoverflow.com/questions/41294779/
