assembly - How is POPCNT implemented in hardware?

Tags: assembly x86 hardware

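According to http://www.agner.org/optimize/instruction_tables.pdf, the POPCNT instruction (which returns the number of set bits in a 32-bit or 64-bit register) has a throughput of 1 instruction per clock cycle on modern Intel and AMD processors. That is much faster than any software implementation, since those need multiple instructions (see "How to count the number of set bits in a 32-bit integer?").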
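For comparison, here is a minimal sketch of the kind of multi-instruction software fallback meant above. It is the well-known SWAR (parallel bit-sum) technique; the function name `popcount32_swar` is made up for this illustration, and it is not claimed to be what any particular compiler or library emits.

```c
#include <stdint.h>

/* Illustrative software population count for a 32-bit word (SWAR method).
   The name popcount32_swar is hypothetical, chosen for this sketch. */
static unsigned popcount32_swar(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* each 2-bit field holds its own bit count */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* each 4-bit field holds a count 0..4 */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* each byte holds a count 0..8 */
    return (unsigned)((x * 0x01010101u) >> 24);       /* sum the four bytes into the top byte */
}
```

Calling `popcount32_swar(0xF0F0u)` yields 8. Even this compact form takes roughly a dozen shift, mask, add and multiply operations, whereas POPCNT is a single instruction.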

How is POPCNT implemented so efficiently in hardware?

Best answer

There is a patent covering a combined implementation of popcnt and bit scan forward/reverse:

US8214414 B2 - Combined set bit count and detector logic

Abstract

A merged datapath for PopCount and BitScan is described. A hardware circuit includes a compressor tree utilized for a PopCount function, which is reused by a BitScan function (e.g., bit scan forward (BSF) or bit scan reverse (BSR)). Selector logic enables the compressor tree to operate on an input word for the PopCount or BitScan operation, based on a microprocessor instruction. The input word is encoded if a BitScan operation is selected. The compressor tree receives the input word, operates on the bits as though all bits have same level of significance (e.g., for an N-bit input word, the input word is treated as N one-bit inputs). The result of the compressor tree circuit is a binary value representing a number related to the operation performed (the number of set bits for PopCount, or the bit position of the first set bit encountered by scanning the input word).
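To make the "N one-bit inputs" idea concrete, here is a rough software model, not the patented circuit. It treats each input bit as an equal-weight value and reduces them with 3:2 compressors (full adders) until one bit per binary weight remains. Real hardware would arrange the compressors as a parallel tree; this sketch applies them one after another. All names are hypothetical. The abstract does not say how the input is encoded for BitScan, so `bsf_via_popcount` uses a well-known software identity purely as an assumption.

```c
#include <stdint.h>

#define WORD_BITS 64
#define COLUMNS   8   /* weights 2^0 .. 2^7; a count of 64 only needs up to 2^6 */

/* One 3:2 compressor: three bits of equal weight in, a sum bit of the
   same weight and a carry bit of double weight out (a + b + c = sum + 2*carry). */
static void compress_3_2(unsigned a, unsigned b, unsigned c,
                         unsigned *sum, unsigned *carry)
{
    *sum   = a ^ b ^ c;
    *carry = (a & b) | (a & c) | (b & c);
}

/* Sequential model of column compression (hypothetical name). */
static unsigned popcount64_compressor_model(uint64_t x)
{
    unsigned char col[COLUMNS][WORD_BITS]; /* pending bits per weight column */
    unsigned n[COLUMNS] = {0};             /* number of pending bits per column */

    /* Every input bit starts in column 0: all bits have the same significance. */
    for (unsigned i = 0; i < WORD_BITS; i++)
        col[0][n[0]++] = (unsigned char)((x >> i) & 1u);

    /* Reduce each column to a single bit, pushing carries to the next column. */
    for (unsigned w = 0; w + 1 < COLUMNS; w++) {
        while (n[w] >= 2) {
            unsigned a = col[w][--n[w]];
            unsigned b = col[w][--n[w]];
            unsigned c = (n[w] > 0) ? col[w][--n[w]] : 0u; /* half adder if only two left */
            unsigned sum, carry;
            compress_3_2(a, b, c, &sum, &carry);
            col[w][n[w]++] = (unsigned char)sum;
            col[w + 1][n[w + 1]++] = (unsigned char)carry;
        }
    }

    /* The surviving bit in column w is bit w of the binary result. */
    unsigned result = 0;
    for (unsigned w = 0; w < COLUMNS; w++)
        if (n[w] > 0)
            result |= (unsigned)col[w][0] << w;
    return result;
}

/* ASSUMPTION: one possible BitScan-forward encoding, not taken from the patent.
   For x != 0, ~x & (x - 1) sets exactly the bits below the lowest set bit,
   so counting them gives that bit's position. */
static unsigned bsf_via_popcount(uint64_t x)
{
    return popcount64_compressor_model(~x & (x - 1));
}
```

For example, `popcount64_compressor_model(0xF0F0)` gives 8 and `bsf_via_popcount(0x8)` gives 3. The point of the model is only that the same counting network can serve both operations once the input has been pre-encoded.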

A similar question, "assembly - How is POPCNT implemented in hardware?", can be found on Stack Overflow: https://stackoverflow.com/questions/28802692/
