c++ - Multiplying using bitwise operators

Tags: c++ math assembly bit-manipulation multiplication

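I would like to know how to multiply a series of binary bits using bitwise operators. Specifically, I am interested in doing this to find the fractional value of a binary number. Here is an example of what I am trying to do:

Given, say: 1010010,

I want to use each individual bit so that it is computed as:

1*(2^-1) + 0*(2^-2) + 1*(2^-3) + 0*(2^-4)......

Although I am interested in doing this in ARM assembly, an example in C/C++ would still help.

I was thinking of running a loop with a counter: on each iteration the counter is incremented, the value is logically shifted left so that the first bit can be extracted, and that bit is multiplied by 2^-counter.

However, I am not sure how to get only the first bit (the MSB) to multiply, and I am confused about how to multiply that value by a negative power of 2.

I know that a logical shift left multiplies by a power of two, but those usually have positive exponents. E.g. LSL r0, 2 means the value in r0 is multiplied by 2^2 = 4.

Thanks in advance!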

Best answer

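It is entirely possible to multiply two numbers using only bitwise operations (AND, OR, XOR, <<, >>), although it is probably not very efficient. You may want to read the Wikipedia articles on Adder (electronics) and Binary multiplier.

A bottom-up approach to multiplication is to first build a binary adder. For the lowest bit (bit 0), a half adder works fine. S denotes the sum and C the carry.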

Half adder
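
As a minimal sketch of the half adder described above, assuming each input holds a single bit (0 or 1): the sum is the XOR of the inputs and the carry is their AND. The names bit_pair and half_add are hypothetical and do not appear in the code further down.

```c
#include <stdint.h>

/* Hypothetical helper type: one sum bit (S) and one carry bit (C). */
typedef struct {
    uint16_t sum;
    uint16_t carry;
} bit_pair;

/* Half adder for two single bits. Assumes a and b are each 0 or 1. */
bit_pair half_add(uint16_t a, uint16_t b)
{
    bit_pair r;
    r.sum = a ^ b;   /* S is 1 when exactly one input is 1 */
    r.carry = a & b; /* C is 1 only when both inputs are 1 */
    return r;
}
```

For instance, half_add(1, 1) gives sum 0 and carry 1, which is binary 10, i.e. 2.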

For the remaining bits you need a full adder for each bit. Cin means "carry in" and Cout means "carry out":

Full adder
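
A full adder can be sketched the same way, again assuming single-bit inputs (0 or 1). The carry expression is the one the C code further down uses for every bit position; the names full_add_result and full_add are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper type: one sum bit (S) and one carry-out bit (Cout). */
typedef struct {
    uint16_t sum;
    uint16_t carry_out;
} full_add_result;

/* Full adder for two single bits plus a carry-in bit.
 * Assumes a, b and carry_in are each 0 or 1. */
full_add_result full_add(uint16_t a, uint16_t b, uint16_t carry_in)
{
    full_add_result r;
    uint16_t partial = a ^ b;                      /* sum without carry-in */
    r.sum = partial ^ carry_in;                    /* S */
    r.carry_out = (a & b) | (partial & carry_in);  /* Cout */
    return r;
}
```

For instance, full_add(1, 1, 1) gives sum 1 and carry_out 1, which is binary 11, i.e. 3.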

The simplest logic circuit for adding multiple bits is called a ripple-carry adder:

Ripple-carry adder

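A ripple-carry adder is basically a chain of full adders, with each carry propagating to the full adder that computes the next more significant bit. Other, more efficient methods exist, but I skip them here for simplicity. So now we have a binary adder.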
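
The chain of full adders can be sketched as a loop over the 16 bit positions. This is only an illustration with hypothetical names (add16_result, ripple_add16). Note that the loop counter itself is incremented, which is not a bitwise operation; unrolling the loop by hand removes that.

```c
#include <stdint.h>

/* Hypothetical result type: 16-bit sum plus the carry out of bit 15. */
typedef struct {
    uint16_t sum;
    uint16_t carry_out;
} add16_result;

/* Ripple-carry addition of two 16-bit values, one full adder per bit. */
add16_result ripple_add16(uint16_t a, uint16_t b)
{
    add16_result r;
    uint16_t sum = 0;
    uint16_t carry = 0;
    int i;

    for (i = 0; i < 16; i++) {
        uint16_t abit = (a >> i) & 1;       /* bit i of a, as 0 or 1 */
        uint16_t bbit = (b >> i) & 1;       /* bit i of b, as 0 or 1 */
        uint16_t partial = abit ^ bbit;

        /* place this position's sum bit into the result */
        sum |= (uint16_t)((partial ^ carry) << i);
        /* the carry ripples on to the next more significant bit */
        carry = (abit & bbit) | (partial & carry);
    }
    r.sum = sum;
    r.carry_out = carry;
    return r;
}
```

For instance, ripple_add16(100, 5) gives sum 105 with carry_out 0, while ripple_add16(65535, 1) wraps around to sum 0 with carry_out 1.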

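A binary multiplier is a harder case. But since I see this more as a proof of concept than a practical way to multiply two numbers, let's take a simpler detour.

Suppose we want to compute the product of a and b, with a = 100 and b = 5. Both a and b are 16-bit unsigned integers (they could also be fixed-point numbers). We can create an array of summands in which the value of a (100) is written b (5) times, or vice versa. Since the highest unsigned value representable in 16 bits is 2^16-1 (65535), we want to create an array of 65535 unsigned integers, filled with zeros. Then we need to set 5 values of the array to 100, using only bitwise operations.

We can do it this way: first we fill an array (let's call it a_array) with the value of a (100). Then we want to zero some of the values of a_array based on the value of b, so that b values of a_array stay unmodified and the rest of the values of a_array are zeroed. For that we use a binary mask and the bitwise AND operation.

So we loop through the bits of b. For each bit of b we create a binary mask based on the value of that bit. Creating such a binary mask requires only bit shifts (<<, >>), bitwise AND and bitwise OR.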

0 -> 0b0000 0000 0000 0000
1 -> 0b1111 1111 1111 1111
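
One way to build such a mask, sketched here with a hypothetical function name (spread_bit0), is to isolate bit 0 and then repeatedly OR the value with a shifted copy of itself, doubling the number of set bits each time. The extend_lowest_bit function in the code further down reaches the same result with fifteen individual shifts.

```c
#include <stdint.h>

/* Copies bit 0 of value into all 16 bits:
 * returns 0x0000 if bit 0 is clear, 0xFFFF if it is set. */
uint16_t spread_bit0(uint16_t value)
{
    uint16_t m = value & 1;    /* keep only bit 0 */
    m |= (uint16_t)(m << 1);   /* now the 2 low bits match bit 0 */
    m |= (uint16_t)(m << 2);   /* now 4 bits */
    m |= (uint16_t)(m << 4);   /* now 8 bits */
    m |= (uint16_t)(m << 8);   /* now all 16 bits */
    return m;
}
```

To get the mask for bit n of b, shift first: spread_bit0(b >> n). With b = 5 (binary 101), bits 0 and 2 produce 0xFFFF and bit 1 produces 0x0000.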

So now we have a binary mask. But how do we use it? Bit 0 of b corresponds to a numerical value of 0 or 1, bit 1 to 0 or 2, bit 2 to 0 or 4, so bit n of b corresponds to a numerical value of 0 or 2^n. As we loop through the bits of b and create a binary mask for each bit, we AND 2^n values of a_array with the corresponding mask, so each of those values either gets zeroed or stays unmodified.

In the C code I use a for loop to AND through a_array, together with incrementing and decrementing counters. Increment and decrement are not bitwise operations, but the for loop is not necessary; it is used only for readability (from a human point of view). Actually, I first wrote a 4-bit * 4-bit = 4-bit multiplier in x86-64 assembly to try this concept, using only and, or, xor, shl (bit shift left), shr (bit shift right) and call. call is a function or procedure call, that is, not a bitwise operation, but you can inline all those functions or procedures and thus compute the product using only AND, OR, XOR, << and >>. So instead of a for loop, for each bit of b you can AND n (n = 1, 2, 4, 8, ...) corresponding values of a_array with the bitwise mask based on the corresponding bit of b. For a 16-bit * 16-bit = 16-bit multiplication that requires 65535 AND instructions (without a loop). Computers have no problem with such input, but humans tend to have problems reading such code. For that reason a for loop is used.

Now a_array holds b copies of a, and the rest are zeroes. The rest is simple: we just add all the values of a_array using our bitwise adder (the function my_add in the C code below).
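
As a side note, adding 2^n masked copies of a gives the same 16-bit result as adding (a << n) once, so the same per-bit masks can also drive the conventional shift-and-add multiplication. The sketch below shows that alternative; it is not the array method described in this answer, and all function names in it are hypothetical. It still uses a loop counter and a comparison, which are not bitwise operations.

```c
#include <stdint.h>

/* Copies bit 0 of v into all 16 bits (0x0000 or 0xFFFF). */
static uint16_t mask_from_bit0(uint16_t v)
{
    uint16_t m = v & 1;
    m |= (uint16_t)(m << 1);
    m |= (uint16_t)(m << 2);
    m |= (uint16_t)(m << 4);
    m |= (uint16_t)(m << 8);
    return m;
}

/* Adds x and y modulo 2^16 using only AND, XOR and shifts.
 * The carries move at least one bit left per pass, so the loop
 * runs at most 16 times. */
static uint16_t add16(uint16_t x, uint16_t y)
{
    while (y != 0) {
        uint16_t carry = (uint16_t)((x & y) << 1);
        x = x ^ y;   /* sum without the carries */
        y = carry;   /* carries still to be added */
    }
    return x;
}

/* Shift-and-add: for each set bit n of b, add (a << n) to the product.
 * Returns the low 16 bits of a * b. */
uint16_t mul16_shift_add(uint16_t a, uint16_t b)
{
    uint16_t product = 0;
    int n;

    for (n = 0; n < 16; n++) {
        uint16_t mask = mask_from_bit0((uint16_t)(b >> n));
        uint16_t term = (uint16_t)((uint32_t)a << n) & mask;
        product = add16(product, term);
    }
    return product;
}
```

For instance, mul16_shift_add(100, 5) returns 500, and mul16_shift_add(300, 300) returns 24464, which is 90000 modulo 65536.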

Here's the code for 16-bit * 16-bit = 16-bit unsigned integer multiplication. Please note that the function memset16 assumes a little-endian architecture. Converting memset16 to a big-endian architecture should be trivial. The code works for fixed-point multiplication too; you only need to add a bit shift at the end. Converting to different variable sizes and implementing overflow detection should be trivial too. Both tasks are left to the reader. Compiles with GCC, tested on x86-64 Linux.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define NUMBER_OF_BITS 16
#define MAX_VALUE 65535

typedef uint8_t u8;
typedef uint16_t u16;
typedef uint32_t u32;

typedef int8_t s8;
typedef int16_t s16;
typedef int32_t s32;

typedef struct result_struct{
    u16 result;
    u16 carry;
} result_struct;

u16 extend_lowest_bit(u16 a)
{
    /* extends lowest bit (bit 0) to all bits. */
    u16 a_extended;
    a = (a & 1);
    a_extended = a | (a << 1) | (a << 2) | (a << 3) | (a << 4);
    a_extended = a_extended | (a << 5) | (a << 6) | (a << 7) | (a << 8);
    a_extended = a_extended | (a << 9) | (a << 10) | (a << 11) | (a << 12);
    a_extended = a_extended | (a << 13) | (a << 14) | (a << 15);
    return a_extended;
}

result_struct my_add(u16 a, u16 b)
{
    /* computes (a + b). */
    result_struct add_results;

    u16 a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15;
    u16 b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11, b12, b13, b14, b15;
    u16 carry;
    /* prepare for bitwise addition by separating
     * each bit of summands a and b using bitwise AND. */
    a0 = a & 1;
    a1 = a & (1 << 1);
    a2 = a & (1 << 2);
    a3 = a & (1 << 3);
    a4 = a & (1 << 4);
    a5 = a & (1 << 5);
    a6 = a & (1 << 6);
    a7 = a & (1 << 7);
    a8 = a & (1 << 8);
    a9 = a & (1 << 9);
    a10 = a & (1 << 10);
    a11 = a & (1 << 11);
    a12 = a & (1 << 12);
    a13 = a & (1 << 13);
    a14 = a & (1 << 14);
    a15 = a & (1 << 15);

    b0 = b & 1;
    b1 = b & (1 << 1);
    b2 = b & (1 << 2);
    b3 = b & (1 << 3);
    b4 = b & (1 << 4);
    b5 = b & (1 << 5);
    b6 = b & (1 << 6);
    b7 = b & (1 << 7);
    b8 = b & (1 << 8);
    b9 = b & (1 << 9);
    b10 = b & (1 << 10);
    b11 = b & (1 << 11);
    b12 = b & (1 << 12);
    b13 = b & (1 << 13);
    b14 = b & (1 << 14);
    b15 = b & (1 << 15);

    add_results.result = a0 ^ b0;
    /* result: 0000 0000 0000 000x */
    carry = (a0 & b0) << 1;
    add_results.result = add_results.result | (a1 ^ b1 ^ carry);
    /* result: 0000 0000 0000 00xx */
    carry = ((carry & (a1 ^ b1)) | (a1 & b1)) << 1;
    add_results.result = add_results.result | (a2 ^ b2 ^ carry);
    /* result: 0000 0000 0000 0xxx */
    carry = ((carry & (a2 ^ b2)) | (a2 & b2)) << 1;
    add_results.result = add_results.result | (a3 ^ b3 ^ carry);
    /* result: 0000 0000 0000 xxxx */
    carry = ((carry & (a3 ^ b3)) | (a3 & b3)) << 1;
    add_results.result = add_results.result | (a4 ^ b4 ^ carry);
    /* result: 0000 0000 000x xxxx */
    carry = ((carry & (a4 ^ b4)) | (a4 & b4)) << 1;
    add_results.result = add_results.result | (a5 ^ b5 ^ carry);
    /* result: 0000 0000 00xx xxxx */
    carry = ((carry & (a5 ^ b5)) | (a5 & b5)) << 1;
    add_results.result = add_results.result | (a6 ^ b6 ^ carry);
    /* result: 0000 0000 0xxx xxxx */
    carry = ((carry & (a6 ^ b6)) | (a6 & b6)) << 1;
    add_results.result = add_results.result | (a7 ^ b7 ^ carry);
    /* result: 0000 0000 xxxx xxxx */
    carry = ((carry & (a7 ^ b7)) | (a7 & b7)) << 1;
    add_results.result = add_results.result | (a8 ^ b8 ^ carry);
    /* result: 0000 000x xxxx xxxx */
    carry = ((carry & (a8 ^ b8)) | (a8 & b8)) << 1;
    add_results.result = add_results.result | (a9 ^ b9 ^ carry);
    /* result: 0000 00xx xxxx xxxx */
    carry = ((carry & (a9 ^ b9)) | (a9 & b9)) << 1;
    add_results.result = add_results.result | (a10 ^ b10 ^ carry);
    /* result: 0000 0xxx xxxx xxxx */
    carry = ((carry & (a10 ^ b10)) | (a10 & b10)) << 1;
    add_results.result = add_results.result | (a11 ^ b11 ^ carry);
    /* result: 0000 xxxx xxxx xxxx */
    carry = ((carry & (a11 ^ b11)) | (a11 & b11)) << 1;
    add_results.result = add_results.result | (a12 ^ b12 ^ carry);
    /* result: 000x xxxx xxxx xxxx */
    carry = ((carry & (a12 ^ b12)) | (a12 & b12)) << 1;
    add_results.result = add_results.result | (a13 ^ b13 ^ carry);
    /* result: 00xx xxxx xxxx xxxx */
    carry = ((carry & (a13 ^ b13)) | (a13 & b13)) << 1;
    add_results.result = add_results.result | (a14 ^ b14 ^ carry);
    /* result: 0xxx xxxx xxxx xxxx */
    carry = ((carry & (a14 ^ b14)) | (a14 & b14)) << 1;
    add_results.result = add_results.result | (a15 ^ b15 ^ carry);
    /* result: xxxx xxxx xxxx xxxx */
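    /* the carry out of bit 15 sits at bit 15 here; shift it right so it
     * is stored as 0 or 1 (a left shift would push it out of the u16). */
    add_results.carry = ((carry & (a15 ^ b15)) | (a15 & b15)) >> 15;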
    return add_results;
}

result_struct add_array(void* array, s32 size)
{
    /* adds together all u16 values of the array. */
    result_struct add_results;
    u16* i;

    add_results.result = 0;
    add_results.carry = 0;

    for (i = array; i < ((u16*)array + size); i++)
    {
        add_results = my_add(add_results.result, *i);
    }
    return add_results;
}

void memset16(void* dest, u16 value, s32 size)
{
    /* does a 16-bit memset. size is the number of u16's (words). */
    u8* i;

    for (i = (u8*)dest; i < ((u8*)dest+(2*size)); i+=2)
    {
        memset(i, (int)(value & 0xff), 1);
        memset(i+1, (int)(value >> 8), 1);
    }  
}

result_struct my_mul(u16 a, u16 b)
{
    /* computes (a * b) */
    u16 bitmask, a_array[MAX_VALUE];
    u32 block_length;
    s16 bit_i;
    s32 count, size;
    u16* i;
    void* p_a_array; 
    p_a_array = a_array;

    result_struct mul_results;
    mul_results.result = 0;

    size = MAX_VALUE;
    memset16(p_a_array, a, size);   // can be replaced with AND.

    /* mask the summands. can be unrolled to
     * use only bitwise operations. */
    i = p_a_array;

    for (bit_i = 0, block_length = 1; bit_i < NUMBER_OF_BITS; bit_i++)
    {
        bitmask = extend_lowest_bit(b >> bit_i);

        for (count = block_length; count > 0; count--)
        {
            *i = (*i & bitmask);
            i++;
        }
        block_length <<= 1;
    }
    /* the array of summands is now masked. */

    /* add the values of the array together. */
    mul_results = add_array(p_a_array, MAX_VALUE);
    return mul_results;
}

int main(void)
{
    int a, b;
    result_struct multiply_results;

    printf("Enter the 1st unsigned 16-bit integer.\n");
    scanf("%d", &a);
    printf("Enter the 2nd unsigned 16-bit integer.\n");
    scanf("%d", &b);
    multiply_results = my_mul((u16)a, (u16)b);
    printf("%d * %d = %d\n", a, b, multiply_results.result);
    return 0;
}
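
To connect this back to the question: the bits 1010010 read as a binary fraction are simply the integer they spell divided by 2^7, so the negative powers of two never need to be multiplied out one by one. The sketch below illustrates that, together with the "bit shift at the end" mentioned above for fixed-point multiplication. It assumes a Q8.8 format (8 integer bits, 8 fraction bits) and a 32-bit intermediate product, and it uses the ordinary * and / operators for brevity. The function names are hypothetical.

```c
#include <stdint.h>

/* Value of the low nbits bits of `bits` read as the binary fraction
 * 0.b1 b2 ... bn = b1*2^-1 + b2*2^-2 + ... + bn*2^-n.
 * That sum equals the integer value divided by 2^nbits.
 * Assumes nbits is less than 32. */
double fraction_value(uint32_t bits, unsigned nbits)
{
    return (double)bits / (double)((uint32_t)1 << nbits);
}

/* Multiplies two unsigned Q8.8 fixed-point numbers.
 * The raw product has 16 fraction bits, so shifting right by 8
 * brings it back to Q8.8. The result is truncated to 16 bits. */
uint16_t q8_8_mul(uint16_t x, uint16_t y)
{
    uint32_t wide = (uint32_t)x * (uint32_t)y;  /* Q16.16 product */
    return (uint16_t)(wide >> 8);
}
```

For instance, fraction_value(0x52, 7) evaluates the question's 1010010 as 82/128 = 0.640625, and q8_8_mul(0x0180, 0x0240) multiplies 1.5 by 2.25 to give 0x0360, which is 3.375 in Q8.8.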

Regarding "c++ - Multiplying using bitwise operators", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20340961/
