c++ - Non-restoring floating-point square root algorithm

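I am trying to compute the square root of a float using the non-restoring algorithm.

For example, say x = 1001; its square root is 31.6386.

I want to compute this square root using the non-restoring method.

I tried following the method from the paper: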

Implementation of Single Precision Floating Point Square Root on FPGAs

but my result seems to be off by 1 bit, and I can't figure out why.

For example, the program I wrote below produces the following results:

correct_result =
  41FD1BD2

myresult =  
  41FD1BD1

error =    
    1.192093e-007

C++ version of the code:

#include <iostream>
#include <cmath>

using namespace std;

  union newfloat{
    float f;
    int i;
  };

int main () {
// Input number
newfloat x;
cout << "Enter Number: ";
cin >> x.f;

// Pull out exponent and mantissa
int exponent = (x.i >> 23) & 0xFF;
int mantissa = (x.i & 0x7FFFFF) | ((exponent && exponent) << 23);

// Calculate new exponent
int new_exponent = (exponent >> 1) + 63 + (exponent & 1);


// Shift right (paper says shift left but shift left doesn't work?)
if (exponent & 1) {
    mantissa = mantissa  >> 1;
    cout << " Shifted right " << endl;
}

// Create an array with the bits of the mantissa
unsigned int D [48];
for (int i = 47; i >= 0; i--) {
  if (i >= 24) {
    D[i] = (mantissa >> (i-24)) & 1;
  } else {
    D[i] = 0;
  }
}


// == Perform square root ==
// Set q24 = 0, r24 = 0 and then iterate from k = 23 to 0
int q[25] = {0}; // 25 element array, indexing ends at 24
int r[25] = {0};

for (int k = 23; k >= 0; k--) {
    if (r[k+1] >= 0) {
        r[k] = ((r[k+1] << 2) | (D[2*k+1] << 1) | D[2*k] ) - (q[k+1] << 2 | 1 );
        } else {
        r[k] = ((r[k+1] << 2) | (D[2*k+1] << 1) | D[2*k] ) + (q[k+1] << 2 | 0x3 );
        } 

    if (r[k] >= 0) {
        q[k] = (q[k+1] << 1) | 1;
        } else {
        q[k] = q[k+1] << 1;
    }

    if (k == 0) {
        if (r[0] < 0) {
            r[0] = r[0] + (q[0] << 1) | 1;
        }
    }
}

// Create quotient from LSBs of q[]
int Q = 0;
for (int i = 0; i <= 23; i++) {
    Q = Q | ((q[i] & 1) << i);
}

// Option 1 Rounding
//if (r[0] > 0) // Works for 10, 1001, 1021, but not 1012
// Q = Q + 1;

// Option 2 Rounding (No rounding)
// Works for 1012, Doesn't work for 10, 1001, 1021

// Option 3 Rounding (Calculate the next 3 Quotient bits to get a guard round and sticky bit)

// Calculate correct result:
newfloat correct_result;
correct_result.f = sqrt(x.f);

// Form my result into a single number
newfloat myresult;
myresult.i = (new_exponent << 23) | (Q & 0x7FFFFF);

// Print results
cout << hex << "My result: " << myresult.i << endl;
cout << hex << "Correct:   " <<  correct_result.i << endl;
return 0;
}

Best answer

First, let me highlight the relevant part of the paper:

[Image: algorithm excerpt from the paper]

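You need to take another look at how the addition/subtraction is done. Your code performs it with ordinary numbers, but I think the algorithm was designed with integer modular arithmetic in mind.

So, if you look at the example listed later in the paper, the calculation 0011 - 0101 wraps around to 1110.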

[Image: worked example from the paper]

I think this could explain why you are getting the wrong result :)

Regarding c++ - Non-restoring floating-point square root algorithm, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26535375/
