c - 为什么我在代码中看到不一致之处？

Closed. This question needs details or clarity。它当前不接受答案。












                            想改善这个问题吗？添加详细信息并通过editing this post阐明问题。

                        2年前关闭。



受“只有40亿个浮点数-对其全部进行测试”的启发，我编写了以下代码：

#include<stdio.h>
#include<limits.h>
#include<math.h>
int main(){
    unsigned int i = 0;
    float        f = 0;
    float        r = 1;
    for(i = 0; i < UINT_MAX ; i++){
        f = *(float *)&i;
        r = pow(sin(f),2)+pow(cos(f),2);
        if (fabs(r-1)> 1/pow(2.0,20)){
            printf("hex value 0x%.8X bad value  0x%.8X \n",
                   *(unsigned int *)&f,*(unsigned int *)&r);
        }
        else {
            printf("hex value 0x%.8X good value  0x%.8X \n",
                   *(unsigned int *)&f,*(unsigned int *)&r);
        }
    }
    return 0;
}

我运行了它（实际上只运行了几秒钟，并且运行了OpenMP的并行版本），然后我将输出grep了。每千个浮子中约有1个失败！

math.h有多好？

我正在使用MacOS High Sierra。

[编辑]要澄清：

在阅读了评论和答案之后，我谦虚地承认我的问题不是一个很好的问题，反映出使用C（2000 loc）非常绿色。为了澄清起见，我使用的是C编译器（用于sinf与sin注释），并且对代码进行了“消毒”以在此处发布。
阅读答案后，我将假设任何挥之不去的不稳定性都是由于我的错误造成的。以下是我的新代码，供您参考，其中包含您的一些建议：

/usr/local/Cellar/gcc/7.2.0/bin/g++-7 -gdwarf-3 -Wall -Wshadow -g -DDEBUG -O0 floattest.c -o test.out -lm  -fopenmp

#include<stdio.h>
#include<limits.h>
#include<math.h>
#include<omp.h>
int main(){
    unsigned int i = 0;
    float        f = 0;
    float        r = 1;
#pragma omp parallel for private(i)
    for(i = 0; i < UINT_MAX ; i++){
        f = *(float *)&i;
        r = pow(sinf(f),2)+pow(cosf(f),2);
        if (fabs(r-1)> 1/pow(2.0,5)){
            printf("hex value 0x%.8X. Dec value %.10e. Bad value  0x%.8X \n",
                   *(unsigned int *)&f,f,*(unsigned int *)&r);
        }
    }
    return 0;
}

最佳答案

您正在尝试调查单精度浮点数量违反三角标识 $\sin^2 x + \cos^2 x = 1$ 的频率和数量（对不起，所以没有MathJax）。让我演示一个更好的方法：

为避免浪费时间并避免double rounding错误，应注意仅使用单精度数学；特别是，使用sinf和cosf，而不是sin和cos；
您不应该使用pow[f](x, 2)来平方x，因为x*x会更快，更准确（除非编译器可以将pow[f](x,2)优化为x*x，这不是您应该依靠的东西）。
您不应将时间浪费在“非数字”值上；
您应该使用C99函数nextafter和ldexp而不是类型调整-除了不违反类型混淆规则之外，这还具有更快的机会。
最重要的是，您不应为“错误”设置任意阈值。相反，您应该调查错误的分布。

这就是我的做法。（注意：使用OpenMP，OpenMP 4.5阵列精简和大量C99功能。）

#include <float.h>
#include <inttypes.h>
#include <limits.h>
#include <math.h>
#include <omp.h>
#include <stdio.h>
#include <string.h>

#if FLT_RADIX != 2 || FLT_MIN_EXP != -125 || FLT_MAX_EXP != 128 || !defined INFINITY
#error "This program assumes IEEE-compliant binary floating point"
#endif

// Including subnormals but *not* including infinities and NaNs, IEEE
// singles can have exponents in the range [-149, 127].  Add two more
// for exact zero and overflow.
#define FLT_FULL_EXP_RANGE 279
#define FLT_FULL_EXP_BIAS  148

static inline void
bin_error_in_trig_equality(float x, unsigned long *error_hist)
{
  float s, c;
  int elog;

  s = sinf(x);
  c = cosf(x);
  elog = ilogbf(1.0f - (s*s + c*c));
  if (elog == FP_ILOGB0)
    error_hist[0]++;
  else if (elog == INT_MAX)
    error_hist[FLT_FULL_EXP_RANGE - 1]++;
  else
    error_hist[elog + FLT_FULL_EXP_BIAS]++;
}

int
main(void)
{
  // uint32_t is enough, because we are distributing fewer than
  // 2**32 values over these bins.
  uint32_t error_hist[FLT_FULL_EXP_RANGE] = { 0 };

  // Zero must be tested as a special case.
  bin_error_in_trig_equality(0.0f, error_hist);

  // Coarse grained parallel exhaustive test: each thread is responsible
  // for scanning the full mantissa range for one value of the exponent.
  // This covers all finite values except zero.
  #pragma omp parallel for reduction(+:error_hist)
  for (int i = -149; i <= 127; i++) {
    float lo = ldexpf(1.0f, i);
    float hi = ldexpf(1.0f, i+1);
    for (float x = lo; x < hi; x = nextafterf(x, hi)) {
      bin_error_in_trig_equality(x, error_hist);
      bin_error_in_trig_equality(-x, error_hist);
    }
  }

  // Dump the histogram in a convenient format for graphing.
  printf("bin,count\n");
  if (error_hist[0])
    printf("0.0,%"PRIu32"\n", error_hist[0]);
  for (int i = 1; i < FLT_FULL_EXP_RANGE; i++) {
    if (error_hist[i])
      printf("2^%d,%"PRIu32"\n", i - FLT_FULL_EXP_BIAS, error_hist[i]);
  }
  return 0;
}

当我运行该程序时（在x86-64上，使用GNU libc提供的数学库
2.24），这是它的输出，其中各列手动对齐：

bin,        count
0.0,   3633564531
2^-24,  553412596
2^-23,   91212952

[摘自Eric Postpischil]使用Apple LLVM 8.1.0，macOS 10.12.6和MacBookPro13,3运行相同的源将产生：

bin         count
0.0,   3744628459
2^-24,  431565466
2^-23,  101996154

错误总是在错误中并非偶然。
2-24或2-23格；它们对应于IEEE单精度尾数的最低两位。

该表的解释是：对于所有可能的有限单精度值中的约85％，同一性是精确的；对于另外的13％左右，错误不超过一个unit in the last place（通常缩写为“ ulp”）；对于其余2.1％的值，误差不超过4 ulps。

我认为这些对于通用数学库来说是可接受的错误率。 Transcendental functions are intrinsically hard to compute accurately;您需要使用任意精度的数学运算来对每个值进行完美舍入（而具有工业实力的计算机代数引擎将做到这一点）。

顺便说一句，该程序在当前一代但不是基于Intel的顶级笔记本电脑上仅需要12秒的挂钟时间，因此，自2014年以来观察到的“ There are only four billion floats, test them all”才变得更加真实。

关于c - 为什么我在代码中看到不一致之处？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/47397098/

上一篇：c++ - 静态本地与本地。 C 和 C++ 哪个更快？

下一篇：c - 为什么这段代码是正确的？

c - 如果在 OpenMP 中共享指针(以及指向...的指针)，将共享什么

c++ - TBB 数据对象是否可用于 OpenMP？

c - 为什么预处理器的这个输出看起来没有意义？

c - 在文件中间添加字符而不覆盖C中现有的字符

c - 如果该值存在于 C 中的数组函数中，则添加该值的数量

c++ - 如何在 std::cout 中既不以科学形式也不以定点表示法输出 float ？

c++ - C 与 C++ 调试器中的浮点表示法 (CLI)

c - 获取结构体数组内结构体的值

c - 优化嵌套数组值的代码