c++ - Summing huge numbers

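Here is the situation: I really don't understand what goes wrong when I add up a large number of values in order to compute their average at the end.

If there is a specific error in the question that needs editing, please point it out.

I have stepped through the code in a debugger. The data going into the loop below looks normal, but the variable "somme" seems to end up with random values that are completely wrong. The same goes for "moyenne".

Also, all the data are either 0 or positive, yet somme is sometimes negative!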

#define Nb 230400
std::vector<std::array<double,480>> data(480);

    double somme=0;
    double moyenne=0;
    for (int i=0;i<480;i++)
    {
        for (int j=0;j<480;j++)
            somme=somme+data[i][j];

    }
    moyenne=somme/Nb;

Best answer

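First, with the code you've provided, there is no way you can get a negative result (at least with the IEEE floating point used on PCs and the usual Unix machines). If the sum overflows, you get the special value Inf (but it cannot overflow if the data are within the range you specify). The results may be wrong because of rounding errors, but they will still have a lower bound of 0.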
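A minimal sketch of that claim, assuming IEEE 754 doubles with the default rounding mode: a sum of non-negative values can saturate to +Inf, but it does not wrap around to a negative number. The names `sum_nonneg` and `overflow_goes_to_inf` are made up for this illustration.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: plain left-to-right sum of a vector of doubles.
double sum_nonneg(const std::vector<double>& v)
{
    double s = 0.0;
    for (double x : v) {
        s += x;
    }
    return s;
}

// Adding two copies of the largest finite double overflows.
// Under IEEE 754 the result is +Inf, not a negative number.
bool overflow_goes_to_inf()
{
    const double big = std::numeric_limits<double>::max();
    const double s = sum_nonneg({big, big});
    return std::isinf(s) && s > 0.0;
}
```

If `somme` really is negative, the cause therefore has to be outside this arithmetic: the input data, the way the value is observed, or memory corruption.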

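You haven't said how you determined that the result was negative, nor how you made sure that the input data are in range, so I can only speculate. The following are distinct possibilities: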

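You are compiling with optimization enabled and looking at the values in a debugger. Debuggers often display wrong values (uninitialized memory) when looking at optimized code.
You have undefined behavior elsewhere (a pointer problem) that corrupts the memory you are looking at here. 99% of the time this is the explanation for otherwise unexplainable behavior, but I am somewhat dubious here: provided there is nothing else in the sequence of code you posted, and no other threads are running, there are no pointers (at least none that you manipulate) to misuse.

You have failed to initialize the data correctly. You might want to add an assertion in the innermost loop, just to be sure: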

    for ( int i = 0; i < 480; ++ i ) {
        for ( int j = 0; j < 480; ++ j ) {
            assert( data[i][j] >= 0.0 && data[i][j] < 200000.0 );
            somme += data[i][j];
        }
    }
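One caveat: `assert` is compiled out when `NDEBUG` is defined, which is typical for optimized builds. As a sketch of a check that stays active in such builds, the hypothetical helper below counts the elements that fall outside `[0, limit)`. The name `count_out_of_range` and the `limit` parameter are assumptions for illustration; the value 200000.0 is the bound used in the assertion above.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts elements outside [0, limit), including NaN.
// Unlike assert, it is not removed when NDEBUG is defined.
template <std::size_t Cols>
std::size_t count_out_of_range(
    const std::vector<std::array<double, Cols>>& data, double limit)
{
    std::size_t bad = 0;
    for (const auto& row : data) {
        for (double x : row) {
            // The negated form also catches NaN, because every
            // ordered comparison involving NaN is false.
            if (!(x >= 0.0 && x < limit)) {
                ++bad;
            }
        }
    }
    return bad;
}
```

Calling `count_out_of_range(data, 200000.0)` just before the summation loop and printing the result avoids relying on what the debugger shows for optimized code.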

For the rest, your algorithm isn't particularly accurate. Some quick tests (filling your data structure with random values in the range [0...2e5)) show less than 15 digits accuracy in the final result. (Of course, this may be acceptable. Most physical data that you acquire won't have more than 3 or 4 digits accuracy anyway, and you may not be displaying more than 6. In which case...)

The accuracy issue is actually curious, and shows just how tricky these things can be. I used three functions for my tests:

//  Basically what you did...
double
av1( std::vector<std::array<double, cols>> const& data )
{
    double somme = 0.0;
    for ( int i = 0; i != data.size(); ++ i ) {
        for ( int j = 0; j != cols; ++j ) {
            somme += data[i][j];
        }
    }
    return somme / (data.size() * cols);
}

//  The natural way of writing it in C++11...
double
av2( std::vector<std::array<double, cols>> const& data )
{
    return std::accumulate( 
        data.begin(),
        data.end(),
        0.0,
        []( double a, std::array<double, cols> const& b ) {
            return a + std::accumulate( b.begin(), b.end(), 0.0 );
        } ) / (data.size() * cols);
}

//  Using the Kahan summation algorithm...
double
av3( std::vector<std::array<double, cols>> const& data )
{
    double somme = 0.0;
    double c = 0.0;
    for ( int i = 0; i != data.size(); ++ i ) {
        for ( int j = 0; j != cols; ++j ) {
            double y = data[i][j] - c;
            double t = somme + y;
            c = (t - somme) - y;
            somme = t;
        }
    }
    return somme / (data.size() * cols);
}

(In all tests, cols == 480 and data.size() == 480.)
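The answer does not show how the tests were driven. The following is only a sketch of one way to reproduce them, under several assumptions that are not from the original: a `std::mt19937` generator with a fixed seed, a Kahan sum carried out in `long double` as the reference value, and the names `Grid`, `kCols`, `make_random_grid`, `mean_naive`, `mean_reference` and `relative_error_naive`. Note that with VC++ `long double` has the same precision as `double`, so the reference is only more precise on platforms with a wider `long double`.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

constexpr std::size_t kCols = 480;  // matches cols == 480 in the tests above
using Grid = std::vector<std::array<double, kCols>>;

// Fill a 480 x 480 grid with uniform random values in [0, 2e5).
// Generator and seed are assumptions; the answer does not name them.
Grid make_random_grid(unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 2e5);
    Grid data(kCols);
    for (auto& row : data)
        for (double& x : row)
            x = dist(gen);
    return data;
}

// Single accumulator across all elements (the same approach as av1).
double mean_naive(const Grid& data)
{
    double sum = 0.0;
    for (const auto& row : data)
        for (double x : row)
            sum += x;
    return sum / static_cast<double>(data.size() * kCols);
}

// Kahan-compensated mean in long double, used only as a reference value.
long double mean_reference(const Grid& data)
{
    long double sum = 0.0L;
    long double c = 0.0L;  // running compensation for lost low-order bits
    for (const auto& row : data)
        for (double x : row) {
            long double y = x - c;
            long double t = sum + y;
            c = (t - sum) - y;
            sum = t;
        }
    return sum / static_cast<long double>(data.size() * kCols);
}

// Relative error of the naive mean against the reference.
double relative_error_naive(const Grid& data)
{
    const long double ref = mean_reference(data);
    return static_cast<double>(std::fabs((mean_naive(data) - ref) / ref));
}
```

Printing `relative_error_naive(make_random_grid(seed))` for a few seeds gives a rough idea of how many digits the single-accumulator loop preserves. The compensation step only works if the compiler does not reassociate floating-point operations, so options such as `/fp:fast` or `-ffast-math` should be avoided for this kind of test.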

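The code was compiled with VC11, using the option /O2. The interesting thing is that av2 was systematically more accurate than your code, usually accurate down to the 17th digit (2 or 3 bits in the internal representation), whereas av1 was often off in the 15th digit (8 or 9 bits in the internal representation). The reason is that your code systematically accumulates into xmm1 across all 480*480 values, whereas av2 accumulates each row separately; this results in fewer additions between operands of very different magnitude. (As av1 approaches the end of the data, somme approaches 2.3e10, which is much larger than any of the data elements.) Using something like: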

double
moyenne( std::vector<std::array<double, cols>> const& data )
{
    double outerSum = 0.0;
    for ( int i = 0; i != data.size(); ++ i ) {
        double innerSum = 0.0;
        for ( int j = 0; j != cols; ++ j ) {
            innerSum += data[i][j];
        }
        outerSum += innerSum;
    }
    return outerSum / (data.size() * cols);
}

should give results equivalent to av2. (But if you need the accuracy, you really should use the Kahan summation algorithm.)
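The effect of the magnitude gap can be seen in isolation. The sketch below assumes IEEE 754 doubles evaluated in double precision (as with SSE registers on x86-64); the helper names `spacing_above` and `addend_is_lost` are made up for illustration. Near 2.3e10 adjacent doubles are 2^-18 (about 3.8e-6) apart, so whatever part of an addend falls below roughly half that spacing is rounded away on every addition.

```cpp
#include <cmath>
#include <limits>

// Distance from a positive x to the next representable double above it.
double spacing_above(double x)
{
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}

// True when adding `small` to `big` leaves `big` unchanged after rounding.
bool addend_is_lost(double big, double small)
{
    // volatile forces the sum to be stored as a double before comparing.
    volatile double sum = big + small;
    return sum == big;
}
```

For example, `addend_is_lost(2.3e10, 1e-6)` is true, while `addend_is_lost(480.0, 1e-6)` is false. Summing each row separately keeps the accumulator closer in magnitude to the values being added, which is why the row-wise version loses fewer bits.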

(I'm tempted to add that if any of this surprises you, you shouldn't be using floating point anyway.)

For c++ - Summing huge numbers, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17543807/
