c++ - Memory error running my C++ code using Rcpp/RcppArmadillo: Can't identify exactly what is wrong after debugging with valgrind

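I have just started working with Rcpp and I am trying to implement Prim's algorithm. After a lot of help and some reading, I have a version that works well, except for simulated data with n=50 or n=1050 (with seed 1984).

My RStudio throws the "R Session Aborted" screen. From the terminal (I am using Linux Mint 18.3) I get: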

**** Error in `/usr/lib/R/bin/exec/R': double free or corruption (!prev): 0x00000000039f8690 ***

After looking into how to debug compiled code, I found:

@Dirk Eddelbuettel's explanation of gdb: http://dirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part5_packaging.pdf
Some material from @Kevin Ushey explaining valgrind: http://kevinushey.github.io/blog/2015/04/05/debugging-with-valgrind/

I have also read about lldb, but decided to use valgrind.

I prepared a cod_valgrind_test.R file in which I generate the data and compile my C++ file. All these files are in my GitHub repository (https://github.com/allanvc/test), but I decided to reproduce the code of the problematic file (prim_cpp_bug.cpp) here:

#include <iostream>
//#include <Rcpp.h>
#include <RcppArmadillo.h>

//using namespace Rcpp;
//using namespace arma;
using namespace std;


// [[Rcpp::depends(RcppArmadillo)]]


// [[Rcpp::export]]
Rcpp::List prim_cpp(arma::mat x)
{


    int V = x.n_cols;

    arma::uvec parent(V);
    parent.at(0) = 0;

    double max_value = x.max()+1;

    int v = 0;

    int idxmin_geral = 0;

    arma::uvec min_subnot;

    arma::mat new_m;

    arma::uvec from(V-1);
    //from.at(0) = 0;

    arma::uvec to(V-1);



    for(int i=0; i < V; i++)
    {
        // "deleting" the row for current vertex by setting the maximum to all entries in this row
        x.row(v).fill(max_value); // better than using loop

        // insert object x.col(v) at col i of new_m matrix
        new_m.insert_cols(i,x.col(v)); //see arma.sourceforge.net 

        //cout << new_m << endl;

        // obtain the minimum index from the selected columns
        idxmin_geral = new_m.index_min();

        // obtain the subscript notation from index based on reduced dimensions ***
        min_subnot = arma::ind2sub(arma::size(new_m.n_rows, new_m.n_cols),
                                    idxmin_geral);
        // *** adapted from @coatless
        // https://stackoverflow.com/questions/48045895/how-to-find-the-index-of-the-minimum-value-between-two-specific-columns-of-a-mat                                    

        v = min_subnot.at(0);
        parent.at(i+1) = v; // -----> !! this is line 61 <-----

        to.at(i) = min_subnot.at(0); // -----> !! this is line 63 <-----
        from.at(i) = parent.at(min_subnot.at(1)); // -----> !! this is line 64 <-----

        // "deleting" the row for current vertex by setting maximum to all entries in this row
        // now, in the new matrix
        new_m.row(v).fill(max_value); //better than using loop

    }
    /*
     * add 1 to the final vectors - preparing R output
    */
    return Rcpp::List::create(
    Rcpp::Named("dist",x),
    Rcpp::Named("parent",parent),
    Rcpp::Named("from",from+1),
    Rcpp::Named("to",to+1)
    );
}

In my tests, strangely, only for n=50 and n=1050 does valgrind report 3 errors when executing the function prim_cpp() from my prim_cpp_bug.cpp file.

Output in the terminal from running R -d valgrind -f cod_valgrind_test.R:

==36904== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

The problematic lines seem to be:

61, 63 and 64

==36904== Invalid write of size 4
==36904==    at 0x12315F5D: prim_cpp(arma::Mat) (prim_cpp_bug.cpp:61)

(...)

==36904== Invalid write of size 4
==36904==    at 0x12315F63: prim_cpp(arma::Mat) (prim_cpp_bug.cpp:63)

(...)

==36904== Invalid write of size 4
==36904==    at 0x12315F74: prim_cpp(arma::Mat) (prim_cpp_bug.cpp:64)

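It seems I am doing some wrong memory allocation for my vectors, possibly when indexing. I think I have read that some people do not recommend using .at(), but I am not sure. Since I do not have enough knowledge of C++ and Rcpp/RcppArmadillo to solve this, I would appreciate any help.

My sessionInfo() is as follows: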

R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.3

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=pt_BR.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=pt_BR.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=pt_BR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.4.3 tools_3.4.3    yaml_2.1.16

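Best answer

Since parent is initialized with V elements, its valid indices are 0 to V-1 (indexing starts from 0). The index therefore goes out of bounds in the last iteration, where i+1 equals V at line 61. The writes at lines 63 and 64 overrun in the same iteration, because to and from hold only V-1 elements while i reaches V-1.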
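One way to avoid the overrun is to run the loop once per edge (V-1 times) rather than once per vertex. The following is only a minimal sketch of that idea in plain C++ with std::vector, not the asker's Armadillo code. The names Edge, prim_edges, best and link are made up for illustration, and the sketch assumes a square, symmetric distance matrix describing a connected graph.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// One tree edge with 0-based vertex indices (add 1 before handing it to R).
struct Edge {
    std::size_t from;
    std::size_t to;
};

// Sketch of Prim's algorithm on a dense distance matrix.
// Assumption: dist is square, symmetric, and the graph is connected.
std::vector<Edge> prim_edges(const std::vector<std::vector<double>>& dist)
{
    const std::size_t V = dist.size();
    for (const auto& row : dist) assert(row.size() == V); // must be square

    std::vector<Edge> edges;
    if (V < 2) return edges;               // 0 or 1 vertices: no edges
    edges.reserve(V - 1);

    const double inf = std::numeric_limits<double>::infinity();
    std::vector<bool> in_tree(V, false);
    std::vector<double> best(V, inf);      // cheapest known link into the tree
    std::vector<std::size_t> link(V, 0);   // tree vertex providing that link

    // Start the tree at vertex 0.
    in_tree[0] = true;
    for (std::size_t j = 1; j < V; ++j) best[j] = dist[0][j];

    // Exactly V - 1 iterations: one per edge, not one per vertex.
    for (std::size_t e = 0; e + 1 < V; ++e) {
        // Pick the vertex outside the tree with the cheapest link.
        std::size_t next = V;              // sentinel: nothing chosen yet
        for (std::size_t j = 0; j < V; ++j)
            if (!in_tree[j] && (next == V || best[j] < best[next])) next = j;

        edges.push_back({link[next], next});
        in_tree[next] = true;

        // Relax the links of the remaining vertices through the new one.
        for (std::size_t j = 0; j < V; ++j)
            if (!in_tree[j] && dist[next][j] < best[j]) {
                best[j] = dist[next][j];
                link[j] = next;
            }
    }
    return edges;
}
```

Because the result is built with push_back, it never holds more than V-1 edges, so no index arithmetic on the output can run past the end.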

The fact that it does not go wrong in every case is not surprising, since in many cases the code manages to pick up some random stuff from memory anyway. So luckily there were a few errors, otherwise the result might simply have been wrong without anyone noticing...

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/48225516/
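Regarding the doubt about .at(): in Armadillo, .at(i) skips the bounds check, whereas element access with parentheses, as in parent(i+1), is bounds-checked unless ARMA_NO_DEBUG is defined. A checked access turns a silent overrun into a reproducible error. The sketch below replays the index pattern of lines 61, 63 and 64 with std::vector, whose .at() does check bounds (the opposite of Armadillo's). The helper first_bad_iteration is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the question's code.
// Replays the writes of prim_cpp() for n_iter loop iterations and returns
// the first loop index i at which a write is out of range, or -1 if every
// write is valid.
int first_bad_iteration(std::size_t V, std::size_t n_iter)
{
    assert(V >= 1); // V - 1 below must not wrap around

    // Same sizes as in the question: parent has V elements, from/to have V-1.
    std::vector<unsigned> parent(V), from(V - 1), to(V - 1);

    for (std::size_t i = 0; i < n_iter; ++i) {
        try {
            parent.at(i + 1) = 0; // corresponds to line 61
            to.at(i) = 0;         // corresponds to line 63
            from.at(i) = 0;       // corresponds to line 64
        } catch (const std::out_of_range&) {
            return static_cast<int>(i); // std::vector::at() detected the overrun
        }
    }
    return -1;
}
```

With V = 5, first_bad_iteration(5, 5) returns 4, the last iteration of the original loop, while first_bad_iteration(5, 4) returns -1 because V-1 iterations stay within all three vectors.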
