c++ - Memory corruption on allocation

Tags: c++ memory memory-corruption

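I'm having trouble allocating memory for one of my data structures. It always falls over, but not always in the same place. I suspect I'm trying to allocate on top of something that already exists, but I'm really not sure how to tell what's actually happening or how to fix it. I tried installing valgrind, but it doesn't support Mac OS 10.10 yet.

This is the code that calls the function.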

stet::file f1;
f1.set_path("test/longfile1.txt"); // a file with almost 2 million lines
f1.read();

std::string all_text = f1.get_contents();

std::vector<chunk *> chunks = populate_chunks(all_text);

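These are my data structures. The idea is to split the text in the file into fixed-size chunks that are filled to at most 75% of their capacity, but I can't seem to create all of the chunks.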

struct line {
    std::string text;
};

struct chunk {
    line *lines[MAX_CHUNK_SIZE];
};

This is the cause of my nightmares. It falls over on the line just below all the comments.

std::vector<chunk *> populate_chunks(std::string &text) {

    std::vector<std::string> all_lines; 
    boost::split(all_lines, text, boost::is_any_of("\n"));
    size_t num_lines = all_lines.size();

    std::vector<chunk *> chunks = std::vector<chunk *>( (num_lines / START_CHUNK_SIZE) * 2 );

    size_t next_line_num;

    for(size_t line_num = 0; line_num < num_lines; line_num = next_line_num) {
        next_line_num = line_num + START_CHUNK_SIZE;

        std::cout << line_num << std::endl;

        chunk *c = new chunk;
        chunks.push_back(c);

        // This always falls over, but not always at the same point in the file.
        // Never seems to be the first time. Observed range: 3072 - 59904
        // Error always looks something like this:
        // text(71184,0x7fff77699300) malloc: *** error for object 0x7ff389006208: incorrect checksum for freed object - object was probably modified after being freed.
        // *** set a breakpoint in malloc_error_break to debug

        for(size_t i = 0; i < next_line_num; ++i) {
            line *l = new line;
            l->text = all_lines[line_num+i];
            c->lines[i] = l;
        }
    }

    return chunks;
}

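If anyone has any ideas, they would be much appreciated. I should point out that I'm fairly new to C++, so it's quite likely I've missed something really silly.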

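Update:

I've amended the code based on the comments I received:

  • Changed the function's return value to chunks rather than pointers
  • Stopped specifying a size for the all_lines vector on creation, letting boost sort it out
  • I've also got a Fedora VM up and running so I can run it through valgrind, but I'm quite confused by the output.
  • Noted the MAX_CHUNK_SIZE and START_CHUNK_SIZE values below.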

Macro values:

#define MAX_CHUNK_SIZE 1024
#define START_CHUNK_SIZE MAX_CHUNK_SIZE * 0.75
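One detail in these macros is worth checking. Because the body of START_CHUNK_SIZE has no parentheses, an expression such as `num_lines / START_CHUNK_SIZE` expands to `num_lines / 1024 * 0.75`, so the division happens first, and the result is a double rather than an integer. The sketch below illustrates the difference; `kMaxChunkSize`, `kStartChunkSize`, and the two helper functions are hypothetical names introduced here, not part of the original code.

```cpp
#include <cstddef>

#define MAX_CHUNK_SIZE 1024
#define START_CHUNK_SIZE MAX_CHUNK_SIZE * 0.75  // as posted: no parentheses

// Hypothetical typed replacements (an assumption, not from the original post).
constexpr std::size_t kMaxChunkSize = 1024;
constexpr std::size_t kStartChunkSize = kMaxChunkSize * 3 / 4;  // 768

// Expands to n / 1024 * 0.75: integer division first, then a double multiply.
inline double chunks_via_macro(std::size_t n) {
    return n / START_CHUNK_SIZE;
}

// Plain integer division by 768, which is presumably what was intended.
inline std::size_t chunks_via_constant(std::size_t n) {
    return n / kStartChunkSize;
}
```

For 7680 lines, the macro form evaluates to 5.25 while the constant form gives 10. This is not the cause of the crash, but it does affect the size computed for the chunks vector.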

Valgrind output after the above changes:

==24468== Memcheck, a memory error detector
==24468== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24468== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==24468== Command: bin/text
==24468== 
==24468== Invalid write of size 8
==24468==    at 0x402907: populate_chunks(std::string&) (text_storage.cc:125)
==24468==    by 0x402ADF: main (text_storage.cc:173)
==24468==  Address 0x216b5640 is 0 bytes after a block of size 8,192 alloc'd
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x402888: populate_chunks(std::string&) (text_storage.cc:113)
==24468==    by 0x402ADF: main (text_storage.cc:173)
==24468== 
==24468== 
==24468== Process terminating with default action of signal 11 (SIGSEGV)
==24468==  Access not within mapped region at address 0x37D77000
==24468==    at 0x402907: populate_chunks(std::string&) (text_storage.cc:125)
==24468==    by 0x402ADF: main (text_storage.cc:173)
==24468==  If you believe this happened as a result of a stack
==24468==  overflow in your program's main thread (unlikely but
==24468==  possible), you can try to increase the size of the
==24468==  main thread stack using the --main-stacksize= flag.
==24468==  The main thread stack size used in this run was 8388608.
==24468== 
==24468== HEAP SUMMARY:
==24468==     in use at exit: 371,641,698 bytes in 6,241,143 blocks
==24468==   total heap usage: 6,241,190 allocs, 47 frees, 656,880,685 bytes allocated
==24468== 
==24468== 16 bytes in 2 blocks are possibly lost in loss record 1 of 11
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x4028BC: populate_chunks(std::string&) (text_storage.cc:123)
==24468==    by 0x402ADF: main (text_storage.cc:173)
==24468== 
==24468== 43 bytes in 1 blocks are possibly lost in loss record 2 of 11
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x5340048: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x5341900: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x5341D37: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x4029F7: main (text_storage.cc:138)
==24468== 
==24468== 35,727,800 (33,173,592 direct, 2,554,208 indirect) bytes in 4,146,699 blocks are definitely lost in loss record 8 of 11
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x4028BC: populate_chunks(std::string&) (text_storage.cc:123)
==24468==    by 0x402ADF: main (text_storage.cc:173)
==24468== 
==24468== 93,350,023 bytes in 1 blocks are possibly lost in loss record 9 of 11
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x5340048: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x5340235: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x53403C5: std::string::_M_leak_hard() (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x5340412: std::string::begin() (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x407728: boost::range_iterator<std::string>::type boost::range_detail::range_begin<std::string>(std::string&) (begin.hpp:49)
==24468==    by 0x40705D: boost::range_iterator<std::string>::type boost::range_adl_barrier::begin<std::string>(std::string&) (begin.hpp:108)
==24468==    by 0x4066FC: __gnu_cxx::__normal_iterator<char*, std::string> boost::iterator_range_detail::iterator_range_impl<__gnu_cxx::__normal_iterator<char*, std::string> >::adl_begin<std::string>(std::string&) (iterator_range_core.hpp:58)
==24468==    by 0x40601A: boost::iterator_range<__gnu_cxx::__normal_iterator<char*, std::string> >::iterator_range<std::string>(std::string&, boost::iterator_range_detail::range_tag) (iterator_range_core.hpp:207)
==24468==    by 0x40561F: boost::iterator_range<boost::range_iterator<std::string>::type> boost::make_iterator_range<std::string>(std::string&) (iterator_range_core.hpp:559)
==24468==    by 0x404BC3: boost::iterator_range<boost::range_iterator<std::string>::type> boost::range_detail::make_range<std::string>(std::string&, long) (as_literal.hpp:93)
==24468==    by 0x4040E5: boost::iterator_range<boost::range_iterator<std::string>::type> boost::as_literal<std::string>(std::string&) (as_literal.hpp:102)
==24468== 
==24468== 93,351,904 bytes in 1 blocks are possibly lost in loss record 10 of 11
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x5340048: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x5341710: char* std::string::_S_construct<char*>(char*, char*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x531F9A7: std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::str() const (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x402523: stet::file::read() (file.cc:50)
==24468==    by 0x402A2E: main (text_storage.cc:139)
==24468== 
==24468== 129,441,960 bytes in 1,520,226 blocks are possibly lost in loss record 11 of 11
==24468==    at 0x4C27965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24468==    by 0x5340048: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
==24468==    by 0x40845E: char* std::string::_S_construct<__gnu_cxx::__normal_iterator<char*, std::string> >(__gnu_cxx::__normal_iterator<char*, std::string>, __gnu_cxx::__normal_iterator<char*, std::string>, std::allocator<char> const&, std::forward_iterator_tag) (basic_string.tcc:138)
==24468==    by 0x4082E8: char* std::string::_S_construct_aux<__gnu_cxx::__normal_iterator<char*, std::string> >(__gnu_cxx::__normal_iterator<char*, std::string>, __gnu_cxx::__normal_iterator<char*, std::string>, std::allocator<char> const&, std::__false_type) (basic_string.h:1725)
==24468==    by 0x408177: char* std::string::_S_construct<__gnu_cxx::__normal_iterator<char*, std::string> >(__gnu_cxx::__normal_iterator<char*, std::string>, __gnu_cxx::__normal_iterator<char*, std::string>, std::allocator<char> const&) (basic_string.h:1746)
==24468==    by 0x407FDA: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<__gnu_cxx::__normal_iterator<char*, std::string> >(__gnu_cxx::__normal_iterator<char*, std::string>, __gnu_cxx::__normal_iterator<char*, std::string>, std::allocator<char> const&) (basic_string.tcc:229)
==24468==    by 0x407D6A: std::string boost::copy_range<std::string, boost::iterator_range<__gnu_cxx::__normal_iterator<char*, std::string> > >(boost::iterator_range<__gnu_cxx::__normal_iterator<char*, std::string> > const&) (iterator_range_core.hpp:643)
==24468==    by 0x407AEA: boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >::operator()(boost::iterator_range<__gnu_cxx::__normal_iterator<char*, std::string> > const&) const (util.hpp:97)
==24468==    by 0x407395: boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default>::dereference() const (transform_iterator.hpp:121)
==24468==    by 0x406B72: boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default>::reference boost::iterator_core_access::dereference<boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default> >(boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default> const&) (iterator_facade.hpp:514)
==24468==    by 0x40633D: boost::iterator_facade<boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default>, std::string, boost::forward_traversal_tag, std::string, long>::operator*() const (iterator_facade.hpp:639)
==24468==    by 0x405895: void std::vector<std::string, std::allocator<std::string> >::_M_range_initialize<boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default> >(boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default>, boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, __gnu_cxx::__normal_iterator<char*, std::string> >, boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char*, std::string> >, boost::use_default, boost::use_default>, std::input_iterator_tag) (stl_vector.h:1188)
==24468== 
==24468== LEAK SUMMARY:
==24468==    definitely lost: 33,173,592 bytes in 4,146,699 blocks
==24468==    indirectly lost: 2,554,208 bytes in 319,276 blocks
==24468==      possibly lost: 316,143,946 bytes in 1,520,231 blocks
==24468==    still reachable: 19,769,952 bytes in 254,937 blocks
==24468==         suppressed: 0 bytes in 0 blocks
==24468== Reachable blocks (those to which a pointer was found) are not shown.
==24468== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==24468== 
==24468== For counts of detected and suppressed errors, rerun with: -v
==24468== ERROR SUMMARY: 4146708 errors from 7 contexts (suppressed: 2 from 2)

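Best answer

The lines array in chunk has MAX_CHUNK_SIZE elements, but you access it far out of bounds on every iteration except the first. Your loop is for(size_t i = 0; i < next_line_num; ++i); guess what next_line_num is on your second (and subsequent) iterations?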
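To make the growth of that loop bound concrete, here is a minimal sketch that computes the bound for a given outer iteration. It assumes START_CHUNK_SIZE works out to 768 (75% of 1024); the names `kMaxChunkSize`, `kStartChunkSize`, `inner_loop_bound`, and `overruns_chunk` are hypothetical and introduced only for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kMaxChunkSize = 1024;   // MAX_CHUNK_SIZE in the question
constexpr std::size_t kStartChunkSize = 768;  // assumed: 75% of MAX_CHUNK_SIZE

// Upper bound of the question's inner loop for a 0-based outer iteration:
// line_num = iteration * 768 and next_line_num = line_num + 768.
constexpr std::size_t inner_loop_bound(std::size_t iteration) {
    return iteration * kStartChunkSize + kStartChunkSize;
}

// True when the inner loop would write past lines[kMaxChunkSize - 1].
constexpr bool overruns_chunk(std::size_t iteration) {
    return inner_loop_bound(iteration) > kMaxChunkSize;
}

static_assert(!overruns_chunk(0), "first chunk: i stays below 768");
static_assert(overruns_chunk(1), "second chunk: i runs up to 1536, past 1024 slots");
```

The Valgrind report in the question is consistent with this: an 8-byte write landing 0 bytes after an 8,192-byte block is what a write to lines[1024] would look like if pointers are 8 bytes wide (1024 x 8 = 8,192).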

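You might have avoided this problem entirely if you had thought about another one you overlooked. You only partially fill each chunk (75%), which makes sense. But on the last iteration you may have fewer lines left than are needed to fill a chunk to 75%. So there should be a test somewhere to handle that boundary: a comparison against num_lines somewhere in that loop. Thinking about where to put it might (but would not necessarily) have alerted you that your iteration index wasn't doing what you expected.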
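One way to express that boundary test is to compute how many lines the current chunk should receive before entering the inner loop. This is only a sketch: `lines_in_chunk` and `kStartChunkSize` are hypothetical names, and the value 768 assumes 75% of a 1024-slot chunk.

```cpp
#include <algorithm>
#include <cstddef>

constexpr std::size_t kStartChunkSize = 768;  // assumed: 75% of MAX_CHUNK_SIZE

// Number of lines that belong in the chunk starting at line_num.
// Precondition (assumed by this sketch): line_num <= num_lines.
inline std::size_t lines_in_chunk(std::size_t line_num, std::size_t num_lines) {
    // Either a full 75% fill, or whatever is left at the end of the file.
    return std::min(kStartChunkSize, num_lines - line_num);
}
```

With 2000 lines in total, the chunks starting at lines 0 and 768 each get 768 lines, and the chunk starting at line 1536 gets the remaining 464.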

Try for(size_t i = 0; i < START_CHUNK_SIZE && line_num+i < num_lines; ++i).
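Putting that loop condition into context, a possible reworking of populate_chunks might look like the sketch below. It is not the asker's final code. It makes several assumptions that are labeled in the comments: typed constants replace the macros, `std::getline` stands in for `boost::split` so the example has no external dependency, and chunks own their lines by value instead of through raw pointers. It also uses `reserve` rather than constructing the vector with a size, since a sized vector followed by `push_back` leaves default (null) entries at the front.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Assumed typed equivalents of MAX_CHUNK_SIZE and START_CHUNK_SIZE.
constexpr std::size_t kMaxChunkSize = 1024;
constexpr std::size_t kStartChunkSize = kMaxChunkSize * 3 / 4;  // 768

struct line {
    std::string text;
};

// Assumption: a vector of values replaces the fixed array of raw pointers.
struct chunk {
    std::vector<line> lines;
};

std::vector<chunk> populate_chunks(const std::string &text) {
    // std::getline is a stand-in for boost::split; it treats a trailing
    // newline slightly differently (no empty final entry).
    std::vector<std::string> all_lines;
    std::istringstream in(text);
    for (std::string s; std::getline(in, s); ) {
        all_lines.push_back(s);
    }

    const std::size_t num_lines = all_lines.size();
    std::vector<chunk> chunks;
    chunks.reserve(num_lines / kStartChunkSize + 1);

    for (std::size_t line_num = 0; line_num < num_lines; line_num += kStartChunkSize) {
        chunk c;
        c.lines.reserve(kMaxChunkSize);  // leave 25% headroom for later inserts
        // i is bounded by the fill size AND by the end of the input.
        for (std::size_t i = 0; i < kStartChunkSize && line_num + i < num_lines; ++i) {
            c.lines.push_back(line{all_lines[line_num + i]});
        }
        chunks.push_back(std::move(c));
    }
    return chunks;
}
```

Called on a text of 2000 newline-terminated lines, this produces three chunks holding 768, 768, and 464 lines. Because every allocation is owned by a vector, nothing needs to be deleted by hand.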

Regarding c++ - Memory corruption on allocation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26547437/
