Address returned by calloc causes my program to segfault

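First of all, this is the strangest bug I have ever seen, and I have no idea what is going on. Any help figuring out what is happening would be greatly appreciated.

I am writing a C program that reads files into dynamically allocated blocks and performs various operations on those blocks (encryption, decryption, MACing, and so on). When I run it on certain (larger?) files, it segfaults. I figured I must be stomping on memory somewhere or not allocating it correctly, so I ran it under valgrind to find out what was going on, and the problem disappeared. Valgrind reported no errors, and the program did not segfault and ran as expected.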

normal run
./threefizer -e -p 1234567 new_name.docx
...
[1]    25017 segmentation fault  ./threefizer -e -p 1234567 new_name.docx

valgrind run
valgrind ./threefizer -e -p 1234567 new_name.docx
==25238== Memcheck, a memory error detector
==25238== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==25238== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==25238== Command: ./threefizer -e -p 1234567 new_name.docx
==25238== 
...
Threefizer operation complete
==25238== 
==25238== HEAP SUMMARY:
==25238==     in use at exit: 0 bytes in 0 blocks
==25238==   total heap usage: 25 allocs, 25 frees, 977,995 bytes allocated
==25238== 
==25238== All heap blocks were freed -- no leaks are possible
==25238== 
==25238== For counts of detected and suppressed errors, rerun with: -v
==25238== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

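What is going on here? I have seen valgrind prevent a segfault before, simply because it allocates memory differently inside its own virtual machine. But in those cases valgrind always reported at least one error of some kind.

Digging deeper with gdb, I found that the segfault always happens when my program tries to modify the contents of a block that was filled from the file being read.

My file-reading function looks like this. Does anyone see anything wrong with it? Functionally it seems fine.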

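uint8_t* readBlock(const uint64_t data_size, FILE* read)
{
    pdebug("readBlock()\n");
    if(ferror(read))
    {
        fclose(read);
        perror("Error reading block");
        return NULL;
    }

    // data is filled by fread() and later freed, so it must not be const
    uint8_t* data = calloc(data_size, sizeof(uint8_t));
    if(data == NULL)
    {
        perror("Unable to allocate block");
        return NULL;
    }

    const size_t size = fread(data, sizeof(uint8_t), data_size, read);

    if(ferror(read))
    {
        free(data); // do not leak the block on a read error
        fclose(read);
        perror("Error reading block");
        return NULL;
    }

    if(size == data_size)
    {
        return data;
    }

    // a short read without ferror() means end of file, so errno is not meaningful
    fprintf(stderr, "Unable to read requested number of bytes\n");
    free(data);
    return NULL;
}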

A gdb session showing the read that causes my program to segfault.

***Before read***
queueFile()
Breakpoint 1, queueFile (args=0x7fffffffe460, out=0x62c030)
at controller.c:172
//this is where the data is being read and allocated
172                       data_chunk->data = pad(readBlock(orig_file_size, read),
(gdb) print data_chunk->data
$1 = (uint64_t *) 0x0 //the brand new pointer is null as it should be with nothing allocated

***After read***
(gdb) next
readBlock()
176                       data_chunk->data_size = getPadSize(orig_file_size, args->state_size);
(gdb) print data_chunk->data
$4 = (uint64_t *) 0xfffffffff7ee7010 //why is this memory address so big?
(gdb) print data_chunk->data[0]
Cannot access memory at address 0xfffffffff7ee7010 //can't access memory WTF?
//When this pointer is getting passed to anything else that attempts to access or modify it then the program segfaults

When the program tries to do anything with the data from readBlock(), it segfaults. In this case it is trying to encrypt it.

***Interestingly the address that is causing the program to segfault is 12 hex digits instead of the regular 6 why?***
Program received signal SIGSEGV, Segmentation fault.
0x0000000000402413 in cbc512Encrypt (key=0x62be78 <runThreefizer.tf_key>, 
iv=0x62c2b0, plain_text=0xfffffffff7ee7010, num_blocks=7611) at cbc.c:206
206     plain_text[0] ^= iv[0]; plain_text[1] ^= iv[1]; plain_text[2] ^= iv[2]; plain_text[3] ^= iv[3];

Using gdb, when I access that same memory inside readBlock(), I can read it just fine, and it contains the correct contents of the file being read.

***GDB session showing the readBlock() that runs before segfault***
(gdb) continue
Continuing.
readBlock()

Breakpoint 2, readBlock (data_size=487073, read=0x62c360) at fileIO.c:40
40          return data;
(gdb) print data
$7 = (uint8_t *) 0x7ffff7f5e010 "PK\003\004\024" //again whats with the giant memory address all the other ones are only 6 hex digits
(gdb) print size
$8 = 487073
(gdb) print data[0] //as you can see we can access the data just fine and its contents correspond to the first 8 characters of the file that was read
$9 = 80 'P'
(gdb) print data[1]
$10 = 75 'K'
(gdb) print data[2]
$11 = 3 '\003'
(gdb) print data[3]
$12 = 4 '\004'
(gdb) print data[4]
$13 = 20 '\024'
(gdb) print data[5]
$14 = 0 '\000'
(gdb) print data[6]
$15 = 6 '\006'
(gdb) print data[7]
$16 = 0 '\000'
(gdb) 

***The same memory address is inaccessible from the function that calls readBlock()***
(gdb) break controller.c:173
Breakpoint 3 at 0x4038fa: file controller.c, line 173.
(gdb) continue
Continuing.

Breakpoint 3, queueFile (args=0x7fffffffe460, out=0x62c030)
at controller.c:176
176                       data_chunk->data_size = getPadSize(orig_file_size, args->state_size);
(gdb) print data_chunk->data
$17 = (uint64_t *) 0xfffffffff7ee7010 //this is the same address as data in readBlock() and also returned by readBlock()
(gdb) print data_chunk->data[0] 
Cannot access memory at address 0xfffffffff7ee7010 //WHY NOT!?

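So there you have it. Does anyone know what is going on?

Best answer

Can I take a guess? At the point where readBlock() is called, are you sure a prototype for readBlock is in scope? The reason I ask is that inside readBlock your data appears to have the address 0x7ffff7f5e010, but in the caller it seems to have changed to something like 0xfffffffff7ee7010. That is what would happen if the caller thinks readBlock returns an int (that is, there is no prototype).

I copied this from my comment since it solved your problem. By the way, +1 for being one of the few people who use a debugger and try to isolate the problem with it.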
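
To illustrate the diagnosis above, here is a minimal sketch of what likely happens to the pointer. It assumes a typical x86-64 Linux build where int is 32 bits and pointers are 64 bits. The helper names through_implicit_int and show_round_trip are hypothetical and not part of the original program. Without a prototype, a pre-C99-style compiler assumes the function returns int, so only the low 32 bits of the returned pointer survive, and they are sign-extended when converted back to a pointer.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the original program): models a 64-bit
 * pointer value passing through a 32-bit signed int, which is what happens
 * when the caller assumes an undeclared function returns int.
 */
uint64_t through_implicit_int(uint64_t address)
{
    /* Only the low 32 bits fit in the assumed int return value. */
    uint64_t low = address & UINT64_C(0xFFFFFFFF);

    /* Converting that int back to a 64-bit pointer sign-extends bit 31. */
    if (low & UINT64_C(0x80000000)) {
        low |= UINT64_C(0xFFFFFFFF00000000);
    }
    return low;
}

/* Hypothetical helper: prints the address before and after the round trip. */
void show_round_trip(uint64_t address)
{
    printf("returned by callee: 0x%016" PRIx64 "\n", address);
    printf("seen by caller:     0x%016" PRIx64 "\n",
           through_implicit_int(address));
}
```

Passing the address from the question, 0x7ffff7f5e010, through this model gives 0xfffffffff7f5e010, the same all-ones upper half seen in the caller's gdb output, while a small address such as 0x62c030 comes back unchanged. That may be why only the large allocations broke. The usual fix is to include the header that declares readBlock in the calling file. With GCC, compiling with -Wall reports the implicit declaration, and -Werror=implicit-function-declaration turns it into an error.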

Regarding "Address returned by calloc causes my program to segfault", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27261032/