I am trying to parse the header of an SQLite database file using the following code (the actual snippet):
struct Header_info {
    char *filename;
    char *sql_string;
    uint16_t page_size;
};
int read_header(FILE *db, struct Header_info *header)
{
    assert(db);
    uint8_t sql_buf[100] = {0};
    /* load the header */
    if(fread(sql_buf, 100, 1, db) != 1) {
        return ERR_SIZE;
    }
    /* copy the string */
    header->sql_string = strdup((char *)sql_buf);
    /* verify that we have a proper header */
    if(strcmp(header->sql_string, "SQLite format 3") != 0) {
        return ERR_NOT_HEADER;
    }
    memcpy(&header->page_size, (sql_buf + 16), 2);
    return 0;
}
Here are the relevant bytes of the file I am testing with:
0000000: 5351 4c69 7465 2066 6f72 6d61 7420 3300 SQLite format 3.
0000010: 1000 0101 0040 2020 0000 c698 0000 1a8e .....@ ........
I followed this spec, and the code looks correct to me.
Later I print header->page_size with this line:
printf("\tPage size: %"PRIu16"\n", header->page_size);
But that line prints 16 instead of the expected 4096. Why? I am almost certain it is something basic that I have overlooked.
Best answer
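This is an endianness problem. x86 is little-endian: in memory, the least significant byte is stored first. When you load the bytes 10 00 into a uint16_t on a little-endian architecture, the first byte (0x10) becomes the low byte, so in human-readable form you get 00 10, which is 0x0010 = 16 instead of 0x1000 = 4096.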
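So your problem is that memcpy is not the right tool for reading this value: it copies the bytes in file order, and the host then interprets them in its own byte order.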
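One portable alternative is to assemble the value from its individual bytes with shifts, which gives the same result regardless of the host's byte order. The following is a minimal sketch; the helper name read_be16 is hypothetical and does not appear in the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the original post): build a 16-bit
 * value from two bytes stored in big-endian order. Because the bytes
 * are combined arithmetically, the result does not depend on whether
 * the host is little-endian or big-endian. */
static uint16_t read_be16(const uint8_t *p)
{
    assert(p);
    /* p[0] is the most significant byte, p[1] the least significant. */
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}
```

Under this assumption, the memcpy line in read_header could become header->page_size = read_be16(sql_buf + 16); and for the bytes 10 00 from the hex dump it would yield 4096.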
See the following section of the SQLite file format spec:
1.2.2 Page Size
The two-byte value beginning at offset 16 determines the page size of the database. For SQLite versions 3.7.0.1 and earlier, this value is interpreted as a big-endian integer and must be a power of two between 512 and 32768, inclusive. Beginning with SQLite version 3.7.1, a page size of 65536 bytes is supported. The value 65536 will not fit in a two-byte integer, so to specify a 65536-byte page size, the value at offset 16 is 0x00 0x01. This value can be interpreted as a big-endian 1 and thought of as a magic number to represent the 65536 page size. Or one can view the two-byte field as a little endian number and say that it represents the page size divided by 256. These two interpretations of the page-size field are equivalent.
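The rules quoted above, including the special value 1 for a 65536-byte page, can be sketched as follows. The function name decode_page_size and the convention of returning 0 for an invalid field are assumptions made for illustration only. Note that a uint16_t member such as page_size in the question's struct cannot hold 65536, so this sketch returns a uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): decode the page-size field
 * from a buffer holding the 100-byte SQLite header. Returns the page
 * size in bytes, or 0 if the field is not valid according to the
 * quoted section of the spec. */
static uint32_t decode_page_size(const uint8_t *header)
{
    assert(header);

    /* Big-endian two-byte value at offset 16. */
    uint32_t raw = ((uint32_t)header[16] << 8) | (uint32_t)header[17];

    /* The bytes 0x00 0x01 are the magic value for a 65536-byte page. */
    if (raw == 1) {
        return 65536u;
    }

    /* Otherwise it must be a power of two between 512 and 32768. */
    if (raw < 512 || raw > 32768 || (raw & (raw - 1)) != 0) {
        return 0;
    }
    return raw;
}
```

For the file in the question, the bytes 10 00 at offset 16 decode to 4096 with this sketch.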
Source: the Stack Overflow question "c - Reading SQLite header": https://stackoverflow.com/questions/18678931/