c - Data layout used by C compilers (the alignment concept)

The following is an excerpt from the red dragon book.

Example 7.3. Figure 7.9 is a simplification of the data layout used by C compilers for two machines that we call Machine 1 and Machine 2.

Machine 1 : The memory of Machine 1 is organized into bytes consisting of 8 bits each. Even though every byte has an address, the instruction set favors short integers being positioned at bytes whose addresses are even, and integers being positioned at addresses that are divisible by 4. The compiler places short integers at even addresses, even if it has to skip a byte as padding in the process. Thus, four bytes, consisting of 32 bits, may be allocated for a character followed by a short integer.

Machine 2: each word consists of 64 bits, and 24 bits are allowed for the address of a word. There are 64 possibilities for the individual bits inside a word, so 6 additional bits are needed to distinguish between them. By design, a pointer to a character on Machine 2 takes 30 bits — 24 to find the word and 6 for the position of the character inside the word. The strong word orientation of the instruction set of Machine 2 has led the compiler to allocate a complete word at a time, even when fewer bits would suffice to represent all possible values of that type; e.g., only 8 bits are needed to represent a character. Hence, under alignment, Fig. 7.9 shows 64 bits for each type. Within each word, the bits for each basic type are in specified positions. Two words consisting of 128 bits would be allocated for a character followed by a short integer, with the character using only 8 of the bits in the first word and the short integer using only 24 of the bits in the second word. □

I found the concept of alignment explained here, here, and here. What I understood from them is as follows:

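In word-addressable CPUs (where the word size is larger than one byte), some padding is introduced into data objects so that the CPU can retrieve data from memory efficiently, in the minimum number of memory cycles.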

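Now, Machine 1 here is actually a byte-addressable one, and the conditions in the Machine 1 specification are probably more difficult than those of a simple word-addressable machine with a word size of, say, 4 bytes. In such a 64-bit machine we only need to make sure that our data items are word aligned, with no further difficulty. But how do we find the alignment in systems like Machine 1 (as given in the table above), where the simple concept of word alignment does not work because the machine is byte addressable and has a more difficult specification?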

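Moreover, I find it quite weird that in the row for double the size of the type is more than what is given in the alignment field. Shouldn't alignment (in bits) ≥ size (in bits)? Because alignment refers to the memory actually allocated for the data object (?).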

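"each word consists of 64 bits, and 24 bits are allowed for the address of a word. There are 64 possibilities for the individual bits inside a word, so 6 additional bits are needed to distinguish between them. By design, a pointer to a character on Machine 2 takes 30 bits: 24 to find the word and 6 for the position of the character inside the word."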

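Moreover, how should this statement about the concept of the pointers, based on alignment, be visualized (2^6 = 64 is fine, but how do these 6 bits correlate with the alignment concept)?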

Best answer

First of all, Machine 1 is not special at all. It is exactly like x86-32 or 32-bit ARM.

Moreover I find it quite weird that in the row for double the size of the type is more than what is given in the alignment field. Shouldn't alignment(in bits) ≥ size (in bits) ? Because alignment refers to the memory actually allocated for the data object (?).

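No, this is not true. Alignment means that the address of the lowest addressable byte of the object must be divisible by the given number of bytes. It says nothing about how much memory the object occupies.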
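To make the "address divisible by the alignment" rule concrete, here is a minimal C11 sketch for an ordinary byte-addressed machine in the spirit of Machine 1. The struct `char_then_short` and the helpers `is_aligned` and `print_layout` are hypothetical names chosen for illustration, and the numbers printed are implementation-defined, so treat this as a way to inspect your own compiler rather than as the layout from Fig. 7.9.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <stddef.h>    /* offsetof, size_t */
#include <stdint.h>    /* uintptr_t */
#include <stdio.h>

/* Hypothetical struct: a character followed by a short integer. */
struct char_then_short {
    char  c;
    short s;   /* the compiler may insert padding before this member */
};

/* The member offset must be a multiple of the member's alignment. */
static_assert(offsetof(struct char_then_short, s) % alignof(short) == 0,
              "s must start at a suitably aligned offset");

/* Returns 1 if the address p is a multiple of `alignment` bytes, else 0.
   Assumes a flat address space where uintptr_t reflects the byte address. */
int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Prints the layout chosen by the current compiler (implementation-defined). */
void print_layout(void)
{
    struct char_then_short x = { 'a', 1 };

    printf("alignof(short)   = %zu\n", alignof(short));
    printf("offsetof(s)      = %zu\n", offsetof(struct char_then_short, s));
    printf("sizeof(struct)   = %zu\n", sizeof x);
    printf("&x.s is aligned  = %d\n", is_aligned(&x.s, alignof(short)));
}
```

Calling `print_layout()` from `main` shows whether your compiler skipped a byte after `c`, which is the padding the book describes for Machine 1.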

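In addition, C requires that in an array sizeof (ElementType) be greater than or equal to the alignment of each member and that sizeof (ElementType) be divisible by the alignment, hence footnote a. Therefore, on the latter computer: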

 struct { char a, b; }

might have a sizeof of 16, because the characters are in distinct addressable words, whereas

struct { char a[2];  }

could be squeezed into 8 bytes.
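The size-versus-alignment rule can be checked on any C11 compiler. The sketch below is an illustration under assumptions: the struct tags (`two_chars`, `char_array`, `int_then_char`) and the helpers `element_stride` and `print_sizes` are made up for this example, and on a typical byte-addressed machine both of the structs above will simply have size 2, not the 16 and 8 that the hypothetical Machine 2 might give.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <stddef.h>    /* size_t */
#include <stdio.h>

struct two_chars     { char a, b; };      /* first struct from the answer, with a tag */
struct char_array    { char a[2]; };      /* second struct from the answer, with a tag */
struct int_then_char { int i; char c; };  /* hypothetical: usually gets tail padding */

/* sizeof is always a multiple of the alignment, so that every element of an
   array of the type starts at a suitably aligned address. */
static_assert(sizeof(struct int_then_char) % alignof(struct int_then_char) == 0,
              "size must be a multiple of alignment");
static_assert(sizeof(struct two_chars) % alignof(struct two_chars) == 0,
              "size must be a multiple of alignment");

/* Distance in bytes between two consecutive array elements.
   Arrays have no padding between elements, so this equals sizeof(element). */
size_t element_stride(void)
{
    struct int_then_char arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}

/* Prints sizes and alignments chosen by the current compiler. */
void print_sizes(void)
{
    printf("two_chars:     size %zu, align %zu\n",
           sizeof(struct two_chars), alignof(struct two_chars));
    printf("char_array:    size %zu, align %zu\n",
           sizeof(struct char_array), alignof(struct char_array));
    printf("int_then_char: size %zu, align %zu, stride %zu\n",
           sizeof(struct int_then_char), alignof(struct int_then_char),
           element_stride());
}
```

Note that the alignment can be smaller than the size (as in the double row the question asks about), but the size can never fail to be a multiple of the alignment.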

how should this statement about the concept of the pointers, based on alignment is to be visualized (2^6 = 64, it is fine but how is this 6 bits correlating with the alignment concept)

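As for the character pointers, the 6 bits are bogus. Only 3 bits are needed to choose one of the 8 bytes within an 8-byte word, so this is an error in the book. An ordinary pointer would select just the word, using 24 bits, while a character (byte) pointer would select the word with 24 bits and one of the 8-bit bytes inside that word with 3 more bits.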
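The 24 + 3 bit arithmetic can be sketched as a small C encoding. This is only an illustration of the counting argument: the bit layout (word address in the high bits, byte index in the low 3 bits) and the names `make_char_ptr`, `char_ptr_word`, and `char_ptr_byte` are assumptions made here, not something the book or any real Machine 2 compiler specifies.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_ADDR_BITS  24u  /* bits that select a 64-bit word (from the book) */
#define BYTES_PER_WORD  8u   /* 64 bits per word / 8 bits per byte */
#define BYTE_INDEX_BITS 3u   /* log2(8): bits that select a byte within a word */

/* Packs a word address and a byte index into one 27-bit value.
   Hypothetical layout: word address in the high bits, byte index in the low 3. */
uint32_t make_char_ptr(uint32_t word_addr, uint32_t byte_index)
{
    assert(word_addr < (UINT32_C(1) << WORD_ADDR_BITS));
    assert(byte_index < BYTES_PER_WORD);
    return (word_addr << BYTE_INDEX_BITS) | byte_index;
}

/* Extracts the 24-bit word address from a packed character pointer. */
uint32_t char_ptr_word(uint32_t p)
{
    return p >> BYTE_INDEX_BITS;
}

/* Extracts the 3-bit byte index from a packed character pointer. */
uint32_t char_ptr_byte(uint32_t p)
{
    return p & (BYTES_PER_WORD - 1u);
}
```

Under this layout every character pointer fits in 24 + 3 = 27 bits. Six extra bits would only be needed to address individual bits inside the word, which is finer than a character pointer requires.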

Regarding c - Data layout used by C compilers (the alignment concept), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66884143/
