c - Why does (1)["abcd"]+"efg"-'b'+1 become "fg"?

#include <stdio.h>
int main()
{
    printf("%s", (1)["abcd"]+"efg"-'b'+1);
}

Can anyone explain why the output of this code is:

fg

I know (1)["abcd"] points to "bcd", but why is +"efg"-'b'+1 even valid syntax?

Best answer

I know (1)["abcd"] points to "bcd"

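No. (1)["abcd"] is a single character ('b'), because a[i] is defined as *(a + i) and the addition commutes, so (1)["abcd"] is the same as "abcd"[1].

So (1)["abcd"]+"efg"-'b'+1 is 'b' + "efg" - 'b' + 1. If you simplify it, it becomes "efg" + 1, and therefore it prints fg.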

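Note: the answer above only explains the observed behavior. Strictly speaking, the program is not valid according to the C language specification. Here is why.

Case 1: 'b' < 0 or 'b' > 4

In this case, the expression (1)["abcd"] + "efg" - 'b' + 1 results in undefined behaviour, because the subexpression (1)["abcd"] + "efg", that is, 'b' + "efg", produces an invalid pointer expression (C11, 6.5.6 Additive operators, quoted below). Additive operators group left to right, so this subexpression is evaluated before 'b' is subtracted again.

In the widely used ASCII character set, 'b' is 98 decimal; in the less widely used EBCDIC character set, 'b' is 130 decimal. So the subexpression (1)["abcd"] + "efg" causes undefined behavior on any system that uses either of these.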

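So unless there is some strange architecture where 'b' <= 4 and 'b' >= 0, this program causes undefined behavior, because of how the C language is defined:

C11, 5.1.2.3 Program execution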

The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant. [...] In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced.

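This clearly states that the whole standard is defined in terms of the behavior of an abstract machine. In the abstract machine, 'b' + "efg" is evaluated as written, even if a real compiler would fold the offsets together.

So in this case, it does cause undefined behavior.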

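Case 2: 'b' >= 0 and 'b' <= 4 (this is quite contrived, but theoretically possible).

In this case, the subexpression (1)["abcd"] + "efg" can be valid (and, in turn, so can the whole expression (1)["abcd"] + "efg" - 'b' + 1).

The string literal "efg" consists of 4 characters, counting the terminating null. It has an array type (char[4] in C), and the C standard guarantees (as quoted below) that a pointer expression evaluating to one past the end of an array does not overflow or cause undefined behavior.

The following subexpressions are possible, and all of them are valid: (1) "efg"+0 (2) "efg"+1 (3) "efg"+2 (4) "efg"+3 and (5) "efg"+4, because the C standard states:

C11, 6.5.6 Additive operators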

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

So in this case, it does not cause undefined behavior.

Thanks to @zch and @Keith Thompson for digging out the relevant parts of the C standard :)

For the original discussion of this question, see Stack Overflow: https://stackoverflow.com/questions/19472155/
