c++ - Why does unsigned char have different default-initialization behavior from other data types?

Tags: c++ char initialization undefined-behavior

I was reading the cppreference page on default initialization and noticed a section stating the following:

//UB
int x;
int y = x;        
   
//Defined and ok
unsigned char c;
unsigned char d = c;
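The same rule as for unsigned char also applies to std::byte.
My question is: why does using the value before assigning to it (as in the example above) cause UB for every other non-class type (int, bool, char, and so on) but not for unsigned char? What makes unsigned char special?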
The page I am reading for reference

Best answer

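The difference is not in the initialization behavior. The value of an uninitialized int is indeterminate, and default initialization leaves it indeterminate. The value of an uninitialized unsigned char is indeterminate, and default initialization leaves it indeterminate. There is no difference there.
The difference is that producing an indeterminate value of type int, or of any type other than the exceptional unsigned char or std::byte, is undefined behavior (unless the value is discarded). The exception for unsigned char (and later std::byte) was added to the language in C++14, when indeterminate values were properly defined (although, because the change was a defect resolution, as I understand it, it also applies to the official standard of the time, C++11).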
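The distinction can be sketched in code. This is a minimal illustration under the rules quoted in this answer, not text from the standard; the function names are hypothetical. Value-initialization zeroes both types, default-initialization leaves both indeterminate, and only the unsigned char may be copied while in that state.

```cpp
#include <cassert>

// Value-initialization (empty braces) zeroes a scalar of either type.
int value_initialized_int() {
    int x{};
    return x;
}

unsigned char value_initialized_uchar() {
    unsigned char c{};
    return c;
}

// Default-initialization leaves the object indeterminate for both types.
// Only an unsigned char (or std::byte) may be copied in that state.
unsigned char copy_then_overwrite() {
    unsigned char c;      // indeterminate value
    unsigned char d = c;  // defined: d now holds an indeterminate value too
    // int x; int y = x;  // the same pattern with int would be undefined behavior
    d = 42;               // give d a determinate value before relying on it
    assert(d == 42);
    return d;
}
```

Compilers may still warn about the uninitialized read of c. The copy is defined, but the copied value carries no guarantee, so the sketch overwrites d before returning it.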
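I could not find a documented rationale for this design choice. Here is a timeline of the definitions (all standard quotes are from drafts):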

C89 - 1.6 DEFINITIONS OF TERMS

Undefined behavior --- behavior, upon use of ... indeterminately-valued objects


C89 - 3.5.7 Initialization - Semantics

... If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.


No type is exempt. Why the C standard is relevant becomes clear when reading the C++98 standard.

C++98 - [dcl.init]

... Otherwise, if no initializer is specified for an object, the object and its subobjects, if any, have an indeterminate initial value


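There is no definition of what an indeterminate value means or of what happens when one is used. The intended meaning was probably the same as in C89, but it is not spelled out.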

C99 - 3. Terms, definitions, and symbols - 3.17.2

3.17.2 indeterminate value

either an unspecified value or a trap representation

3.17.3 unspecified value

valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance

NOTE An unspecified value cannot be a trap representation.


C99 - 6.2.6 Representations of types - 6.2.6.1 General

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.


C99 - J.2 Undefined behavior

The behavior is undefined in the following circumstances:

  • ...
  • The value of an object with automatic storage duration is used while it is indeterminate
  • A trap representation is read by an lvalue expression that does not have character type
  • A trap representation is produced by a side effect that modifies any part of the object using an lvalue expression that does not have character type
  • ...

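C99 introduced the term trap representation, whose use is also UB, just as with indeterminate values. Character types (char, unsigned char, and signed char) have no trap representations and can be used to operate on trap representations of other types without UB.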
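The role of character types as a window onto other objects' representations carries over to C++, where any object's bytes may be examined through an unsigned char pointer. The following is a minimal sketch of that idea; sum_of_bytes and demo_sum are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sum the bytes of an object's representation by viewing it through
// a pointer to unsigned char, which is allowed to alias any object.
template <typename T>
unsigned sum_of_bytes(const T& object) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&object);
    unsigned total = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        total += bytes[i];
    }
    return total;
}

unsigned demo_sum() {
    std::uint32_t value = 0x01020304u;
    assert(sizeof(value) == 4);
    // The bytes are 01 02 03 04 in an order that depends on endianness,
    // so their sum is the same on every platform.
    return sum_of_bytes(value);
}
```

Reading the same storage through a pointer to an unrelated non-character type, such as float*, would not be covered by this allowance.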

C++ core language issue - 616. Definition of “indeterminate value”

The C++ Standard uses the phrase “indeterminate value” without defining it. C99 defines it as “either an unspecified value or a trap representation.” Should C++ follow suit?

Proposed resolution (October, 2012):

[dcl.init] paragraph 12 as follows:

If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17 [expr.ass]). [Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2 [basic.start.init]. —end note] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of:
      • the second or third operand of a conditional expression (5.16 [expr.cond]),
      • the right operand of a comma (5.18 [expr.comma]),
      • the operand of a cast or conversion to an unsigned narrow character type (4.7 [conv.integral], 5.2.3 [expr.type.conv], 5.2.9 [expr.static.cast], 5.4 [expr.cast]), or
      • a discarded-value expression (Clause 5 [expr]),

then the result of the operation is an indeterminate value.

If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the right operand of a simple assignment operator (5.17 [expr.ass]) whose first operand is an lvalue of unsigned narrow character type, an indeterminate value replaces the value of the object referred to by the left operand.

If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the initialization expression when initializing an object of unsigned narrow character type, that object is initialized to an indeterminate value.
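The cases enumerated in the proposed resolution can be exercised one by one. This is a sketch of my reading of the quoted wording, not an authoritative interpretation; exercise_exceptions is a hypothetical name. Each local is overwritten before its value is used, so the result is deterministic.

```cpp
#include <cassert>

unsigned char exercise_exceptions(bool flag) {
    unsigned char c;      // indeterminate
    unsigned char other;  // indeterminate

    unsigned char from_init = c;                              // initialization of an unsigned char
    unsigned char from_cond = flag ? c : other;               // second or third operand of a conditional
    unsigned char from_comma = (static_cast<void>(0), c);     // right operand of a comma
    unsigned char from_cast = static_cast<unsigned char>(c);  // cast to an unsigned narrow character type
    static_cast<void>(c);                                     // discarded-value expression

    unsigned char assigned = 0;
    assigned = c;  // simple assignment to an unsigned char lvalue

    // Not covered by the exceptions: converting to a wider type.
    // int widened = c;  // would be undefined behavior

    // Replace every indeterminate value before reading it for real.
    from_init = 1;
    from_cond = 2;
    from_comma = 3;
    from_cast = 4;
    assigned = 5;
    assert(assigned == 5);
    return static_cast<unsigned char>(from_init + from_cond + from_comma + from_cast + assigned);
}
```

Note that arithmetic such as c + 0 promotes the operand to int, which falls outside the listed cases, so the exception covers moving the value around rather than computing with it.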


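The proposed change was accepted as a defect resolution with some further changes (issue 1213), but it has remained essentially the same (similar enough for the purposes of this question). This appears to be where the exception for unsigned char was introduced into C++. As far as I can tell, the core language issue has no public comments or notes on the rationale for the exception.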

Regarding "c++ - Why does unsigned char have different default-initialization behavior from other data types?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68219526/
