pointers - Why must the pointers used by `offset_from` be derived from a pointer to the same object?

Tags: pointers, rust

From the standard library documentation:

Both pointers must be derived from a pointer to the same object. (See below for an example.)

let ptr1 = Box::into_raw(Box::new(0u8));
let ptr2 = Box::into_raw(Box::new(1u8));
let diff = (ptr2 as isize).wrapping_sub(ptr1 as isize);
// Make ptr2_other an "alias" of ptr2, but derived from ptr1.
let ptr2_other = (ptr1 as *mut u8).wrapping_offset(diff);
assert_eq!(ptr2 as usize, ptr2_other as usize);
// Since ptr2_other and ptr2 are derived from pointers to different 
// objects, computing their offset is undefined behavior, even though
// they point to the same address!
unsafe {
    let zero = ptr2_other.offset_from(ptr2); // Undefined Behavior
}

I don't understand why this has to be the case.

Best answer

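This has to do with a concept called "provenance", meaning "place of origin". The Rust Unsafe Code Guidelines have a section on Pointer Provenance. It is a fairly abstract rule, but it explains that a pointer carries extra information during compilation, beyond its address, which determines which pointer operations are well-defined.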

// Let's assume the two allocations here have base addresses 0x100 and 0x200.
// We write pointer provenance as `@N` where `N` is some kind of ID uniquely
// identifying the allocation.
let raw1 = Box::into_raw(Box::new(13u8));
let raw2 = Box::into_raw(Box::new(42u8));
let raw2_wrong = raw1.wrapping_add(raw2.wrapping_sub(raw1 as usize) as usize);
// These pointers now have the following values:
// raw1 points to address 0x100 and has provenance @1.
// raw2 points to address 0x200 and has provenance @2.
// raw2_wrong points to address 0x200 and has provenance @1.
// In other words, raw2 and raw2_wrong have same *address*...
assert_eq!(raw2 as usize, raw2_wrong as usize);
// ...but it would be UB to dereference raw2_wrong, as it has the wrong *provenance*:
// it points to address 0x200, which is in allocation @2, but the pointer
// has provenance @1.
The guidelines link to a good article, Pointers Are Complicated. Its follow-up, Pointers Are Complicated II, goes into more detail and coined this phrase:

Just because two pointers point to the same address, does not mean they are equal and can be used interchangeably.


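Essentially, it is invalid to read a value through a pointer that points outside of that pointer's original "allocation", even if you can guarantee that a valid object exists at that address. Allowing this would wreak havoc on the language's aliasing rules and on the optimizations they make possible, and there is almost never a good reason to do it.
This concept is largely inherited from C and C++.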
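
For contrast, here is a minimal sketch of a use of `offset_from` that satisfies the documented requirement: both pointers are derived from the same base pointer and stay within one allocation. The helper name `distance_within` is hypothetical and only for illustration.

```rust
/// Hypothetical helper: element distance from index `from` to index `to`
/// within a single slice, computed with `offset_from`.
fn distance_within(data: &[u8], from: usize, to: usize) -> isize {
    // Indices may be at most one past the end, which pointer arithmetic allows.
    assert!(from <= data.len() && to <= data.len());
    let base: *const u8 = data.as_ptr();
    // SAFETY: `a` and `b` are both derived from `base`, so they share its
    // provenance, and both stay within (or one past the end of) `data`.
    unsafe {
        let a = base.add(from);
        let b = base.add(to);
        b.offset_from(a)
    }
}

fn main() {
    let data = [10u8, 20, 30, 40];
    assert_eq!(distance_within(&data, 0, 2), 2);
    assert_eq!(distance_within(&data, 3, 1), -2);
}
```

Because both pointers come from `data.as_ptr()`, there is only one provenance involved, and the distance is well-defined.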

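If you want to know whether you have written code that violates this rule, run it under Miri, a tool that detects undefined behavior. It can often find the problem: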
fn main() {
    let ptr1 = Box::into_raw(Box::new(0u8));
    let ptr2 = Box::into_raw(Box::new(1u8));
    let diff = (ptr2 as isize).wrapping_sub(ptr1 as isize);
    let ptr2_other = (ptr1 as *mut u8).wrapping_offset(diff);
    assert_eq!(ptr2 as usize, ptr2_other as usize);
    unsafe { println!("{} {} {}", *ptr1, *ptr2, *ptr2_other) };
}
error: Undefined Behavior: memory access failed: pointer must be in-bounds at offset 1200, but is outside bounds of alloc1444 which has size 1
 --> src/main.rs:7:49
  |
7 |     unsafe { println!("{} {} {}", *ptr1, *ptr2, *ptr2_other) };
  |                                                 ^^^^^^^^^^^ memory access failed: pointer must be in-bounds at offset 1200, but is outside bounds of alloc1444 which has size 1
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
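
If all you need is the numeric distance between two addresses in unrelated allocations, one option is to do the arithmetic on integers, as the `diff` line in the question's example already does. The sketch below assumes a hypothetical helper `address_diff`; the result is only a number and does not make it valid to reach one allocation through a pointer derived from the other.

```rust
/// Hypothetical helper: byte distance between two addresses, computed on
/// integers. No provenance is involved, so this is safe code.
fn address_diff(a: *const u8, b: *const u8) -> isize {
    (b as isize).wrapping_sub(a as isize)
}

fn main() {
    let ptr1 = Box::into_raw(Box::new(0u8));
    let ptr2 = Box::into_raw(Box::new(1u8));

    let diff = address_diff(ptr1, ptr2);
    // The integer arithmetic round-trips, but this says nothing about
    // whether a pointer built from `ptr1` may access `ptr2`'s allocation.
    assert_eq!((ptr1 as isize).wrapping_add(diff), ptr2 as isize);

    // SAFETY: both pointers came from `Box::into_raw` and are freed once.
    unsafe {
        drop(Box::from_raw(ptr1));
        drop(Box::from_raw(ptr2));
    }
}
```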

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/68400492/
