C++: reference behaves inconsistently

I am using the yaml-cpp library to parse YAML. Abridged example:

YAML::Node def = YAML::LoadFile(defFile);
for (auto itemPair = def.begin(); itemPair != def.end(); ++itemPair) {
    // Grab a reference so `itemPair->second` doesn't need to be copied all over the place
    auto& item = itemPair->second;

    // A few instances of the below in series
    if (item["key"].IsDefined()) { doSomething(item["key"].as<std::string>()); }

    // Problem happens here
    if (item["issue"].IsDefined()) {
        if (!item["issue"].IsMap()) { continue; }
        for (auto x = item["issue"].begin(); x != item["issue"].end(); ++x) {
            LOG(INFO) << "Type before: " << item.Type() << " : " << itemPair->second.Type();
            auto test = x->first.as<std::string>();
            LOG(INFO) << "Type after: " << item.Type() << " : " << itemPair->second.Type();
            // Using item as a map fails because it no longer is one!
            // Next loop attempt also crashes when it attempts to use [] on item.
        }
    }
}

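The problem occurs in the nested loop: the reference taken at the top of the snippet suddenly changes, yet the variable it refers to appears unaffected: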

I1218 12:44:04.697798 296012 main.cpp:123] Type before: 4 : 4
I1218 12:44:04.697813 296012 main.cpp:125] Type after: 2 : 4

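My understanding of references is that they act as an alias for another variable. I know the yaml library may be doing magic behind the scenes that changes the underlying data, but I cannot understand why the reference appears to be updated while the original value stays the same.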

Edit: Some seriously baffling behavior is going on here. After any call to itemPair->second.Type(), the reference "resets" back to the correct value. So if I add another log call:

LOG(INFO) << "Type after: " << item.Type() << " : " << itemPair->second.Type();
LOG(INFO) << "Type afterer: " << item.Type() << " : " << itemPair->second.Type();

Result:

I1218 12:58:59.965732 297648 main.cpp:123] Type before: 4 : 4
I1218 12:58:59.965752 297648 main.cpp:125] Type after: 2 : 4
I1218 12:58:59.965766 297648 main.cpp:126] Type afterer: 4 : 4

Reproducible example:

test.yaml:

---
one:
    key: x
    issue:
        first: 1
two:
    key: y
    issue:
        first: 1
        second: 2

main.cpp is the same as above, but with test.yaml hard-coded, LOG replaced by std::cout, and a mock function:

#include <iostream>
#include <string>
#include <yaml-cpp/yaml.h>

void doSomething(std::string x) { std::cout << "Got key: " << x << std::endl; }

int main() {
    YAML::Node def = YAML::LoadFile("test.yaml");
    for (auto itemPair = def.begin(); itemPair != def.end(); ++itemPair) {
        // Grab a reference so `itemPair->second` doesn't need to be copied all over the place
        auto& item = itemPair->second;

        // A few instances of the below in series
        if (item["key"].IsDefined()) { doSomething(item["key"].as<std::string>()); }

        // Problem happens here
        if (item["issue"].IsDefined()) {
            if (!item["issue"].IsMap()) { continue; }
            for (auto x = item["issue"].begin(); x != item["issue"].end(); ++x) {
                std::cout << "Type before: " << item.Type() << " : " << itemPair->second.Type() << std::endl;
                auto test = x->first.as<std::string>();
                std::cout << "Type after: " << item.Type() << " : " << itemPair->second.Type() << std::endl;
                std::cout << "Type afterer: " << item.Type() << " : " << itemPair->second.Type() << std::endl;
                // Using item as a map fails because it no longer is one!
                // Next loop attempt also crashes when it attempts to use [] on item.
            }
        }
    }
}

Result:

$ ./build/out
Got key: x
Type before: 4 : 4
Type after: 2 : 4
Type afterer: 4 : 4
Got key: y
Type before: 4 : 4
Type after: 2 : 4
Type afterer: 4 : 4
Type before: 4 : 4
Type after: 2 : 4
Type afterer: 4 : 4

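Best answer

Node is designed to hold a reference itself. The iterator behaves like a pointer to std::pair<Node, Node> and returns a temporary Node. If you bind a reference to that Node, you end up bound to a Node that has already been destroyed. So you need a copy here. Changing auto& to auto fixes the problem.

It is designed this way because it does not want you to touch the underlying memory. Otherwise, a reallocation could leave you with a dangling reference.

Example of a dangling reference: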

std::vector<int> v{1};
auto &ref1 = v[0];

v.reserve(100); // reallocates the storage, so ref1 is now a dangling reference

I also wrote up why it is designed this way. See here: https://github.com/jbeder/yaml-cpp/issues/977#issuecomment-771041297 I will copy it here.

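Why the reference is UB here.

When you use ->, the iterator iter creates a temporary dereference result on the stack, returns a pointer to it, and destroys that object as soon as it goes out of scope.

This is done so that iter->second behaves like (*iter).second.

If the dereference result were placed on the heap, it would be hard to decide when to destroy the object.

The intended behavior is the same as (*iter).second. But (*iter).second is an rvalue, so the compiler rejects auto&. That is not the case for iter->second, because the compiler treats iter->second as an lvalue.

The C++ standard makes p->m, the built-in member access through a pointer expression, an lvalue. So there is no way to forbid binding a reference to it.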

In summary (V here stands for the type of iter->second):

V list = iter->second;   // correct
V &list = iter->second;  // wrong
V &&list = iter->second; // COMPILE TIME ERROR
V &&list = std::move(iter->second); // still wrong

auto list = iter -> second;   // correct, list is V
auto &list = iter -> second;  // wrong,   list is V&
auto &&list = iter -> second; // wrong,   list is V&

V list = (*iter).second;   // correct
V &list = (*iter).second;  // COMPILE TIME ERROR
V &&list = (*iter).second; // correct

auto list = (*iter).second;   // correct, list is V
auto &list = (*iter).second;  // COMPILE TIME ERROR
auto &&list = (*iter).second; // correct, list is V&&

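Here are some changes the author could make:

1. Make the detail::iterator_value object long-lived, or simply leak the memory.
2. Remove operator->().
3. Document it: tell everyone to use auto.

Approach 1 could cause a lot of trouble. I think approaches 2 and 3 are good solutions.

Why a copy behaves like a reference here.

In the current design, every change goes through a Node. Node is the public interface to the underlying memory. It is designed to be polymorphic: the real type of the underlying data is decided at run time and is unknown at compile time. So auto& list = iter->second cannot bind to the correct underlying type.

This could be done with some effort. It would look something like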

auto& list = iter->second.data_as_ref<std::string>();

But that is still not convenient enough.

In the current design, you can get a copy with

auto list = iter->second.as<std::string>();

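You cannot bind to it. It only lets you copy, not write.

That is because Node makes sure you use its interface to assign values. This matters a lot, because assigning data means doing three or more things:

1. If the new data is one of the following types, it encodes it: std::pair, std::array, std::list, std::vector, std::map, bool, Binary.
2. It assigns the data.
3. It assigns the type, one member of the enum class NodeType.
4. It assigns the state, a bool value isDefined.

On reads, encoded data also has to be decoded. So it should not give you direct read/write access.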

Your reference could also dangle, because the memory can be reallocated.

In the current design, a copy that behaves like a reference is a must.

Conclusion

Take a copy, for example auto key = iter->first;, or use (*iter).first.

This question about inconsistent C++ references comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/65357508/
