c++ - How to prevent stack overflow when serializing a recursive graph structure with boost::serialization?

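I am trying to serialize a large (geometric) graph structure using the boost serialization library.

I store my graph as an adjacency list, i.e. my structure looks like this: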

class Node {
  double x,y;
  std::vector<Node*> adjacent_nodes;
  ...
};

class Graph {
  std::vector<Node*> nodes;
  ...
};

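The graph now has more than 10k nodes, and my problem is that when I start serializing (saving) it, the library recursively calls the serialization of all of these nodes before returning, because the graph is connected.

More precisely, when serializing the graph, it first serializes the first node in the "nodes" vector. While doing so, it needs to serialize that node's "adjacent_nodes", which contains, for example, the second node.

So it has to serialize the second node before returning from the serialization of the first node, and so on.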
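To make the depth of that call chain concrete, here is a minimal sketch that does not use Boost at all. It only imitates the behavior described above: a node is "saved" the first time it is reached, and every neighbor not yet seen is saved before the call returns. The `Node` struct is a stripped-down stand-in for the class in the question, and `visit` and `chain_recursion_depth` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Simplified stand-in for the Node class in the question (coordinates omitted).
struct Node {
    std::vector<Node*> adjacent_nodes;
};

// Imitates a pointer-following save: mark the node as seen, then descend into
// every neighbor that has not been seen yet before returning.
// Returns the deepest call nesting that was reached.
std::size_t visit(const Node* n, std::unordered_set<const Node*>& seen,
                  std::size_t depth) {
    seen.insert(n);
    std::size_t deepest = depth;
    for (const Node* next : n->adjacent_nodes) {
        if (seen.count(next) == 0) {
            deepest = std::max(deepest, visit(next, seen, depth + 1));
        }
    }
    return deepest;
}

// Builds the chain 0 - 1 - ... - (n-1) with edges in both directions and
// reports how deep the recursion goes when the save starts at node 0.
std::size_t chain_recursion_depth(std::size_t n) {
    assert(n > 0);
    std::vector<Node> nodes(n);
    for (std::size_t i = 0; i + 1 < n; ++i) {
        nodes[i].adjacent_nodes.push_back(&nodes[i + 1]);
        nodes[i + 1].adjacent_nodes.push_back(&nodes[i]);
    }
    std::unordered_set<const Node*> seen;
    return visit(&nodes[0], seen, 1);
}
```

For a chain of n nodes the nesting grows linearly with n, so a connected graph with tens of thousands of nodes can plausibly exhaust the call stack once each level also carries the archive's own stack frames.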

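I found this thread from 2010 in which someone describes exactly the same problem. However, no workable solution was reached there.

Any help would be greatly appreciated.

My structure in more detail: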

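#include <vector>
#include <boost/serialization/access.hpp>
#include <boost/serialization/vector.hpp>

class Node {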

    double x,y;
    std::vector<Node*> adjacent_nodes;

public:

    inline double get_x() const { return x; }
    inline double get_y() const { return y; }
    inline std::vector<Node*> const& get_adjacent_nodes() const { return adjacent_nodes; }

    Node (double x, double y):x(x),y(y) {}

    void add_adjacent(Node* other) {
        adjacent_nodes.push_back(other);
    }

private:

    Node() {}

    friend class boost::serialization::access;
    template <class Archive>
    void serialize(Archive &ar, const unsigned int) {
        ar & x;
        ar & y;
        ar & adjacent_nodes;
    }

};

class Simple_graph {

    std::vector<Node*> nodes;

    void add_edge(int firstIndex, int secondIndex) {
        nodes[firstIndex]->add_adjacent(nodes[secondIndex]);
        nodes[secondIndex]->add_adjacent(nodes[firstIndex]);
    }

public:

    /* methods to get the distance of points, to read in the nodes, and to generate edges */

    ~Simple_graph() {
        for (auto node: nodes) {
            delete node;
        }
    }

private:

    friend class boost::serialization::access;
    template <class Archive>
    void serialize(Archive &ar, const unsigned int) {
        ar & nodes;
    }

};

Edit: adding some of the suggestions made in the thread mentioned above, quoting Dominique Devienne:

1) save all the nodes without their topology info on a first pass of the vector, thus recording all the "tracked" pointers for them, then write the topology info for each, since then you don't "recurse", you only write a "ref" to an already serialized pointer.

2) have the possibility to write a "weak reference" to a pointer, which only adds the pointer to the "tracking" map with a special flag saying it wasn't "really" written yet, such that writing the topology of a node that wasn't yet written is akin to "forward references" to those neighboring nodes. Either the node will really be written later on, or it never will, and I suppose serialization should handle that gracefully.

#1 doesn't require changes in boost serialization, but puts the onus on the client code. Especially since you have to "externally" save the neighbors, so it's no longer well encapsulated, and writing a subset of the surface's nodes become more complex.

#2 would require seeking ahead to read the actual object when encountering a forward reference, and furthermore a separate map to know where to seek for it. That may be incompatible with boost serialization (I confess to be mostly ignorant about it).

Is it possible to implement these suggestions now?

Best answer

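Since you already have a vector with pointers to all of your nodes, you can serialize the adjacent_nodes vector using indexes rather than serializing the actual node data.

You'll need to convert each node pointer to an index when serializing a node. This is easiest if you can store the node index in the node itself; otherwise you'll have to search through nodes to find the right pointer (this can be sped up by creating some sort of associative container that maps each pointer to its index).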
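As a rough illustration of the pointer-to-index idea, the sketch below writes a graph to a plain `std::ostream` instead of a Boost archive, so that it stays self-contained. The text layout (node count, then one line per node with `x y degree idx idx ...`) and the helper names `build_index` and `write_graph` are assumptions made for this example, and `Node` is a simplified public-member version of the class in the question. With Boost, the same values could be passed through `ar & ...` inside a graph-level save routine.

```cpp
#include <cstddef>
#include <limits>
#include <ostream>
#include <sstream>
#include <unordered_map>
#include <vector>

// Simplified stand-in for the Node class in the question.
struct Node {
    double x = 0, y = 0;
    std::vector<Node*> adjacent_nodes;
};

// Maps every node pointer to its position in the graph's node vector,
// so a neighbor lookup is O(1) on average instead of a linear search.
std::unordered_map<const Node*, std::size_t>
build_index(const std::vector<Node*>& nodes) {
    std::unordered_map<const Node*, std::size_t> index;
    index.reserve(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        index[nodes[i]] = i;
    }
    return index;
}

// Writes the node count, then one line per node: x y degree idx idx ...
// Neighbors are written as indices, so saving one node never descends
// into another node and the call depth stays constant.
void write_graph(std::ostream& out, const std::vector<Node*>& nodes) {
    const auto index = build_index(nodes);
    // Enough digits to round-trip a double through text.
    out.precision(std::numeric_limits<double>::max_digits10);
    out << nodes.size() << '\n';
    for (const Node* n : nodes) {
        out << n->x << ' ' << n->y << ' ' << n->adjacent_nodes.size();
        for (const Node* adj : n->adjacent_nodes) {
            out << ' ' << index.at(adj);  // throws if adj is not in nodes
        }
        out << '\n';
    }
}
```

Calling `write_graph` with a `std::ostringstream` is a convenient way to inspect the output for a small graph before wiring the same logic into an archive.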

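When you need to read the data back in, you can create your initial nodes vector filled with pointers to empty/dummy nodes (which will get populated as they are deserialized).

If that is not feasible, you can load the node indexes into a temporary array, then go back and populate the pointers once all of the nodes have been read in. Either way, you won't have to seek within or reread any part of your file.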
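The loading side can be sketched in the same spirit. This example follows the first variant (all nodes are created up front, then filled in) and reads from a plain `std::istream` rather than a Boost archive. The text layout (node count, then per node `x y degree idx idx ...`), the name `read_graph`, and the use of `std::unique_ptr` for ownership are assumptions made for this example, not part of the question's code.

```cpp
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <vector>

// Simplified stand-in for the Node class in the question.
struct Node {
    double x = 0, y = 0;
    std::vector<Node*> adjacent_nodes;
};

// Reads a graph in the assumed layout: node count, then one record per node
// of the form "x y degree idx idx ...".
std::vector<std::unique_ptr<Node>> read_graph(std::istream& in) {
    std::size_t count = 0;
    if (!(in >> count)) {
        throw std::runtime_error("missing node count");
    }

    // Step 1: create every node before reading any record, so that any
    // neighbor index, including a forward reference, already has a pointer.
    std::vector<std::unique_ptr<Node>> nodes;
    nodes.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        nodes.push_back(std::unique_ptr<Node>(new Node()));
    }

    // Step 2: fill in coordinates and turn each index back into a pointer.
    // The input is consumed strictly front to back, with no seeking.
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t degree = 0;
        if (!(in >> nodes[i]->x >> nodes[i]->y >> degree)) {
            throw std::runtime_error("truncated node record");
        }
        for (std::size_t k = 0; k < degree; ++k) {
            std::size_t j = 0;
            if (!(in >> j) || j >= count) {
                throw std::runtime_error("bad neighbor index");
            }
            nodes[i]->adjacent_nodes.push_back(nodes[j].get());
        }
    }
    return nodes;
}
```

The second variant from the answer would differ only in step 2: the indices would first be collected into a temporary `std::vector<std::size_t>` per node and converted to pointers in a final pass once every node exists.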

Regarding "c++ - How to prevent stack overflow when serializing a recursive graph structure with boost::serialization?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42117697/
