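I am trying to serialize a large (geometric) graph structure with the Boost serialization library.
I store my graph as an adjacency list, i.e. my structure looks as follows: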
    class Node {
        double x, y;
        std::vector<Node*> adjacent_nodes;
        ...
    };
    class Graph {
        std::vector<Node*> nodes;
        ...
    };
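With more than 10k nodes, my problem is that when I start serializing (saving) my graph, it recursively calls the serialization of all of these nodes before returning, because the graph is connected.
More precisely, when serializing the graph, it first serializes the first node in the "nodes" vector. While doing so, it has to serialize that node's "adjacent_nodes", which contains, for example, the second node.
So it has to serialize the second node before it can return from serializing the first one, and so on.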
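I found this thread from 2010, in which someone describes exactly the same problem. However, no workable solution was reached there.
Any help would be greatly appreciated.
My structure in more detail: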
    #include <vector>
    #include <boost/serialization/access.hpp>
    #include <boost/serialization/vector.hpp>

    class Node {
        double x, y;
        std::vector<Node*> adjacent_nodes;
    public:
        inline double get_x() const { return x; }
        inline double get_y() const { return y; }
        inline std::vector<Node*> const& get_adjacent_nodes() const { return adjacent_nodes; }
        Node(double x, double y) : x(x), y(y) {}
        void add_adjacent(Node* other) {
            adjacent_nodes.push_back(other);
        }
    private:
        // Default constructor, needed by Boost when loading through a pointer.
        Node() : x(0), y(0) {}
        friend class boost::serialization::access;
        template <class Archive>
        void serialize(Archive& ar, const unsigned int) {
            ar & x;
            ar & y;
            // Serializing the neighbour pointers here is what makes the calls nest.
            ar & adjacent_nodes;
        }
    };

    class Simple_graph {
        std::vector<Node*> nodes;
        // Adds an undirected edge between the two nodes.
        void add_edge(int firstIndex, int secondIndex) {
            nodes[firstIndex]->add_adjacent(nodes[secondIndex]);
            nodes[secondIndex]->add_adjacent(nodes[firstIndex]);
        }
    public:
        /* methods to get the distance of points, to read in the nodes, and to generate edges */
        // The graph owns its nodes.
        ~Simple_graph() {
            for (auto node : nodes) {
                delete node;
            }
        }
    private:
        friend class boost::serialization::access;
        template <class Archive>
        void serialize(Archive& ar, const unsigned int) {
            ar & nodes;
        }
    };
Edit: adding some of the suggestions made in the thread mentioned above, quoting Dominique Devienne:
1) save all the nodes without their topology info on a first pass of the vector, thus recording all the "tracked" pointers for them, then write the topology info for each, since then you don't "recurse", you only write a "ref" to an already serialized pointer.
2) have the possibility to write a "weak reference" to a pointer, which only adds the pointer to the "tracking" map with a special flag saying it wasn't "really" written yet, such that writing the topology of a node that wasn't yet written is akin to "forward references" to those neighboring nodes. Either the node will really be written later on, or it never will, and I suppose serialization should handle that gracefully.
#1 doesn't require changes in boost serialization, but puts the onus on the client code. Especially since you have to "externally" save the neighbors, so it's no longer well encapsulated, and writing a subset of the surface's nodes become more complex.
#2 would require seeking ahead to read the actual object when encountering a forward reference, and furthermore a separate map to know where to seek for it. That may be incompatible with boost serialization (I confess to be mostly ignorant about it).
Is it possible to implement these suggestions now?
Best answer
Since you already have a vector containing pointers to all of your nodes, you can serialize the adjacent_nodes vector using indices instead of serializing the actual node data.
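When serializing a node, you need to convert the this pointer to an index. This is easiest if you can store the node's index in the node itself; otherwise you have to search nodes for the matching pointer (which can be sped up by building some kind of associative container that maps pointers to indices).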
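When you need to read the data back, you can create the initial nodes vector filled with pointers to empty/dummy nodes (which will be populated as they are deserialized).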
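If that isn't feasible, you can load the node indices into a temporary array, then go back and fill in the pointers once all nodes have been read. Either way, you won't have to seek or re-read any part of your file.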
Source: the Stack Overflow question "c++ - How to prevent a stack overflow when serializing a recursive graph structure with boost::serialization?": https://stackoverflow.com/questions/42117697/