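I am trying to use the sparse matrix functionality in Armadillo and am having some trouble serializing it. The matrix I am working with is very large and most of its components are zero, so using sp_mat makes sense. Here is the code: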
#include <iostream>
#include <fstream>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <armadillo>
#include <boost/serialization/split_member.hpp>

BOOST_SERIALIZATION_SPLIT_FREE(arma::sp_mat)

namespace boost {
namespace serialization {

template<class Archive>
void save(Archive & ar, const arma::sp_mat &t, unsigned int version)
{
    ar & t.n_rows;
    ar & t.n_cols;
    const double *data = t.memptr();
    for(int K=0; K<t.n_elem; ++K)
        ar & data[K];
}

template<class Archive>
void load(Archive & ar, arma::sp_mat &t, unsigned int version)
{
    int rows, cols;
    ar & rows;
    ar & cols;
    t.set_size(rows, cols);
    double *data = t.memptr();
    for(int K=0; K<t.n_elem; ++K)
        ar & data[K];
}

}}

int main() {
    arma::mat C(3,3, arma::fill::randu);
    C(1,1) = 0; //example so that a few of the components are u
    C(1,2) = 0;
    C(0,0) = 0;
    C(2,1) = 0;
    C(2,0) = 0;
    arma::sp_mat A = arma::sp_mat(C);
    std::ofstream outputStream;
    outputStream.open("bin.dat");
    std::ostringstream oss;
    boost::archive::binary_oarchive oa(outputStream);
    oa & A;
    outputStream.close();
    arma::sp_mat B;
    std::ifstream inputStream;
    inputStream.open("bin.dat", std::ifstream::in);
    boost::archive::binary_iarchive ia(inputStream);
    ia & B;
    return 0;
}
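The current problem is that sp_mat has no memptr() member, so serializing the components the way it is done in, for example, lines 10-12 does not work for sp_mat. Does anyone know a workaround? I also find it odd that when I print the components of A individually, the zeros still appear to be in memory, even though a sparse matrix is supposed to skip them. For example, printing A(1,1) gives 0. This is what A looks like when printed: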
[matrix size: 3x3; n_nonzero: 4; density: 44.44%]
(1, 0) 0.2505
(0, 1) 0.9467
(0, 2) 0.2513
(2, 2) 0.5206
Best answer
The number of elements in a matrix is always n × m, regardless of the storage strategy (sparse or dense).
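So you should not be surprised that you can read the "0" cells: they may not be stored, but they clearly matter for the computation, so you should be able to retrieve their values.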
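To make the point about implicit zeros concrete, here is a minimal sketch in standard C++. ToySparse and all of its members are hypothetical names invented for this illustration; this is not how Armadillo implements sp_mat. It only shows that the logical element count can stay at n × m while storage holds just the non-zero cells.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical toy container (not Armadillo's implementation):
// only non-zero cells are stored, yet every (row, col) is readable.
struct ToySparse {
    std::size_t n_rows;
    std::size_t n_cols;
    // Stored entries only, keyed by (row, col).
    std::map<std::pair<std::size_t, std::size_t>, double> cells;

    ToySparse(std::size_t rows, std::size_t cols) : n_rows(rows), n_cols(cols) {}

    // Logical element count: always rows * cols, whatever is stored.
    std::size_t n_elem() const { return n_rows * n_cols; }

    // Number of entries that actually occupy memory.
    std::size_t n_nonzero() const { return cells.size(); }

    void set(std::size_t r, std::size_t c, double v) {
        assert(r < n_rows && c < n_cols);
        const auto key = std::make_pair(r, c);
        if (v == 0.0) {
            cells.erase(key);  // zeros are never kept in storage
        } else {
            cells[key] = v;
        }
    }

    // Reading a cell that is not stored yields an implicit zero.
    double at(std::size_t r, std::size_t c) const {
        assert(r < n_rows && c < n_cols);
        const auto it = cells.find(std::make_pair(r, c));
        return it == cells.end() ? 0.0 : it->second;
    }
};
```

With a 3x3 ToySparse holding two values, n_elem() is 9, n_nonzero() is 2, and at(1, 1) returns 0 without that cell ever having been stored.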
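Given that, your sketch (which I assume was copied and pasted from code specific to non-sparse matrices, hence the memptr()) would always store non-sparse data, because you iterate over all n_elem elements. But data cannot point to contiguous storage of every cell: unless the memory layout directly matches the matrix dimensions (dense storage, row-major or column-major), the matrix would have no way of knowing which cell is which.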
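As an illustration of why a flat pointer to the values is not enough, the sketch below builds a compressed sparse column (CSC) layout in standard C++. CSC is the format Armadillo documents for sp_mat. The Csc struct and to_csc function are hypothetical helpers written for this example, with fields named after the values, row_indices and col_ptrs arrays that the format uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration only.
struct Csc {
    std::size_t n_rows = 0;
    std::size_t n_cols = 0;
    std::vector<double> values;            // non-zero values, column by column
    std::vector<std::size_t> row_indices;  // row of each stored value
    std::vector<std::size_t> col_ptrs;     // column c spans [col_ptrs[c], col_ptrs[c + 1])
};

// Build CSC arrays from a dense column-major buffer of n_rows * n_cols doubles.
Csc to_csc(const std::vector<double>& dense, std::size_t n_rows, std::size_t n_cols) {
    assert(dense.size() == n_rows * n_cols);
    Csc out;
    out.n_rows = n_rows;
    out.n_cols = n_cols;
    out.col_ptrs.push_back(0);
    for (std::size_t c = 0; c < n_cols; ++c) {
        for (std::size_t r = 0; r < n_rows; ++r) {
            const double v = dense[c * n_rows + r];
            if (v != 0.0) {
                out.values.push_back(v);
                out.row_indices.push_back(r);
            }
        }
        // Mark where the next column starts in the values array.
        out.col_ptrs.push_back(out.values.size());
    }
    return out;
}
```

For the 3x3 example matrix in the question, the values array holds only four doubles. Position K in that array says nothing about the cell it belongs to without row_indices and col_ptrs, which is why a loop over n_elem raw doubles cannot work.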
Based on the information in "Returning locations and values of a sparse matrix in armadillo c++", here is a fixed implementation.
Full code (tested on my machine):
#include <armadillo>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/serialization/split_free.hpp> // defines BOOST_SERIALIZATION_SPLIT_FREE
#include <cassert>
#include <fstream>
#include <iostream>

BOOST_SERIALIZATION_SPLIT_FREE(arma::sp_mat)

namespace boost { namespace serialization {

// Write the dimensions and the non-zero count, then one
// (row, col, value) triple per stored element.
template<class Archive>
void save(Archive & ar, const arma::sp_mat &t, unsigned) {
    ar & t.n_rows & t.n_cols & t.n_nonzero;
    for (auto it = t.begin(); it != t.end(); ++it) {
        ar & it.row() & it.col() & *it;
    }
}

// Read everything back in the same order it was written.
template<class Archive>
void load(Archive & ar, arma::sp_mat &t, unsigned) {
    arma::uword r, c, nz; // same type that save() writes, so the binary layout matches
    ar & r & c & nz;
    t.zeros(r, c); // resize and drop any previous contents
    while (nz--) {
        double v;
        ar & r & c & v; // r and c now hold this entry's coordinates
        t(r, c) = v;
    }
}

}} // namespace boost::serialization

int main() {
    arma::mat C(3, 3, arma::fill::randu);
    C(0, 0) = 0;
    C(1, 1) = 0; // example so that a few of the components are u
    C(1, 2) = 0;
    C(2, 0) = 0;
    C(2, 1) = 0;

    {
        arma::sp_mat const A = arma::sp_mat(C);
        assert(A.n_nonzero == 4);
        A.print("A: ");
        std::ofstream outputStream("bin.dat", std::ios::binary);
        boost::archive::binary_oarchive oa(outputStream);
        oa& A;
    }

    {
        std::ifstream inputStream("bin.dat", std::ios::binary);
        boost::archive::binary_iarchive ia(inputStream);
        arma::sp_mat B(3,3);
        B(0,0) = 77; // some old data should be cleared
        ia& B;
        B.print("B: ");
    }
}
Prints:
A:
[matrix size: 3x3; n_nonzero: 4; density: 44.44%]
(1, 0) 0.2505
(0, 1) 0.9467
(0, 2) 0.2513
(2, 2) 0.5206
B:
[matrix size: 3x3; n_nonzero: 4; density: 44.44%]
(1, 0) 0.2505
(0, 1) 0.9467
(0, 2) 0.2513
(2, 2) 0.5206
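The save/load pair above writes a header (rows, columns, non-zero count) followed by one (row, col, value) triple per stored element. The sketch below reproduces that layout with only the standard library, so the round trip can be followed without Armadillo or Boost. Triplet, Coo, save_coo and load_coo are hypothetical names made up for this example, and the raw byte dump is a stand-in for the Boost binary archive, not its actual on-disk format.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical plain types standing in for a sparse matrix.
struct Triplet { std::uint64_t row; std::uint64_t col; double value; };
struct Coo {
    std::uint64_t n_rows = 0;
    std::uint64_t n_cols = 0;
    std::vector<Triplet> entries;  // non-zero cells only
};

template <class T>
void write_raw(std::ostream& os, const T& v) {
    os.write(reinterpret_cast<const char*>(&v), sizeof v);
}

template <class T>
void read_raw(std::istream& is, T& v) {
    is.read(reinterpret_cast<char*>(&v), sizeof v);
}

// Header (rows, cols, count), then one (row, col, value) triple per entry.
void save_coo(std::ostream& os, const Coo& m) {
    write_raw(os, m.n_rows);
    write_raw(os, m.n_cols);
    write_raw(os, static_cast<std::uint64_t>(m.entries.size()));
    for (const Triplet& t : m.entries) {
        assert(t.row < m.n_rows && t.col < m.n_cols);
        write_raw(os, t.row);
        write_raw(os, t.col);
        write_raw(os, t.value);
    }
}

// Returns false if the stream ended before all announced entries were read.
bool load_coo(std::istream& is, Coo& m) {
    std::uint64_t nz = 0;
    read_raw(is, m.n_rows);
    read_raw(is, m.n_cols);
    read_raw(is, nz);
    m.entries.clear();  // discard old contents, like t.zeros(r, c) above
    while (is && nz--) {
        Triplet t{};
        read_raw(is, t.row);
        read_raw(is, t.col);
        read_raw(is, t.value);
        if (is) m.entries.push_back(t);
    }
    return static_cast<bool>(is);
}
```

Saving a Coo into a std::stringstream opened in binary mode and loading it into another Coo that already holds stale entries should give back the same dimensions and triples, with the stale data gone. The cost of the stored data scales with the number of non-zeros rather than with n × m.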
Source: "c++ - Serializing a sparse matrix from Armadillo with Boost", a similar question found on Stack Overflow: https://stackoverflow.com/questions/61168585/