serialization - Storing a set of protobufs on disk

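I use protobuf as the serializer to format data on disk. I may have a large set of protobuf objects, say, millions of them. What is the best way to lay them out on disk? The protobuf objects will either be read sequentially, one by one, or accessed randomly through an external index.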

I used to use a length(int)+protobuf_object+length(int)... format, but it fails if one of the protobufs happens to be corrupted. It may also incur some overhead if many of the protobuf objects are small.
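Both concerns can be illustrated with a small framing sketch. It assumes each message has already been serialized to a byte string (for example with SerializeToString) and uses a hypothetical record layout: a varint length (1 byte for payloads under 128 bytes instead of a fixed 4), the payload, and a 4-byte checksum. The names WriteRecord, ReadRecord and kMaxRecordSize are made up for illustration, and FNV-1a stands in for a real CRC32.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Assumed sanity limit; a larger length is treated as corruption.
const std::uint32_t kMaxRecordSize = 64u * 1024u * 1024u;

enum class ReadResult { kOk, kEndOfFile, kCorrupt };

// FNV-1a 32-bit hash, used here as a simple stand-in for a real CRC32.
std::uint32_t Checksum(const std::string& data) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : data) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Writes v as a base-128 varint: 7 bits per byte, high bit means "more follows".
void WriteVarint32(std::ostream& out, std::uint32_t v) {
    while (v >= 0x80) {
        out.put(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.put(static_cast<char>(v));
}

// Reads a varint of at most 5 bytes; returns false on EOF or malformed input.
bool ReadVarint32(std::istream& in, std::uint32_t* v) {
    std::uint32_t result = 0;
    for (int shift = 0; shift < 35; shift += 7) {
        int c = in.get();
        if (c == std::istream::traits_type::eof()) return false;
        result |= static_cast<std::uint32_t>(c & 0x7F) << shift;
        if ((c & 0x80) == 0) {
            *v = result;
            return true;
        }
    }
    return false;
}

// Record layout: varint(length) | payload | checksum (4 bytes, little-endian).
void WriteRecord(std::ostream& out, const std::string& payload) {
    assert(payload.size() <= kMaxRecordSize);
    WriteVarint32(out, static_cast<std::uint32_t>(payload.size()));
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    const std::uint32_t sum = Checksum(payload);
    for (int i = 0; i < 4; ++i) {
        out.put(static_cast<char>((sum >> (8 * i)) & 0xFF));
    }
}

// Reads one record. After a checksum mismatch the stream is positioned at
// the next record, so the caller may skip the bad one and keep reading.
ReadResult ReadRecord(std::istream& in, std::string* payload) {
    if (in.peek() == std::istream::traits_type::eof()) return ReadResult::kEndOfFile;
    std::uint32_t size = 0;
    if (!ReadVarint32(in, &size) || size > kMaxRecordSize) return ReadResult::kCorrupt;
    payload->resize(size);
    if (size > 0) {
        in.read(&(*payload)[0], static_cast<std::streamsize>(size));
        if (in.gcount() != static_cast<std::streamsize>(size)) return ReadResult::kCorrupt;
    }
    unsigned char tail[4];
    in.read(reinterpret_cast<char*>(tail), 4);
    if (in.gcount() != 4) return ReadResult::kCorrupt;
    std::uint32_t stored = 0;
    for (int i = 0; i < 4; ++i) {
        stored |= static_cast<std::uint32_t>(tail[i]) << (8 * i);
    }
    return stored == Checksum(*payload) ? ReadResult::kOk : ReadResult::kCorrupt;
}
```

Note that this only detects damage inside a payload. If the length prefix itself is damaged, the reader can no longer find the next record boundary; recovering from that would need something extra, such as a sync marker between records.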

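Best answer

If you only need sequential access, the easiest way to store multiple messages is to write the size of each object before it, as the documentation recommends: http://developers.google.com/protocol-buffers/docs/techniques#streaming

For example, you can create a "MessagesFile" class with the following member functions to open, read, and write messages: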

// Headers needed by the member functions shown here:
//   <fcntl.h>, <sys/stat.h>, <string>,
//   <google/protobuf/message.h>,
//   <google/protobuf/io/coded_stream.h>,
//   <google/protobuf/io/zero_copy_stream_impl.h>

// The file is opened in append mode and wrapped in
// a FileOutputStream and a CodedOutputStream.
bool Open(const std::string& filename,
          int buffer_size = kDefaultBufferSize) {

    file_ = open(filename.c_str(),
                 O_WRONLY | O_APPEND | O_CREAT,           // open mode
                 S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);  // file permissions (rw-r--r--)

    if (file_ == -1) {
        return false;
    }
    file_ostream_ = new FileOutputStream(file_, buffer_size);
    ostream_ = new CodedOutputStream(file_ostream_);
    return true;
}

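// Appends a new message, preceded by its size as a
// 4-byte little-endian integer.
bool Serialize(const google::protobuf::Message& message) {
    // Newer protobuf releases deprecate ByteSize() in favor of ByteSizeLong().
    ostream_->WriteLittleEndian32(message.ByteSize());
    return message.SerializeToCodedStream(ostream_);
}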

// Reads the next message using a FileInputStream
// wrapped in a CodedInputStream (istream_).
bool Next(google::protobuf::Message* msg) {
    google::protobuf::uint32 size;
    if (!istream_->ReadLittleEndian32(&size)) {
        return false;  // end of file or read error
    }
    // Restrict parsing to the bytes that belong to this message.
    CodedInputStream::Limit msg_limit = istream_->PushLimit(size);
    bool parsed = msg->ParseFromCodedStream(istream_);
    // Pop the limit even when parsing fails, so the stream's limit stack stays balanced.
    istream_->PopLimit(msg_limit);
    return parsed;
}

Then, to write your messages, use:

MessagesFile file;
file.Open("your_file.dat");

file.Serialize(your_message1);
file.Serialize(your_message2);
...
// close the file

To read all your messages:

MessagesFile reader;
reader.Open("your_file.dat");

MyMsg msg;
while( reader.Next(&msg) ) {
    // use your message
}
...
// close the file
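
For the random-access case mentioned in the question, one possible approach is to record each record's starting byte offset while writing and keep those offsets as the external index. The following is only a sketch under assumptions: it uses the same 4-byte little-endian size prefix as Serialize above, works on already-serialized payloads held in std::string, uses plain iostreams rather than the protobuf stream classes, and assumes writing starts at offset 0 of an empty file. IndexedWriter and ReadAt are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Appends size-prefixed records and remembers where each one starts.
// Assumes the output stream is positioned at offset 0 of an empty file.
class IndexedWriter {
public:
    explicit IndexedWriter(std::ostream& out) : out_(out) {}

    void Append(const std::string& payload) {
        offsets_.push_back(next_offset_);
        const std::uint32_t size = static_cast<std::uint32_t>(payload.size());
        char header[4];
        for (int i = 0; i < 4; ++i) {
            header[i] = static_cast<char>((size >> (8 * i)) & 0xFF);
        }
        out_.write(header, 4);
        out_.write(payload.data(), static_cast<std::streamsize>(size));
        next_offset_ += 4u + size;
    }

    // The external index: offsets()[i] is the byte offset of record i.
    const std::vector<std::uint64_t>& offsets() const { return offsets_; }

private:
    std::ostream& out_;
    std::uint64_t next_offset_ = 0;
    std::vector<std::uint64_t> offsets_;
};

// Seeks to a record's offset and reads its payload.
// Returns false if the offset is invalid or the record is truncated.
bool ReadAt(std::istream& in, std::uint64_t offset, std::string* payload) {
    in.clear();  // reset any EOF/fail state left by an earlier read
    in.seekg(static_cast<std::streamoff>(offset));
    unsigned char header[4];
    if (!in.read(reinterpret_cast<char*>(header), 4)) return false;
    std::uint32_t size = 0;
    for (int i = 0; i < 4; ++i) {
        size |= static_cast<std::uint32_t>(header[i]) << (8 * i);
    }
    payload->resize(size);
    if (size > 0 && !in.read(&(*payload)[0], static_cast<std::streamsize>(size))) {
        return false;
    }
    return true;
}
```

The payload returned by ReadAt could then be handed to the message's ParseFromString. The offsets themselves would have to be persisted somewhere, for example in a separate index file, which this sketch leaves out.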

Source: the original question, "serialization - Storing a set of protobufs on disk", on Stack Overflow: https://stackoverflow.com/questions/14227355/
