c++ - How to read a UTF-8 encoded Java Unicode byte string in C++

I have a raw message that is stored in Mongo as a string, produced on the Java side with the following call:

data.toByteString().toStringUtf8();

This is nothing more than Unicode text encoded as UTF-8.

Now I am trying to read the same value from Mongo on the C++ side using the approach below:

std::wstring str(mongoData.get_utf8().value.to_string().begin(), mongoData.get_utf8().value.to_string().end());

String str1(boost::locale::conv::utf_to_utf<char>(str.c_str(), str.c_str() + str.size()));

However, after running the code above, str1 contains corrupted data.

Please help me understand what I am doing wrong. Thanks.

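Best answer

Just a guess here: mongoData.get_utf8().value.to_string() returns a string by value.

That means the begin and end iterators are completely unrelated, because they come from two different temporary strings.

The simple solution is to create your own copy of the string and take both iterators from that copy: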
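To illustrate the point, here is a minimal sketch. `fetch_value` is a hypothetical stand-in for `mongoData.get_utf8().value.to_string()`; it only mimics the "returns by value" behaviour and is not part of mongocxx. The sketch shows that two calls produce two separate string objects, and that taking both iterators from one named copy gives a valid range.

```cpp
#include <string>

// Hypothetical stand-in for mongoData.get_utf8().value.to_string():
// like that call, it returns a fresh std::string by value each time.
std::string fetch_value() {
    return "hello";
}

// Returns true when two separate calls yield two separate string objects,
// which is why begin() of one and end() of the other do not form a valid range.
bool calls_yield_distinct_strings() {
    const std::string& first = fetch_value();   // temporary kept alive by the reference
    const std::string& second = fetch_value();  // a second, unrelated temporary
    return first.data() != second.data();
}

// Safe pattern: name the copy once, then take both iterators from it.
std::wstring widen_from_single_copy() {
    const std::string copy = fetch_value();
    return std::wstring(copy.begin(), copy.end());
}
```

Passing iterators from two different containers to a range constructor is undefined behaviour, so the sketch deliberately compares the buffers instead of building a string from the mismatched pair.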

auto mongo_string = mongoData.get_utf8().value.to_string();
std::wstring str(mongo_string.begin(), mongo_string.end());
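One further point that may matter for non-ASCII text: the iterator-range constructor of std::wstring copies each byte into its own wchar_t and does not decode UTF-8, so a multi-byte character ends up as several wide characters. The sketch below contrasts that with actual decoding. `widen_bytes` and `decode_utf8` are hypothetical helpers written for this illustration, not mongocxx or Boost functions, and the decoder is deliberately minimal: it replaces malformed or truncated sequences with U+FFFD but does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string>

// Byte-wise widening: one wchar_t per input byte, no UTF-8 decoding.
std::wstring widen_bytes(const std::string& bytes) {
    return std::wstring(bytes.begin(), bytes.end());
}

// Minimal UTF-8 decoder sketch (hypothetical helper).
// Produces one char32_t per code point; bad input becomes U+FFFD.
std::u32string decode_utf8(const std::string& bytes) {
    const char32_t kReplacement = 0xFFFD;
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead >> 5) == 0x06) {
            cp = lead & 0x1F; extra = 1;
        } else if ((lead >> 4) == 0x0E) {
            cp = lead & 0x0F; extra = 2;
        } else if ((lead >> 3) == 0x1E) {
            cp = lead & 0x07; extra = 3;
        } else {
            out.push_back(kReplacement);  // stray continuation or invalid lead byte
            ++i;
            continue;
        }
        if (i + extra >= bytes.size()) {
            out.push_back(kReplacement);  // sequence truncated at end of input
            break;
        }
        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) {
            out.push_back(kReplacement);
            ++i;  // resynchronise on the next byte
            continue;
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For the two-byte UTF-8 sequence C3 A9 (the letter é), `widen_bytes` yields two wide characters while `decode_utf8` yields the single code point U+00E9. If Boost.Locale is already in use, calling boost::locale::conv::utf_to_utf<wchar_t> directly on the std::string copy is a ready-made alternative to a hand-written decoder.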

Source: Stack Overflow, https://stackoverflow.com/questions/59152888/
