dart - How do I convert between bytes and Unicode in Dart?

Tags: dart unicode byte codepoint

I tried to implement Irn's answer from "How to work with char types in Dart? (Print alphabet)", but I could not work out exactly how to do it.

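For example, in my Dart code the capital letter İ (I with a dot above) is represented as the value [304]. I have to replace it with the server's byte before sending, and in the other direction turn the server's byte [152] back into İ as a string.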

Test 1:

import 'dart:convert';

// Despite the names, these tables map between Unicode code points and ISO-8859-9 (Turkish) bytes.
int cpToLatin9(int cp) => const {0x11e: 0xd0, 0x11f: 0xf0, 0x130: 0xdd, 0x131: 0xfd, 0x15e: 0xde, 0x15f: 0xfe}[cp] ?? cp;
int latin9ToCp(int latin9Char) => const {0xd0: 0x11e, 0xf0: 0x11f, 0xdd: 0x130, 0xfd: 0x131, 0xde: 0x15e, 0xfe: 0x15f}[latin9Char] ?? latin9Char;

void main() {
  String iso08859_9 = "çÇğĞıİşŞöÖüÜ";
  // utf8.encode produces UTF-8 bytes, two per letter here, not ISO-8859-9 bytes.
  List<int> bytes = utf8.encode(iso08859_9);
  print("iso08859_9 bytes: $bytes");

  for (var i = 0; i < bytes.length; i++) {
    int byte = bytes[i];
    var latin9ToCodePoint = latin9ToCp(byte);
    print("latin9ToCodePoint: $latin9ToCodePoint");

    var latin9ToCodePointChar = String.fromCharCode(latin9ToCodePoint);
    print("latin9ToCodePointChar: $latin9ToCodePointChar");
  }
}

Output:
iso08859_9 bytes: [195, 167, 195, 135, 196, 159, 196, 158, 196, 177, 196, 176, 197, 159, 197, 158, 195, 182, 195, 150, 195, 188, 195, 156]
latin9ToCodePoint: 195
latin9ToCodePointChar: Ã
latin9ToCodePoint: 167
latin9ToCodePointChar: §
latin9ToCodePoint: 195
latin9ToCodePointChar: Ã
latin9ToCodePoint: 135
latin9ToCodePointChar: 
latin9ToCodePoint: 196
latin9ToCodePointChar: Ä
latin9ToCodePoint: 159
latin9ToCodePointChar: 
latin9ToCodePoint: 196
latin9ToCodePointChar: Ä
latin9ToCodePoint: 158
latin9ToCodePointChar: 
latin9ToCodePoint: 196
latin9ToCodePointChar: Ä
latin9ToCodePoint: 177
latin9ToCodePointChar: ±
latin9ToCodePoint: 196
latin9ToCodePointChar: Ä
latin9ToCodePoint: 176
latin9ToCodePointChar: °
latin9ToCodePoint: 197
latin9ToCodePointChar: Å
latin9ToCodePoint: 159
latin9ToCodePointChar: 
latin9ToCodePoint: 197
latin9ToCodePointChar: Å
latin9ToCodePoint: 158
latin9ToCodePointChar: 
latin9ToCodePoint: 195
latin9ToCodePointChar: Ã
latin9ToCodePoint: 182
latin9ToCodePointChar: ¶
latin9ToCodePoint: 195
latin9ToCodePointChar: Ã
latin9ToCodePoint: 150
latin9ToCodePointChar: 
latin9ToCodePoint: 195
latin9ToCodePointChar: Ã
latin9ToCodePoint: 188
latin9ToCodePointChar: ¼
latin9ToCodePoint: 195
latin9ToCodePointChar: Ã
latin9ToCodePoint: 156
latin9ToCodePointChar: 
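
The output above is what one would expect when UTF-8 bytes are passed to latin9ToCp: utf8.encode yields two bytes for each of these letters, and each byte is then treated as a character of its own. A minimal sketch of the alternative, mapping each code point (rune) to a single byte and back while reusing the tables from Test 1. The helper names encodeLatin5 and decodeLatin5 are hypothetical, and the sketch assumes the peer really expects ISO-8859-9:

```dart
// Tables from Test 1: the six positions where ISO-8859-9 differs from Latin-1.
int cpToLatin9(int cp) =>
    const {0x11e: 0xd0, 0x11f: 0xf0, 0x130: 0xdd, 0x131: 0xfd, 0x15e: 0xde, 0x15f: 0xfe}[cp] ?? cp;
int latin9ToCp(int b) =>
    const {0xd0: 0x11e, 0xf0: 0x11f, 0xdd: 0x130, 0xfd: 0x131, 0xde: 0x15e, 0xfe: 0x15f}[b] ?? b;

// Hypothetical helper: one byte per code point.
// Throws if a character has no single-byte form or would not round-trip.
List<int> encodeLatin5(String s) => s.runes.map((cp) {
      final b = cpToLatin9(cp);
      if (b > 0xff || latin9ToCp(b) != cp) {
        throw FormatException('No ISO-8859-9 byte for code point $cp');
      }
      return b;
    }).toList();

// Hypothetical helper: the inverse mapping, byte by byte.
String decodeLatin5(List<int> bytes) =>
    String.fromCharCodes(bytes.map(latin9ToCp));

void main() {
  const text = 'çÇğĞıİşŞöÖüÜ';
  final bytes = encodeLatin5(text);
  print(bytes); // [231, 199, 240, 208, 253, 221, 254, 222, 246, 214, 252, 220]
  print(decodeLatin5(bytes) == text); // true
}
```

The round-trip check in encodeLatin5 rejects the six Latin-1 letters (such as Ð, U+00D0) whose byte positions ISO-8859-9 reassigns to Turkish letters, so they are not silently turned into different characters.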

Test 2:
If I change List<int> bytes = utf8.encode(iso08859_9);

to List<int> bytes = iso08859_9.codeUnits; I get a different result.

In my Dart code I checked, and the iso08859_9 characters are represented as:

iso08859_9 bytes: [231, 199, 287, 286, 305, 304, 351, 350, 246, 214, 252, 220]

Now my main problem is converting these values to the IBM CP bytes:

[135, 128, 167, 166, 141, 152, 159, 158, 148, 153, 129, 154]

When I do this and send the result with socket.add(ibmcp_bytes), the server still cannot read the characters.
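
The target values listed above coincide with the positions of these letters in IBM code page 857 (the DOS Turkish code page). Whether the server actually expects CP857 is an assumption, since the question does not name the code page. dart:convert ships no codec for it, so a small lookup table is one way to sketch the conversion. The names toIbmCp and fromIbmCp are hypothetical, and only ASCII plus the twelve letters from the question are covered:

```dart
// Code point -> byte, taken pairwise from the two lists in the question.
const Map<int, int> _cpToIbm = {
  231: 135, 199: 128, 287: 167, 286: 166, 305: 141, 304: 152,
  351: 159, 350: 158, 246: 148, 214: 153, 252: 129, 220: 154,
};

// Byte -> code point, derived from the table above.
final Map<int, int> _ibmToCp = {
  for (final e in _cpToIbm.entries) e.value: e.key,
};

// Hypothetical helper: String -> single-byte values for the server.
List<int> toIbmCp(String s) => s.runes.map((cp) {
      if (cp < 0x80) return cp; // ASCII is unchanged
      final b = _cpToIbm[cp];
      if (b == null) throw FormatException('Unmapped code point $cp');
      return b;
    }).toList();

// Hypothetical helper: bytes received from the server -> String.
String fromIbmCp(List<int> bytes) => String.fromCharCodes(bytes.map((b) {
      if (b < 0x80) return b; // ASCII is unchanged
      final cp = _ibmToCp[b];
      if (cp == null) throw FormatException('Unmapped byte $b');
      return cp;
    }));

void main() {
  final bytes = toIbmCp('çÇğĞıİşŞöÖüÜ');
  print(bytes); // [135, 128, 167, 166, 141, 152, 159, 158, 148, 153, 129, 154]
  print(fromIbmCp(bytes)); // çÇğĞıİşŞöÖüÜ
  // With a connected dart:io Socket named socket, these bytes would be
  // sent as-is via socket.add(bytes), without any UTF-8 encoding step.
}
```

If the server still shows unreadable characters with these bytes, the code page assumption is the first thing to check against the server's documentation.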

Update:

Try this: the results are 1 and 2. Why?
print("ç".runes.length);
print(utf8.encode("ç").length);
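
The two numbers measure different things: runes counts Unicode code points, while utf8.encode returns the bytes of the UTF-8 encoding, and ç (U+00E7) lies outside ASCII, so it takes two bytes. A small sketch that puts the three views of the same string side by side:

```dart
import 'dart:convert';

void main() {
  const s = 'ç';
  print(s.runes.toList()); // [231]       one code point, U+00E7
  print(s.codeUnits);      // [231]       one UTF-16 code unit
  print(utf8.encode(s));   // [195, 167]  two UTF-8 bytes
  print(utf8.decode([195, 167]) == s); // true
}
```

This is also why Test 1 printed pairs such as 195, 167 for a single letter.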

Best answer

Just found this out today:

import 'dart:convert';

// Sample array of UTF-8 encoded bytes
var someArray = [50, 198, 167, 52, 198, 167, 54, 198, 167, 67, 111, 108, 111, 114, 40, 48, 120, 102, 102, 100, 99, 100, 100, 101, 50, 41, 207, 159, 48, 207, 159, 48, 207, 159, 49, 207, 159, 104, 116, 116, 112, 115, 58, 47, 47, 99, 111, 109, 112, 97, 116, 105, 111, 46, 103, 105, 116, 104, 117, 98, 46, 105, 111, 47, 116, 114, 97, 110, 115, 112, 97, 114, 101, 110, 116, 70, 105, 108, 108, 101, 114, 46, 112, 110, 103, 207, 159, 73, 109, 97, 103, 101, 82, 101, 112, 101, 97, 116, 46, 110, 111, 82, 101, 112, 101, 97, 116, 207, 159, 66, 111, 120, 70, 105, 116, 46, 99, 111, 118, 101, 114, 207, 159, 50, 48, 207, 159, 98, 111, 116, 116, 111, 109, 76, 101, 102, 116, 198, 167, 72, 111, 109, 101, 80, 97, 103, 101, 67, 111, 109, 112, 111, 110, 101, 110, 116, 84, 121, 112, 101, 115, 46, 66, 97, 110, 110, 101, 114, 49, 198, 171, 54, 198, 171, 50, 198, 171, 79, 112, 101, 110, 32, 82, 111, 97, 100, 115, 207, 159, 83, 105, 122, 105, 110, 103, 46, 72, 101, 97, 100, 76, 105, 110, 101, 50, 207, 159, 67, 111, 108, 111, 114, 40, 48, 120, 102, 102, 102, 102, 102, 102, 102, 102, 41, 207, 159, 84, 101, 120, 116, 65, 108, 105, 103, 110, 46, 114, 105, 103, 104, 116, 198, 132, 207, 159, 83, 105, 122, 105, 110, 103, 46, 83, 117, 98, 116, 105, 116, 108, 101, 50, 207, 159, 67, 111, 108, 111, 114, 40, 48, 120, 102, 102, 102, 102, 102, 102, 102, 102, 41, 207, 159, 84, 101, 120, 116, 65, 108, 105, 103, 110, 46, 99, 101, 110, 116, 101, 114, 198, 132, 83, 104, 111, 112, 32, 67, 111, 108, 108, 101, 99, 116, 105, 111, 110, 207, 159, 83, 105, 122, 105, 110, 103, 46, 66, 117, 116, 116, 111, 110, 207, 159, 67, 111, 108, 111, 114, 40, 48, 120, 102, 102, 48, 48, 48, 48, 48, 48, 41, 207, 159, 99, 101, 110, 116, 101, 114, 207, 159, 67, 111, 108, 111, 114, 40, 48, 120, 102, 102, 101, 99, 97, 101, 49, 57, 41, 207, 159, 104, 116, 116, 112, 58, 47, 47, 115, 117, 110, 115, 104, 105, 110, 101, 98, 105, 107, 101, 46, 99, 111, 109, 47, 207, 159, 49, 50, 53, 207, 159, 51, 48, 207, 159, 67, 111, 108, 111, 114, 40, 48, 120, 
48, 48, 48, 48, 48, 48, 48, 48, 41, 198, 132, 67, 111, 108, 111, 114, 40, 48, 120, 102, 102, 48, 48, 48, 48, 48, 48, 41, 207, 159, 48, 207, 159, 48, 207, 159, 48, 46, 50, 57, 48, 52, 55, 54, 49, 57, 48, 52, 55, 54, 49, 56, 57, 54, 55, 207, 159, 104, 116, 116, 112, 115, 58, 47, 47, 99, 111, 109, 112, 97, 116, 105, 111, 46, 103, 105, 116, 104, 117, 98, 46, 105, 111, 47, 111, 112, 101, 110, 45, 114, 111, 97, 100, 46, 106, 112, 103, 207, 159, 73, 109, 97, 103, 101, 82, 101, 112, 101, 97, 116, 46, 110, 111, 82, 101, 112, 101, 97, 116, 207, 159, 66, 111, 120, 70, 105, 116, 46, 99, 111, 118, 101, 114, 207, 159, 48, 207, 159, 99, 101, 110, 116, 101, 114]; 

// Utf8Decoder has a const constructor; utf8.decode(someArray) is an equivalent shorthand
const Utf8Decoder decode = Utf8Decoder();

// Convert a list of UTF-8 bytes to a String
// Decodes to something like 2Ƨ4Ƨ6ƧColor(0xffdcdde2)ϟ0ϟ0ϟ1ϟ...
String content = decode.convert(someArray);
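
As a complement to the snippet above, the top-level utf8 codec from dart:convert offers the same conversion in both directions. A short sketch on a shortened byte list (the first seven values of someArray). The allowMalformed flag is shown because bytes in a legacy single-byte encoding, such as the 152 from the question, are usually not valid UTF-8:

```dart
import 'dart:convert';

void main() {
  final bytes = [50, 198, 167, 52, 198, 167, 54];

  // Bytes -> String and back again.
  final text = utf8.decode(bytes);
  print(text); // 2Ƨ4Ƨ6
  print(utf8.encode(text)); // [50, 198, 167, 52, 198, 167, 54]

  // A lone 0x98 (152) is not valid UTF-8: strict decoding throws a
  // FormatException, while allowMalformed substitutes U+FFFD.
  print(utf8.decode([152], allowMalformed: true).codeUnits); // [65533]
}
```

Note that this only applies when the bytes really are UTF-8; it does not convert to or from a single-byte code page.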

Regarding "dart - How do I convert between bytes and Unicode in Dart?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55477965/
