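I'm using a function that converts a string into an array of bytes. I have this function in both PHP and JavaScript, but the two versions behave differently when I try them with these characters: 㬁愃膘ƘჀ䚐⦀飠噋&ӡ๨㏃棱쌌ص䌠
How can I make the results the same?
My code: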
function bytesFromWords($string) {
    $bytes = array();
    $j = strlen($string);
    for ($i = 0; $i < $j; $i++) {
        $char = ord(mb_substr($string, $i, 1));
        $bytes[] = $char >> 8;
        $bytes[] = $char & 0xFF;
    }
    return $bytes;
}
echo implode(',', bytesFromWords('㬁愃膘ƘჀ䚐⦀飠噋&ӡ๨㏃棱쌌ص䌠')); // result: 0,227,0,172,0,129,0,230,0,132,0,131,0,232,0,134,0,152,0,198,0,152,0,225,0,131,0,128,0,228,0,154,0,144,0,226,0,166,0,128,0,233,0,163,0,160,0,229,0,153,0,139,0,38,0,211,0,161,0,224,0,185,0,168,0,227,0,143,0,131,0,230,0,163,0,177,0,236,0,140,0,140,0,216,0,181,0,228,0,140,0,160
function bytesFromWords(string) {
    var bytes = [];
    for (var i = 0; i < string.length; i++) {
        var char = string.charCodeAt(i);
        bytes.push(char >>> 8);
        bytes.push(char & 0xFF);
    }
    return bytes;
}
console.log(bytesFromWords('㬁愃膘ƘჀ䚐⦀飠噋&ӡ๨㏃棱쌌ص䌠').toString()); // result: 59,1,97,3,129,152,1,152,16,192,70,144,41,128,152,224,86,75,0,38,4,225,14,104,51,195,104,241,195,12,6,53,67,32
Best answer
Problems:
- strlen does not count Unicode characters as expected.
- ord does not work with Unicode as expected.
- chr does not work with Unicode as expected.
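Problem with strlen
'㬁愃膘ƘჀ䚐⦀飠噋&ӡ๨㏃棱쌌ص䌠'.length returns 17, while strlen('㬁愃膘ƘჀ䚐⦀飠噋&ӡ๨㏃棱쌌ص䌠') returns 46. JavaScript's .length counts UTF-16 code units, whereas strlen counts bytes, and most of these characters take two or three bytes in UTF-8. To count characters instead, use: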
$j = preg_match_all('/.{1}/us', $string, $data);
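The three ways of counting can be compared side by side in JavaScript. This is only a sketch: the names codeUnits, sample, utf16Units, utf8Bytes and codePoints are made up for the illustration, and the string is rebuilt from the 16-bit values in the JavaScript result above so that it does not depend on how the page's characters were copied. It assumes a runtime that provides TextEncoder (modern browsers and Node.js do).

```javascript
// The 17 UTF-16 code units taken from the JavaScript output in the question
// (each pair "high,low" joined into one 16-bit value).
const codeUnits = [
  0x3B01, 0x6103, 0x8198, 0x0198, 0x10C0, 0x4690, 0x2980, 0x98E0, 0x564B,
  0x0026, 0x04E1, 0x0E68, 0x33C3, 0x68F1, 0xC30C, 0x0635, 0x4320
];
const sample = String.fromCharCode(...codeUnits);

// What JavaScript's .length counts: UTF-16 code units.
const utf16Units = sample.length;

// What PHP's strlen() counts when the string is UTF-8: bytes.
const utf8Bytes = new TextEncoder().encode(sample).length;

// What a per-character match such as /./us counts: code points.
const codePoints = Array.from(sample).length;

console.log(utf16Units, utf8Bytes, codePoints); // 17 46 17
```

For this particular string the code-unit count and the code-point count agree because every character is in the Basic Multilingual Plane.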
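Problem with ord
'㬁'.charCodeAt(0) returns 15105, while ord('㬁') returns 227, which is only the first byte of the character's UTF-8 encoding. To get the code point instead, use: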
function unicode_ord($char) {
list(, $ord) = unpack('N', mb_convert_encoding($char, 'UCS-4BE', 'UTF-8'));
return $ord;
}
Source: https://stackoverflow.com/a/10333307/1518921
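To see where the 227 comes from, the UTF-8 bytes of the character can be inspected from JavaScript. A small sketch, with illustrative variable names (ch, utf8, firstByte, codePoint) and again assuming TextEncoder is available:

```javascript
const ch = '\u3B01'; // the character 㬁

// The UTF-8 encoding of the character, as plain numbers.
const utf8 = Array.from(new TextEncoder().encode(ch));

// ord() only looks at the first byte of its argument.
const firstByte = utf8[0];

// charCodeAt(0) and unicode_ord() report the code point instead.
const codePoint = ch.codePointAt(0);

console.log(utf8, firstByte, codePoint); // [ 227, 172, 129 ] 227 15105
```

The three bytes 227, 172, 129 are also the first three non-zero values in the PHP result shown in the question.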
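Problem with chr
String.fromCharCode(15105) returns 㬁, while chr(15105) returns what looks like an empty string, because chr only produces a single byte. To fix it, use: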
function unicode_chr($u) {
return mb_convert_encoding('&#' . intval($u) . ';', 'UTF-8', 'HTML-ENTITIES');
}
Source: https://stackoverflow.com/a/9878531/1518921
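The arithmetic behind the blank result can be sketched in JavaScript. The assumption here, based on the PHP manual's description of chr, is that values outside 0..255 are reduced with a bitwise AND against 255; the variable names are illustrative only.

```javascript
const code = 15105; // 0x3B01, the code point of 㬁

// The single byte value left after masking with 255 (assumed chr() behaviour).
const lowByte = code & 0xFF;

// JavaScript builds the character from the full 16-bit value.
const fromJs = String.fromCharCode(code);

// The bytes a UTF-8 aware replacement for chr() has to emit instead.
const expectedUtf8 = Array.from(new TextEncoder().encode(fromJs));

console.log(lowByte, fromJs === '\u3B01', expectedUtf8); // 1 true [ 227, 172, 129 ]
```

Byte value 1 is a control character, which is why the output of chr appears empty.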
Full code:
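<?php
// Return the Unicode code point of a single UTF-8 character.
function unicode_ord($char) {
    // UCS-4BE is a fixed 4-byte big-endian encoding, so unpack('N') reads the code point directly.
    list(, $ord) = unpack('N', mb_convert_encoding($char, 'UCS-4BE', 'UTF-8'));
    return $ord;
}
// Return the UTF-8 character for a Unicode code point.
function unicode_chr($u) {
    // Build a numeric HTML entity such as &#15105; and decode it to UTF-8.
    return mb_convert_encoding('&#' . intval($u) . ';', 'UTF-8', 'HTML-ENTITIES');
}
// Rebuild a string from [high byte, low byte] pairs.
function bytesToWords($bytes) {
    $str = '';
    $j = count($bytes);
    for ($i = 0; $i < $j; $i += 2) {
        $char = $bytes[$i] << 8;
        if ($bytes[$i + 1]) {
            $char |= $bytes[$i + 1];
        }
        $str .= unicode_chr($char);
    }
    return $str;
}
// Split each character's code point into a high byte and a low byte.
function bytesFromWords($string) {
    $bytes = array();
    // The u modifier makes . match whole UTF-8 characters; s lets it match newlines too.
    preg_match_all('/.{1}/us', $string, $data);
    foreach ($data[0] as $char) {
        $char = unicode_ord($char);
        $bytes[] = $char >> 8;
        $bytes[] = $char & 0xFF;
    }
    return $bytes;
}
$data = bytesFromWords('㬁愃膘ƘჀ䚐⦀飠噋&ӡ๨㏃棱쌌ص䌠');
echo implode(', ', $data), '<br>';
echo bytesToWords($data);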
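For checking a round trip from the JavaScript side, a counterpart to bytesToWords could look like the sketch below. The JavaScript bytesToWords here is not part of the original post; it is an assumed mirror of the PHP function, and bmpBytes, roundTrip and astralBytes are illustrative names.

```javascript
// Same logic as the JavaScript function in the question.
function bytesFromWords(string) {
  const bytes = [];
  for (let i = 0; i < string.length; i++) {
    const unit = string.charCodeAt(i); // one UTF-16 code unit
    bytes.push(unit >>> 8);            // high byte
    bytes.push(unit & 0xFF);           // low byte
  }
  return bytes;
}

// Hypothetical inverse: join each [high, low] pair back into a code unit.
function bytesToWords(bytes) {
  let str = '';
  for (let i = 0; i < bytes.length; i += 2) {
    str += String.fromCharCode((bytes[i] << 8) | bytes[i + 1]);
  }
  return str;
}

const bmpBytes = bytesFromWords('\u3B01\u6103&'); // 㬁, 愃 and &
const roundTrip = bytesToWords(bmpBytes);
console.log(bmpBytes.join(','), roundTrip === '\u3B01\u6103&'); // 59,1,97,3,0,38 true

// A character outside the Basic Multilingual Plane is two code units in
// JavaScript, so it produces four bytes (a surrogate pair).
const astralBytes = bytesFromWords('\u{1F600}');
console.log(astralBytes.join(',')); // 216,61,222,0
```

One caveat follows from the last lines: for characters above U+FFFF the two languages would still disagree, because the PHP version works on whole code points (0x1F600 >> 8 is 502, which no longer fits in a byte) while JavaScript works on UTF-16 code units. The sample string in the question contains no such characters. As a side note, PHP 7.2 and later ship mb_ord() and mb_chr(), which appear to cover the same ground as unicode_ord and unicode_chr.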
A similar question about "javascript - output byte array differs from the PHP version" can be found on Stack Overflow: https://stackoverflow.com/questions/29785132/