I have the following three cases, for which I expect the same output, but something odd happens with $str1:
<?php
$str1 = "— lorem lorem Alice lorem lorem lorem loremlorem";
//       ^ Why would a dash make a difference in the found index?
$str2 = "a lorem lorem Alice lorem lorem lorem loremlorem";
$str3 = "  lorem lorem Alice lorem lorem lorem loremlorem";
// The found index is always the same
$foundIndex = mb_stripos($str1, "Alice");
var_dump(substr($str1, $foundIndex - 6, 24), $foundIndex);
$foundIndex = mb_stripos($str2, "Alice");
var_dump(substr($str2, $foundIndex - 6, 24), $foundIndex);
$foundIndex = mb_stripos($str3, "Alice");
var_dump(substr($str3, $foundIndex - 6, 24), $foundIndex);
Output:
string(24) "m lorem Alice lorem lore" << Why is this shifted to the right?
int(14)
string(24) "lorem Alice lorem lorem "
int(14)
string(24) "lorem Alice lorem lorem "
int(14)
You can test it here.
I use mb_stripos and substr to search within strings, and this is the PoC.
Why does this happen? How can I fix the behavior for strings that contain special characters?
After a quick check, I think "—" takes up three bytes, while substr works on bytes rather than characters. strlen("—") is also 3.
...
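A quick way to confirm the byte count is to compare strlen() with mb_strlen(). This is only a minimal sketch and assumes the source file is saved as UTF-8, so that "—" is U+2014 (EM DASH); the variable name $dash is just for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so "—" is U+2014 (EM DASH).
$dash = "—";

var_dump(strlen($dash));             // int(3): strlen() counts bytes
var_dump(mb_strlen($dash, 'UTF-8')); // int(1): mb_strlen() counts characters
var_dump(bin2hex($dash));            // string(6) "e28094": the three UTF-8 bytes
```

So a character offset of 14 in $str1 corresponds to a byte offset of 16, which is why a byte-based slice starts two bytes too early.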
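How can I slice a string by characters rather than by bytes? Slicing by bytes does not actually work for me, and all special characters should be handled correctly. If I remember correctly, emoji also come in different sizes!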
Best answer
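The dash you are using is a multibyte character. mb_stripos() returns a character offset, but substr() interprets that offset in bytes, so the three-byte dash moves the slice two bytes off. To perform a multibyte-safe substr() operation, you need to use mb_substr().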
<?php
$str1 = "— lorem lorem Alice lorem lorem lorem loremlorem";
//       ^ Why would a dash make a difference in the found index?
$str2 = "a lorem lorem Alice lorem lorem lorem loremlorem";
$str3 = "  lorem lorem Alice lorem lorem lorem loremlorem";
// The found index is always the same
$foundIndex = mb_stripos($str1, "Alice");
var_dump(mb_substr($str1, $foundIndex - 6, 24), $foundIndex);
$foundIndex = mb_stripos($str2, "Alice");
var_dump(mb_substr($str2, $foundIndex - 6, 24), $foundIndex);
$foundIndex = mb_stripos($str3, "Alice");
var_dump(mb_substr($str3, $foundIndex - 6, 24), $foundIndex);
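If the same search-and-slice pattern is needed in several places, it could be wrapped in a small helper. The sketch below is only an illustration: the function name excerptAround and its parameters are hypothetical and not part of the original answer, and it assumes the strings are UTF-8 and the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not from the original answer): returns up to $length
// characters of $haystack, starting $before characters ahead of the first
// case-insensitive match of $needle. Returns null when there is no match.
function excerptAround(string $haystack, string $needle, int $before, int $length): ?string
{
    // Character offset of the match (assumes UTF-8 input).
    $index = mb_stripos($haystack, $needle, 0, 'UTF-8');
    if ($index === false) {
        return null;
    }
    // Clamp at 0: a negative start would make mb_substr() count from the end.
    $start = max(0, $index - $before);
    // Slice by characters, using the same unit that mb_stripos() returned.
    return mb_substr($haystack, $start, $length, 'UTF-8');
}

var_dump(excerptAround("— lorem lorem Alice lorem lorem lorem loremlorem", "Alice", 6, 24));
var_dump(excerptAround("a lorem lorem Alice lorem lorem lorem loremlorem", "Alice", 6, 24));
```

Both calls should return "lorem Alice lorem lorem ", because the offset and the slice are now measured in the same unit. Note that mb_substr() counts code points, so an emoji built from several code points may still be split; grapheme_substr() from the intl extension works on grapheme clusters instead, assuming that extension is installed.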
Regarding "php - substr behaves strangely with a dash in the string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68740589/