I need to rewrite this JavaScript regular expression in PHP for use with preg_replace:
var PATTERN = /([\ud800-\udbff])([\udc00-\udfff])/g;
If I use:
$strText = preg_replace("/([\ud800-\udbff])([\udc00-\udfff])/", "emoji", $strText);
I get:
Compilation failed: PCRE does not support \L, \l, \N{name}, \U, or \u at offset 3
Best answer
Try the following:
preg_replace("/([\x{d800}-\x{dbff}])([\x{dc00}-\x{dfff}])/u", "emoji", $strText);
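PCRE does not support the \uXXXX syntax, so use \x{XXXX} instead. You also need the u modifier (placed at the end of the regular expression) so that the pattern and the subject string are treated as UTF-8.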
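One caveat worth checking on your own installation: a UTF-8 string contains no surrogate code units, and to my knowledge recent PCRE versions reject \x{d800}-\x{dfff} under the u modifier with a "disallowed Unicode code point" error. If the goal is the same as in the JavaScript pattern, that is, matching characters outside the Basic Multilingual Plane, a minimal sketch would match the code points U+10000 to U+10FFFF directly. The helper name replaceAstral is hypothetical and not part of the original answer; the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original answer).
// Replaces every code point outside the Basic Multilingual Plane, which is
// what a JavaScript surrogate pair represents, with a fixed replacement.
function replaceAstral(string $text, string $replacement = 'emoji'): ?string
{
    // \x{XXXX} is the PCRE spelling of a code point; /u turns on UTF-8 mode.
    // Single quotes keep PHP from interpreting the backslashes itself.
    return preg_replace('/[\x{10000}-\x{10FFFF}]/u', $replacement, $text);
}

// "\u{1F600}" is a PHP 7.0+ string escape for U+1F600 (a smiley emoji).
$sample = "Hi \u{1F600}!";
echo replaceAstral($sample), PHP_EOL; // prints "Hi emoji!"
```

Each supplementary character is one code point in UTF-8 mode, so no capture groups are needed; a JavaScript surrogate pair corresponds to a single match here.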
Syntax information from http://www.regular-expressions.info/unicode.html:
Perl and PCRE do not support the \uFFFF syntax. They use \x{FFFF} instead.
Information on the u modifier from http://php.net/manual/en/reference.pcre.pattern.modifiers.php:
u (PCRE_UTF8) This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern and the subject is checked since PHP 4.3.5. An invalid subject will cause the preg_* function to match nothing; an invalid pattern will trigger an error of level E_WARNING. Five and six octet UTF-8 sequences are regarded as invalid since PHP 5.3.4 (resp. PCRE 7.3 2007-08-28); formerly those have been regarded as valid UTF-8.
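The "invalid subject" behaviour described in that quote is easy to miss, because preg_replace fails silently rather than raising an error. A small sketch of what this looks like in practice, assuming a reasonably current PHP version; the variable names are only illustrative:

```php
<?php
// The byte 0xFF can never occur in well-formed UTF-8, so this subject is invalid.
$invalid = "abc\xFF";

// With /u, PCRE validates the subject before matching anything.
$result = preg_replace('/b/u', 'X', $invalid);

// Capture the error code right away; later preg_* calls would reset it.
$error = preg_last_error();

var_dump($result);                        // NULL
var_dump($error === PREG_BAD_UTF8_ERROR); // bool(true)
```

If input may arrive in another encoding, checking for a null result (or converting the input to UTF-8 first) avoids silently losing the whole string.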
Regarding javascript - writing a JavaScript UTF regular expression in PHP, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26389545/