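I need to replace illegal characters with an underscore (_). For example, if the user supplies the text "imageЙ ййé.png", the characters Й йй must be replaced by _ __, so the overall output must be image_ __é.png. French characters must not be replaced. I have tried the code below; please help me get this output.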
<?php
$allowed_char_array=array("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","à","á","â","ã","ä","å","æ","ç","è","é","ê","ë","ì","í","î","ï","ñ","ò","ó","ô","õ","ö","ð","ø","œ","š","Þ","ù","ú","û","ü","ý","ÿ","ž","0","1","2","3","4","5","6","7","8","9"," ","(",")","-","_",".","@","#","$","%","*","¢","ß","¥","£","™","©","®","ª","×","÷","±","+","-","²","³","¼","½","¾","µ","¿","¶","·","¸","º","°","¯","§","…","¤","¦","≠","¬","ˆ","¨","‰");
$word = 'imageЙ ййé.png';
$file_name = url_rewrite(trim($word));
$file_name2 = strtolower($file_name);
$split = str_split($file_name2);
if(is_array($split) && is_array($allowed_char_array)){
$result=array_diff($split,$allowed_char_array);
echo '<pre>';
print_r($split);
echo '<pre>';
print_r($allowed_char_array);
echo '<pre>';
print_r($result);
}
function url_rewrite($chaine) {
// Format the string
// Replace characters so that accents are removed
$accents = array('é','à','è','À','É','È');
$sans = array('é','à','è','À','É','È');
$chaine = str_replace($accents, $sans, $chaine);
return $chaine;
}
?>
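Best answer
I would build a regular expression (a character class, to be precise) from the whitelisted characters, and then replace any character that matches the negation of that class.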
$allowed_char_array = array("a","b","c","d","e"); // and others
$chars = implode("", $allowed_char_array);
$regex = "/[^" . $chars . "]/u";
$input = "imageЙ ййé.png";
echo $regex . "\n";
$output = preg_replace($regex, "_", $input);
echo $input . "\n" . $output;
imageЙ ййé.png
image_ __é.png
If the above is unclear, here is what the preg_replace call actually looks like:
preg_replace("/[^abcdefghijklmnopqrstuv]/u", "_", $input);
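That is, any character that is not on the whitelist is replaced with an underscore. I did not bother to list the entire character class, because you already have it in your source code.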
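One caveat when building the class from the full whitelist: it contains characters such as "-", "(", ")" and "+" that are special in a regex, and an unescaped "-" between two characters forms a range (for example ")-_" would also admit uppercase letters). The sketch below is one way to handle that, assuming each whitelist entry is passed through preg_quote first. The function name replace_illegal_chars and the shortened whitelist are hypothetical, chosen only for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: replaces every character that is not in $allowed
// with an underscore. Assumes $input and the whitelist are valid UTF-8.
function replace_illegal_chars(string $input, array $allowed): string
{
    // Escape each entry so "-", "]", "^" and the "/" delimiter are taken
    // literally inside the character class instead of forming ranges.
    $quoted = array_map(function ($c) {
        return preg_quote($c, '/');
    }, $allowed);

    $regex = '/[^' . implode('', $quoted) . ']/u';
    $output = preg_replace($regex, '_', $input);
    if ($output === null) {
        // preg_replace returns null on failure, e.g. malformed UTF-8 input.
        throw new RuntimeException('preg_replace failed');
    }
    return $output;
}

// Shortened whitelist for illustration only; use the full array in practice.
$allowed = array_merge(
    range('a', 'z'),
    str_split('0123456789'),
    array('é', 'è', 'à', ' ', '.', '-', '_', '(', ')')
);

echo replace_illegal_chars('imageЙ ййé.png', $allowed), "\n"; // image_ __é.png
```

Escaping per entry keeps the whitelist array as the single source of truth, so adding a character later cannot silently change the meaning of the class.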
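Note that the /u flag on the regular expression is essential here, because your input is a UTF-8 string. A UTF-8 character can consist of several bytes, and running preg_replace on it without /u can produce unexpected results.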
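To make the effect of /u concrete, here is a small sketch comparing the same pattern with and without the flag. The sample string and variable names are made up for illustration; "Й" occupies two bytes in UTF-8, which is what the byte-wise match trips over.

```php
<?php
$input = 'aЙ.png'; // "Й" is encoded as two bytes in UTF-8

// Without /u the pattern is matched byte by byte, so each of the two
// bytes of "Й" is replaced separately.
$bytewise = preg_replace('/[^a-z.]/', '_', $input);

// With /u the pattern is matched per code point, so "Й" is one character.
$unicode = preg_replace('/[^a-z.]/u', '_', $input);

echo $bytewise, "\n"; // a__.png
echo $unicode, "\n";  // a_.png
```

The extra underscore in the first result is the kind of unexpected output the flag prevents.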
Regarding "php - replacing illegal characters in text with underscores in PHP", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53760053/