PHP regex to match possible accented characters

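I have found many questions about this, but none of them helped with my specific problem. The situation: I want to search for a string such as "blablebli" and match every possible accented variant of it ("blablebli", "blábleblí", "blâblèbli", etc.) in a text.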

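I have already built a workaround for the opposite case (finding the unaccented form of a word that I typed with possible accents), but I can't figure out how to implement what I actually want.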

Here is my working code (only the relevant part; it sits inside a foreach, so only a single word search is shown):

$word="something";
$word = preg_quote(trim($word), '/'); // Escape regex metacharacters, including the "/" delimiter
$word2 = $this->removeAccents($word); // Removed all accents
if(!empty($word)) {
    $sentence = "/(".$word.")|(".$word2.")/ui"; // Now I'm checking with and without accents.
    if (preg_match($sentence, $content)){
        echo "found";
    }
}

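And here is my removeAccents() function (I am not sure that the preg_replace() covers every possible accent mark; it has worked so far. I would appreciate it if someone could check whether I am missing anything):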

function removeAccents($string)
{
    return preg_replace('/[\`\~\']/', '', iconv('UTF-8', 'ASCII//TRANSLIT', $string));
}

Things I want to avoid:

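I know I could go through my $word and replace every a with [aàáãâä], and do the same for the other letters, but I don't know... it seems like overkill.
Of course, I could also test $content with its accents stripped by my own removeAccents() function in the if statement, for example: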
```
if (preg_match($sentence, $content) || preg_match($sentence, removeAccents($content)))
```

But my problem with the second option is that I want to highlight the matched word afterwards, so I can't change my $content.

Is there any way to improve my preg_match() so that it includes possible accented characters? Or should I go with the first option above?
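
For reference, the first option (replacing each letter with a character class) can be sketched roughly as follows. This is only an illustration: accentInsensitivePattern() is a hypothetical helper name, the variant map is deliberately incomplete, and the sketch assumes the search word is typed without accents and the text uses precomposed (NFC) characters.

```php
<?php

// Hypothetical helper: expand each base letter into a class of its accented variants.
function accentInsensitivePattern(string $word): string
{
    // Illustrative, incomplete map; extend it for the languages you need.
    $variants = [
        'a' => 'aàáâãä',
        'e' => 'eèéêë',
        'i' => 'iìíîï',
        'o' => 'oòóôõö',
        'u' => 'uùúûü',
        'c' => 'cç',
        'n' => 'nñ',
    ];

    $pattern = '';
    // Split into UTF-8 characters so multibyte input is not cut in the middle.
    foreach (preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $key = strtolower($char);
        $pattern .= isset($variants[$key])
            ? '[' . $variants[$key] . ']'
            : preg_quote($char, '/'); // Anything not in the map is matched literally.
    }

    return '/' . $pattern . '/ui';
}

$content = 'Some text with blábleblí in it.';

// $content itself is untouched; only the matched span is wrapped.
$highlighted = preg_replace(accentInsensitivePattern('blablebli'), '<b>$0</b>', $content);
echo $highlighted, "\n";
```

Because the pattern runs against the original $content, the match can be highlighted in place. The cost is the hand-maintained map, which is the "overkill" concern mentioned above.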

Best answer

I would decompose the string, which makes it easier to remove the offending characters, roughly like this:

<?php

// Convert unicode input to NFKD form.
$str = Normalizer::normalize("blábleblí", Normalizer::FORM_KD);

// Remove all combining characters (https://en.wikipedia.org/wiki/Combining_character).
var_dump(preg_replace('/[\x{0300}-\x{036f}]/u', "", $str));
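
The snippet above strips the accents, which on its own does not cover the highlighting requirement from the question. One possible way to extend the same idea, offered as a sketch and not as the answer author's method, is to decompose both the word and the text, allow optional combining marks after every character of the pattern, and recompose the result afterwards. The helper name decomposedPattern() is hypothetical, the sketch uses canonical decomposition (FORM_D) instead of FORM_KD so that the text can be recomposed without compatibility changes, and it uses \p{Mn} in place of the explicit U+0300 to U+036F range. It assumes the intl extension is available for the Normalizer class.

```php
<?php

// Hypothetical helper: build a pattern that tolerates combining marks after every character.
function decomposedPattern(string $word): string
{
    // Decompose, then drop combining marks so the search word is accent-free.
    $base = preg_replace('/\p{Mn}+/u', '', Normalizer::normalize($word, Normalizer::FORM_D));

    $parts = [];
    foreach (preg_split('//u', $base, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        // Each base character may be followed by any number of combining marks.
        $parts[] = preg_quote($char, '/') . '\p{Mn}*';
    }

    return '/' . implode('', $parts) . '/ui';
}

$content = 'Some text with blâblèbli in it.';

// Match against the decomposed text, then recompose so the accents survive in the output.
$decomposed  = Normalizer::normalize($content, Normalizer::FORM_D);
$highlighted = Normalizer::normalize(
    preg_replace(decomposedPattern('blablebli'), '<b>$0</b>', $decomposed),
    Normalizer::FORM_C
);
echo $highlighted, "\n";
```

With this approach no per-letter map is needed, and the search word may itself be typed with or without accents. The output is the NFC form of the text, so it can differ byte-for-byte from $content if the original was not already in NFC.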

Source: a similar question about a PHP regex to match possible accented characters on Stack Overflow: https://stackoverflow.com/questions/31627143/
