I'm using PHP preg_match_all, and this is as far as I've been able to get:
[A-Za-z+\W]+\s[\d]
The only problem is that I need \W to exclude the double quote (").
So I tried:
[A-Za-z+[^\dA-Za-z"]\s?]+\s[\d]
[A-Za-z+]\s?+[^A-Za-z\d"]?\s[\d]
Among other attempts, these simply fail, and I really don't understand why.
Edit:
Here is the full regex:
([A-Z][a-z]+\s){1,5}\s?[^a-zA-Z\d\s:,.\'\"]\s?
[A-Za-z+\W]+\s[\d]{1,2}\s[A-Z][a-z]+\s[\d]{4}
I split it across two lines; the second line starts with the part I posted above.
The patterns I'm trying to match:
India – Adulterated Tea Powder Seized 18 April 2011
India – Importer of Haldiram’s Petha Sweet Cubes Issuing Voluntary Recall 26 April 2011
India – Undeclared Gluten Found in Sweets by Canadian Authorities 27 April 2011
India – Adulteration Found in Edible Oils 28 April 2011
India – Viral Disease Affects Chili Crop in Goa 28 April 2011
NOT ----> Chili – India: Goa”. 8 April 2011
Ivory Coast – Potential Cocoa Quality Decline despite Sufficient Surplus 11 April 2011
Japan – Sanuki Kanzume Co. and Failure to Comply with FDA Standards 27 April 2011
Madagascar – Toxic Sardines 14 April 2011
Madagascar – Update: Toxic Sardines 26 April 2011
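Best answer
The pattern you showed matches all letters and all non-word characters. The only things it leaves out are digits (and the underscore, which \W does not cover), and you also don't want to match double quotes.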
[^\d\"_]+\s\d
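A minimal sketch of how that negated class behaves, for illustration only. It assumes straight ASCII double quotes (a curly quote such as ” is a different character and would not be excluded), and it anchors the pattern with ^, which is an addition here and not part of the pattern above. The function name startsWithQuoteFreeText and the sample strings are made up. It may also help explain the failed attempts: brackets do not nest inside a character class, so a [^...] written inside [...] just adds a literal [ and ^ to the outer class.

```php
<?php
// Hypothetical helper: true when the line starts with a run of characters
// that are not digits, double quotes, or underscores, followed by
// whitespace and then a digit.
function startsWithQuoteFreeText(string $line): bool
{
    // [^\d"_]+  one or more characters outside the excluded set
    // \s\d      a whitespace character, then the first digit of the date
    return preg_match('/^[^\d"_]+\s\d/', $line) === 1;
}

var_dump(startsWithQuoteFreeText('Madagascar - Toxic Sardines 14 April 2011'));
var_dump(startsWithQuoteFreeText('Chili - India: Goa". 8 April 2011'));
```

Without the ^ anchor, the second line would still produce a match starting after the quote (on ". 8"), so some form of anchoring is needed to reject the whole line.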
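Edit:
I may be wrong, but from the sample input it looks like you're just trying to match every line that doesn't contain a double quote. If so, something like this is much easier, and I've even grouped the date separately from the rest of the string. If you don't need the string/date grouped, just remove all the parentheses.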
<?php
error_reporting(E_ALL);
$str = " India - Adulterated Tea Powder Seized 18 April 2011
India - Importer of Haldiram’s Petha Sweet Cubes Issuing Voluntary Recall 26 April 2011
India - Undeclared Gluten Found in Sweets by Canadian Authorities 27 April 2011
India - Adulteration Found in Edible Oils 28 April 2011
India - Viral Disease Affects Chili Crop in Goa 28 April 2011
Chili - India: Goa\". 8 April 2011
Ivory Coast - Potential Cocoa Quality Decline despite Sufficient Surplus 11 April 2011
Japan - Sanuki Kanzume Co. and Failure to Comply with FDA Standards 27 April 2011
Madagascar - Toxic Sardines 14 April 2011
Madagascar - Update: Toxic Sardines 26 April 2011";
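// Group 1: lazily match everything before the date, as long as it contains no double quote.
// Group 2: the trailing date (one- or two-digit day, month name, four-digit year).
// Flags: i = case-insensitive, m = ^ and $ match at the start and end of each line.
preg_match_all("/^([^\"]+?)(\d?\d\s[a-z]+\s\d{4})$/im", $str, $m);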
echo '<pre>'.print_r($m, true).'</pre>';
?>
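A hedged variation on the snippet above, not the answerer's code. Because [^"] also matches line breaks, a line with no trailing date could in principle let the first group run on into the next line. Adding \r\n to the negated class keeps each match on a single line. The function name parseHeadlines and the group names title and date are illustrative, and the sketch assumes \n line endings and straight ASCII double quotes.

```php
<?php
// Hypothetical helper: split each quote-free line into a title and a trailing date.
function parseHeadlines(string $text): array
{
    // \h*              optional leading horizontal whitespace
    // [^"\r\n]+?       lazily consume anything except a double quote or a line break
    // (?<date>...)     day, month word, four-digit year at the end of the line
    $pattern = '/^\h*(?<title>[^"\r\n]+?)\h+(?<date>\d{1,2}\h[a-z]+\h\d{4})\h*$/im';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $m) {
        $rows[] = ['title' => $m['title'], 'date' => $m['date']];
    }
    return $rows;
}

$sample = "India - Adulteration Found in Edible Oils 28 April 2011\n"
        . "Chili - India: Goa\". 8 April 2011\n"
        . "Madagascar - Toxic Sardines 14 April 2011";

// The line containing the double quote is skipped; the other two are returned.
print_r(parseHeadlines($sample));
```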
Regarding "php - How to exclude symbols inside [ ] using RegEx", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/6060280/