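I have a script that checks a large body of text against an array of keywords and returns a count for each one. I now need it to stop counting a keyword when that occurrence is already part of another, longer keyword. For example: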
apples are delicious, especially red apples.
The current counts look like this:
apples - 2 (counts "apples" twice)
red apples - 1
What I want is for each keyword to be counted independently, like this:
apples - 1
red apples - 1
My basic script for checking keywords:
var content = ed.getContent().toLowerCase();
var words = ["apples", "red apples"];
var count = [];
for (var i = 0, len = words.length; i < len; i++) {
  var word = words[i].toLowerCase();
  // Only count keywords that occur at least once in the lowercased content.
  if (content.indexOf(word) > -1) {
    var regex = new RegExp(word, "g");
    count[i] = (content.match(regex) || []).length;
    console.log(words[i] + " " + count[i]);
  }
}
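I'm stuck! Any help or a push in the right direction is always greatly appreciated!
Best answer
There are many ways to do this. I think the simplest is to sort the keywords from longest to shortest and then strip each matched keyword out of the content, so that a shorter keyword cannot be counted again inside a longer one.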
https://jsfiddle.net/a0h7xbfu/8/
var content = "red apples apples";
var words = ["apples", "red apples"];
// Sort longest first so "red apples" is counted before "apples".
words.sort(function(a, b) {
  return b.length - a.length;
});
words.forEach(function(word) {
  var regex = new RegExp(word, "g");
  var match = content.match(regex);
  if (match) {
    console.log(word + ": " + match.length);
    // Remove the matches so shorter keywords cannot count them again.
    content = content.replace(regex, '');
  }
});
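The same idea can be wrapped in a reusable function. The following is only a sketch under a few assumptions that are not part of the original answer: the names `countKeywords` and `escapeRegExp` are hypothetical, keywords are assumed to start and end with word characters (so that `\b` word boundaries behave sensibly), and matches are replaced with a NUL placeholder rather than an empty string so that the text on either side of a removed match cannot join up into a new match.

```javascript
// Hypothetical helper: escape regex metacharacters so a keyword such as
// "a.b" is matched literally instead of being read as a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical function: count each keyword independently, so an
// occurrence inside a longer keyword is not counted for the shorter one.
// Assumes keywords begin and end with word characters.
function countKeywords(content, keywords) {
  var remaining = content.toLowerCase();
  var counts = {};

  // Copy before sorting so the caller's array is left untouched,
  // then process the longest keywords first.
  var sorted = keywords.slice().sort(function (a, b) {
    return b.length - a.length;
  });

  sorted.forEach(function (keyword) {
    var pattern = new RegExp(
      "\\b" + escapeRegExp(keyword.toLowerCase()) + "\\b",
      "g"
    );
    counts[keyword] = (remaining.match(pattern) || []).length;
    // Blank out the matches with a placeholder character that is not
    // expected to appear in any keyword.
    remaining = remaining.replace(pattern, "\u0000");
  });

  return counts;
}

// Usage with the example from the question.
var result = countKeywords(
  "Apples are delicious, especially red apples.",
  ["apples", "red apples"]
);
console.log(result); // "apples" and "red apples" are each counted once
```

Unlike the snippet above, this version reports a count of 0 for keywords that never occur, and the word boundaries keep "apples" from matching inside a longer word such as "pineapples". Whether that stricter matching is wanted depends on the use case.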
This JavaScript keyword checker question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/48857896/