node.js - Regex to ignore a word before a character set

I'm trying to match the following string with a regular expression:

286,879 in Home & Kitchen (See Top 100 in Home & Kitchen)  
339 in Cardboard Cutouts    
2,945 in Jigsaws (Toys & Games)

This is my code/regex:

            const matches = text.matchAll(/(?<!Top )([\d,|]+) in[\s\n ]([\w&'\s]+)/g);
            for(const match of matches){
                const rank = parseInt(match[1].replace(/[^\d]/g, ''));
                const category = match[2].trim()
                console.log(`${category} = ${rank}`)
            }

However, the only parts it should match are: 286,879 in Home & Kitchen, 339 in Cardboard Cutouts, and 2,945 in Jigsaws (Toys & Games)

The expected output should be:

Home & Kitchen = 286879

Cardboard Cutouts = 339

Jigsaws = 2945

How can I adjust the regex so that it ignores the "100 in Home & Kitchen" string that follows "Top"?

Thanks

Best answer

You can use 2 capture groups:

(?<!Top\s+)\b(\d+(?:,\d+)?)\s+in\s+([^()\n]*[^\s()])

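Explanation

(?<!Top\s+) Negative lookbehind: asserts that the current position is not immediately preceded by Top followed by 1 or more whitespace characters
\b A word boundary, to prevent a partial-word match
(\d+(?:,\d+)?) Capture group 1: matches 1 or more digits, optionally followed by , and 1 or more digits
\s+in\s+ Matches in between 1 or more whitespace characters on each side
( Capture group 2
- [^()\n]*[^\s()] Matches optional characters other than ( ) or a newline, followed by a single character that is not whitespace, ( or )
) Close group 2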
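The effect of the lookbehind and the word boundary described above can be checked in isolation. The following is a minimal sketch using the answer's pattern unchanged; the three input strings are illustrative and were made up for this check, they are not taken from the question.

```javascript
// Pattern from the answer, unchanged.
const regex = /(?<!Top\s+)\b(\d+(?:,\d+)?)\s+in\s+([^()\n]*[^\s()])/;

// Illustrative inputs (assumptions, not from the original question).
const preceded = "See Top 100 in Home & Kitchen"; // number directly follows "Top "
const plain = "See 100 in Home & Kitchen";        // same text without "Top "
const glued = "abc100 in Foo";                    // digits glued to a word

// The negative lookbehind rejects the number that follows "Top ".
const lookbehindBlocks = regex.exec(preceded);
console.log(lookbehindBlocks); // null

// Without "Top " in front, both groups are captured.
const plainMatch = regex.exec(plain);
console.log(plainMatch[1], "|", plainMatch[2]); // 100 | Home & Kitchen

// \b prevents a match that would start in the middle of a word.
console.log(regex.test(glued)); // false
```

Note that `(?:,\d+)?` accepts at most one comma group. If the input may contain values such as 1,234,567, changing the `?` to `*` would be one way to accept several groups; the comma removal would then need to replace every comma, not only the first.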

Regex demo

const regex = /(?<!Top\s+)\b(\d+(?:,\d+)?)\s+in\s+([^()\n]*[^\s()])/;

// Test each sample line from the question separately.
[
  "286,879 in Home & Kitchen (See Top 100 in Home & Kitchen)",
  "339 in Cardboard Cutouts",
  "2,945 in Jigsaws (Toys & Games)"
].forEach(s => {
  const m = s.match(regex);
  if (m) {
    // m[1] is the rank (comma removed), m[2] is the category.
    console.log(`${m[2]} = ${m[1].replace(",", "")}`);
  }
});

Note that \s can also match a newline.
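Because the question's code runs matchAll over the whole text rather than line by line, the same pattern can also be applied with the g flag. This is a sketch under that assumption; `parseRanks` is a hypothetical helper name introduced here for illustration, and the sample text is the three lines from the question joined with newlines.

```javascript
// Hypothetical helper: collect { category, rank } pairs from a multi-line text.
function parseRanks(text) {
  // Same pattern as in the answer, with the g flag added (matchAll requires it).
  const regex = /(?<!Top\s+)\b(\d+(?:,\d+)?)\s+in\s+([^()\n]*[^\s()])/g;
  const results = [];
  for (const m of text.matchAll(regex)) {
    results.push({
      category: m[2],
      // Strip commas before converting the rank to a number.
      rank: parseInt(m[1].replace(/,/g, ""), 10),
    });
  }
  return results;
}

// The three sample lines from the question, joined into one string.
const text = [
  "286,879 in Home & Kitchen (See Top 100 in Home & Kitchen)",
  "339 in Cardboard Cutouts",
  "2,945 in Jigsaws (Toys & Games)",
].join("\n");

for (const { category, rank } of parseRanks(text)) {
  console.log(`${category} = ${rank}`);
}
// Home & Kitchen = 286879
// Cardboard Cutouts = 339
// Jigsaws = 2945
```

The category group excludes \n, so a category never runs on into the next line, while the \s+ parts of the pattern could still span a line break if the input were wrapped there.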

Regarding "node.js - Regex to ignore a word before a character set", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/72075388/
