regex - How to simplify this regex for use in Google Analytics

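Background: Google Analytics

Need: a filter that takes a given URI or URN (yes, URN) and returns it with the query string excluded.

As you can imagine, there are many variations out there, which I hope I have fully covered in the list below: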

https://sub.domain.com/path/folder/article?l=en >> expected     https://sub.domain.com/path/folder/article
https://sub.domain.com/path/folder/103#3173l=en >> expected     https://sub.domain.com/path/folder/103   
https://sub.domain.com/path/folder/103?#3173l=en >> expected     https://sub.domain.com/path/folder/103
https://sub.domain.com/path/folder/103#?3173l=en
sub.domain.tld  >> expected sub.domain.tld
sub.domain.tld/  >> expected sub.domain.tld
sub.domain.tld?param=value  >> expected sub.domain.tld
sub.domain.tld/?param=value  >> expected sub.domain.tld
sub.domain.tld?param=value#id  >> expected sub.domain.tld
sub.domain.tld/?param=value#id  >> expected sub.domain.tld
sub.domain.tld/folder  >> expected sub.domain.tld/folder
sub.domain.tld/folder/  >> expected sub.domain.tld/folder
sub.domain.tld/folder?param=value  >> expected   sub.domain.tld/folder
sub.domain.tld/folder/?param=value  >> expected  sub.domain.tld/folder
sub.domain.tld/1/folder  >> expected      sub.domain.tld/1/folder
sub.domain.tld/1/folder/  >> expected     sub.domain.tld/1/folder
sub.domain.tld/1/folder?param=value
sub.domain.tld/1/folder/?param=value
sub.domain.tld#id
sub.domain.tld/#id
sub.domain.tld/1#id
sub.domain.tld/1/#id

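The challenge I can't solve is getting a regex that captures the wanted part in a subgroup that is always the same one.

If you want to play with it, I have saved a couple of tests: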
- https://regex101.com/r/trZl06/1/
- https://regex101.com/r/SetgFn/2

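The latter is quite satisfying in catching my cases, but as soon as a capture group is added in front of the existing match condition, that group captures unwanted text.

I also tried something like ((.*)(?:[\/]\?.*)|(.*)(?:\?.*))|((.*)\/$|(.*)) but the resulting subgroup is different each time, which makes referencing it in the view filter rather confusing.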
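To make the problem with that attempt concrete, here is a minimal sketch that runs it through Python's `re` module and reports which inner group ends up holding the cleaned value. The helper name `capturing_group` is made up for this illustration, and Google Analytics uses its own regex engine, so treat this only as a demonstration of how the group number shifts between inputs.

```python
import re

# The attempted pattern, copied verbatim from the question.
ATTEMPT = re.compile(r"((.*)(?:[\/]\?.*)|(.*)(?:\?.*))|((.*)\/$|(.*))")


def capturing_group(url):
    """Return (group number, captured text) for the inner group that matched.

    Hypothetical helper for illustration only. Groups 2, 3, 5 and 6 are the
    inner (.*) groups; groups 1 and 4 are the outer wrappers.
    """
    match = ATTEMPT.match(url)
    for number in (2, 3, 5, 6):
        if match.group(number) is not None:
            return number, match.group(number)
    return None


for url in [
    "sub.domain.tld/?param=value",
    "sub.domain.tld?param=value",
    "sub.domain.tld/folder/",
    "sub.domain.tld/folder",
]:
    print(url, "->", capturing_group(url))
```

Each of the four inputs lands in a different group (2, 3, 5 and 6 respectively), so a filter cannot refer to one fixed group number.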

Is there anything you can think of?

Best answer

You can use

^([^#?]*?)([/?#]?\?.*|[/#]?#.*)?(/?)$

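See the regex demo.

Details

^ - start of string

([^#?]*?) - Group 1: zero or more characters other than # and ?, as few as possible

([/?#]?\?.*|[/#]?#.*)? - optional Group 2: either of the two:

[/?#]?\?.* - an optional /, ? or #, followed by a ? char and then the rest of the string

| - or

[/#]?#.* - an optional / or #, followed by a # char and then the rest of the string

(/?) - Group 3: an optional /

$ - end of string.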
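As a quick sanity check outside Google Analytics, the pattern can be tried against a few of the question's sample inputs. This is only a sketch using Python's `re` module, which is an assumption for illustration (the filter itself runs in Google Analytics' own regex engine), and `strip_query_and_fragment` is a hypothetical helper name.

```python
import re

# Pattern from the answer; Group 1 always holds the cleaned value.
PATTERN = re.compile(r"^([^#?]*?)([/?#]?\?.*|[/#]?#.*)?(/?)$")


def strip_query_and_fragment(url):
    """Return Group 1 of the pattern (hypothetical helper for illustration)."""
    match = PATTERN.match(url)
    # Fall back to the input unchanged if the pattern does not match,
    # for example when the input spans several lines.
    return match.group(1) if match else url


samples = [
    "https://sub.domain.com/path/folder/article?l=en",
    "https://sub.domain.com/path/folder/103#?3173l=en",
    "sub.domain.tld/",
    "sub.domain.tld/?param=value#id",
    "sub.domain.tld/1/folder/",
    "sub.domain.tld/1/#id",
]

for sample in samples:
    print(sample, "->", strip_query_and_fragment(sample))
```

Because the wanted part is always in Group 1, the same group reference works for every variant, which is what the question asked for.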

Regarding "regex - How to simplify this regex for use in Google Analytics", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53420990/
