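I wrote my own CSS minifier for fun and profit (not much profit), and it works well. I'm now trying to streamline it, because I'm effectively filtering the file more than 10 times. That's no big deal for small files, but the larger the file, the bigger the performance hit.
Is there a more elegant way to filter my input file? I assume regex offers a way, but I'm no regex wizard...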
$a = (gc($path + $file) -Raw)
$a = $a -replace "\s{2,100}(?<!\S)", ""
$a = $a -replace " {", "{"
$a = $a -replace "} ", "}"
$a = $a -replace " \(", "\("
$a = $a -replace "\) ", "\)"
$a = $a -replace " \[", "\["
$a = $a -replace "\] ", "\]"
$a = $a -replace ": ", ":"
$a = $a -replace "; ", ";"
$a = $a -replace ", ", ","
$a = $a -replace "\n", ""
$a = $a -replace "\t", ""
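To save you a little trouble: the first -replace strips any run of consecutive whitespace that is 2 to 100 characters long. The remaining replace statements clean up single spaces in specific cases.
How can I combine this so I'm not filtering the file 12 times?
Best answer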
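A negative lookbehind such as `(?<!\S)` is used in this context: `(?<!prefix)thing`, which matches a thing that does not have the prefix to its left. When you put it at the end of the regex, with nothing after it, I think it does nothing: after `\s{2,100}` has matched, the character to the left is always whitespace, so the assertion always succeeds. You may have meant to put it on the left, or you may have meant a negative lookahead; I won't try to guess, I've just removed it for this answer.

You aren't using character classes.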
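`abc` looks for the text abc, but put the characters in square brackets and `[abc]` looks for any one of the characters a, b, c.
- Using that, you can combine the last two lines into one: `[\n\t]` replaces either a newline or a tab.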
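Both points so far can be checked in a few lines. This is only a sketch: `$sample` is a made-up string, not taken from the question's CSS file, and the comparisons simply confirm that the shorter patterns behave the same as the originals on it.

```powershell
# Hypothetical sample input (assumption: not from the original post)
$sample = "a {`n`tcolor:   red;`n}"

# 1. The trailing lookbehind changes nothing: once \s{2,100} has matched,
#    the character to the left is whitespace, so (?<!\S) always succeeds.
$withLookbehind    = $sample -replace '\s{2,100}(?<!\S)', ''
$withoutLookbehind = $sample -replace '\s{2,100}', ''
$withLookbehind -eq $withoutLookbehind

# 2. One character class does the work of the two separate passes.
$twoPasses = ($sample -replace '\n', '') -replace '\t', ''
$onePass   = $sample -replace '[\n\t]', ''
$twoPasses -eq $onePass
```

Both comparisons should print True for this input.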
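You can use the regex logical OR `|` to combine the two separate (replace with nothing) rules into one match: `\s{2,100}|[\n\t]` matches the whitespace run, or a newline, or a tab. (You could probably use OR twice instead of the character class, fwiw.)

Use regex capture groups, which let you refer to whatever the regex matched without knowing in advance what that was.

For example, "space bracket -> bracket", "space colon -> colon" and "space comma -> comma" all follow the general pattern "space (thing) -> (thing)". The same goes for trailing spaces: "(thing) space -> (thing)".

Combine capture groups with character classes to merge the remaining lines into one.

For example: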
$a -replace " (:)", '$1' # capture the colon, replacement is not ':'
# it is "whatever was in the capture group"
$a -replace " ([:,])", '$1' # capture the colon, or comma. Replacement
# is "whatever was in the capture group"
# space colon -> colon, space comma -> comma
# make the space optional with \s{0,1} and put it at the start and end
\s{0,1}([:,])\s{0,1} #now it will match "space (thing)" or "(thing) space"
# Add in the rest of the characters, with appropriate \ escapes
# gained from [regex]::Escape('those chars here')
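As a quick sanity check of the optional-space pattern from the steps above, here is a minimal sketch on three made-up strings (the inputs are assumptions chosen for illustration, not from the question's file):

```powershell
# Pattern from the walkthrough: optional whitespace, a captured colon or
# comma, optional whitespace. The replacement keeps only the capture.
$pattern = '\s{0,1}([:,])\s{0,1}'

$both     = 'a : b , c' -replace $pattern, '$1'   # space on both sides
$leading  = 'a :b'      -replace $pattern, '$1'   # space before only
$trailing = 'a: b'      -replace $pattern, '$1'   # space after only

$both      # a:b,c
$leading   # a:b
$trailing  # a:b
```

Note that `\s` matches any whitespace character, not only a literal space, so this is slightly broader than the original single-space rules.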
# Your original:
$a = (gc D:\css\1.css -Raw)
$a = $a -replace "\s{2,100}(?<!\S)", ""
$a = $a -replace " {", "{"
$a = $a -replace "} ", "}"
$a = $a -replace " \(", "\("
$a = $a -replace "\) ", "\)"
$a = $a -replace " \[", "\["
$a = $a -replace "\] ", "\]"
$a = $a -replace ": ", ":"
$a = $a -replace "; ", ";"
$a = $a -replace ", ", ","
$a = $a -replace "\n", ""
$a = $a -replace "\t", ""
# My version:
$b = gc d:\css\1.css -Raw
$b = $b -replace "\s{2,100}|[\n\t]", ""
$b = $b -replace '\s{0,1}([])}{([:;,])\s{0,1}', '$1'
# Test that they both do the same thing on my random downloaded sample file:
$b -eq $a
# Yep.
Do it once more with another `|` to merge the two into one:
$c = gc d:\css\1.css -Raw
$c = $c -replace "\s{2,100}|[\n\t]|\s{0,1}([])}{([:;,])\s{0,1}", '$1'
$c -eq $a # also same output as your original.
NB: the whitespace, tab and newline alternatives capture nothing, so '$1' is empty for them,
which removes them.
And you can spend a lot of time building your own unreadable regex, which probably won't be noticeably faster in any real scenario. :)
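Whether the single pass is actually faster is easy to measure rather than guess. The following is a rough sketch using Measure-Command; `$css` is a synthetic string built in memory (an assumption, standing in for a large stylesheet), and the timings will vary by machine and input, so no result is claimed here.

```powershell
# Synthetic input (assumption): a small rule repeated many times
$css = "a {`n`tcolor : red ;`n}`n" * 20000

# Three separate passes, as in the two-line version plus the split OR
$multi = Measure-Command {
    $x = $css -replace '\s{2,100}', ''
    $x = $x -replace '[\n\t]', ''
    $x = $x -replace '\s{0,1}([])}{([:;,])\s{0,1}', '$1'
}

# Everything combined into a single pattern
$single = Measure-Command {
    $y = $css -replace '\s{2,100}|[\n\t]|\s{0,1}([])}{([:;,])\s{0,1}', '$1'
}

'{0:N1} ms (three passes) vs {1:N1} ms (one pass)' -f $multi.TotalMilliseconds, $single.TotalMilliseconds
```

Running it a few times on a real stylesheet gives a better picture than a single measurement.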
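NB. In the replacement '$1', the dollar is .NET regex engine syntax, not a PowerShell variable. If you use double quotes, PowerShell will string-interpolate the variable $1 and probably replace it with nothing.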
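The quoting difference is easy to see on a tiny example. This sketch assumes no variable named `$1` is set in the session and that strict mode is off; `$text` is a made-up input.

```powershell
$text = 'a : b'   # hypothetical input

# Single quotes: the literal text $1 reaches the regex engine,
# which substitutes the first capture group.
$single = $text -replace ' (:) ', '$1'

# Double quotes: PowerShell expands the (unset) variable $1 to an empty
# string before the regex engine ever sees the replacement.
$double = $text -replace ' (:) ', "$1"

# A backtick-escaped dollar inside double quotes behaves like single quotes.
$escaped = $text -replace ' (:) ', "`$1"

$single    # a:b
$double    # ab
$escaped   # a:b
```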
Source: the Stack Overflow question on combining replace calls with regex in PowerShell: https://stackoverflow.com/questions/40430834/