So I have a question about the following code:
import re

def OnChanMsg(self, nick, channel, message):
    if 'Username' in nick.GetNick():
        stripped = message.s.strip()  # strip leading and trailing whitespace
        # match IRC formatting control characters (bold, underline, etc.) and \x03 color codes
        regex = re.compile(r"\x1f|\x02|\x12|\x0f|\x16|\x03(?:\d{1,2}(?:,\d{1,2})?)?", re.UNICODE)
        ircstripped = regex.sub("", stripped)  # the message with those codes removed
        matches = re.findall(r'test\ for\ (.*)\: ->\ (.*)\ \((.*)\)\ -\ \((.*)\)\ - \((.*)\).*', ircstripped)
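So my question is as follows:
1) What the code does is relatively clear to me, except for the "(?:\d{1,2}(?:,\d{1,2})?)?" part. I just don't understand what it does or how it works. I did check the Google Developers codeschool video and the Python documentation, but given that my goal is to strip colors and other formatting from IRC messages, what exactly does this part do, in layman's terms if possible?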
I found this in the thread "How to strip color codes used by mIRC users?":
(?: ... ) says to forget about storing what was found in the parenthesis (as we don't need to backreference it), ? means to match 0 or 1 and {n,m} means to match n to m of the previous grouping. Finally, \d means to match [0-9].
But I didn't really get it =/
Best answer
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
    \d{1,2}                  digits (0-9) (between 1 and 2 times
                             (matching the most amount possible))
----------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
----------------------------------------------------------------------
      ,                        ','
----------------------------------------------------------------------
      \d{1,2}                  digits (0-9) (between 1 and 2 times
                               (matching the most amount possible))
----------------------------------------------------------------------
    )?                       end of grouping
----------------------------------------------------------------------
  )?                       end of grouping
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
So, in other words: optionally match 1-2 digits, which may in turn be followed by an optional group consisting of a comma and 1-2 more digits.
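To tie this back to the goal of stripping IRC colors, here is a minimal sketch of the color-code part of the regex applied to a made-up message. The name `COLOR_RE` and the sample string are assumptions for illustration only; in a color code, the digits after `\x03` are the foreground color and the optional `,digits` part is the background color.

```python
import re

# \x03 starts a color code; the optional group swallows "fg" or "fg,bg".
COLOR_RE = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")

# Hypothetical message: color 4 on background 1, "hello", a bare \x03, " world".
sample = "\x034,1hello\x03 world"
cleaned = COLOR_RE.sub("", sample)
print(repr(cleaned))

# At most two digits are consumed, so a third digit is treated as message text.
three_digits = COLOR_RE.sub("", "\x03123")
print(repr(three_digits))
```

Because the whole digit group is optional, a bare `\x03` with no digits after it is removed as well.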
The following would therefore match (assuming the whole line has to match):
12
1
20
10,2
22,3
12,0
14,20
But the following would not:
200
a,b
!123p9
1000,2000
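These two lists can be checked with a short sketch, assuming Python 3.4 or later, where `re.fullmatch` requires the whole string to match. The variable names are made up for illustration, and the pattern is the sub-pattern from the question without the leading `\x03`.

```python
import re

# The sub-pattern from the question, without the leading \x03.
pattern = re.compile(r"(?:\d{1,2}(?:,\d{1,2})?)?")

should_match = ["12", "1", "20", "10,2", "22,3", "12,0", "14,20"]
should_not_match = ["200", "a,b", "!123p9", "1000,2000"]

# fullmatch returns a match object only if the entire string fits the pattern.
matched = [text for text in should_match if pattern.fullmatch(text)]
rejected = [text for text in should_not_match if not pattern.fullmatch(text)]

print(matched == should_match)        # every "match" example is accepted
print(rejected == should_not_match)   # every "no match" example is refused

# The outer group is optional, so the empty string matches as well.
empty_ok = pattern.fullmatch("") is not None
```

Note that with `re.search` or `re.sub` instead of `fullmatch`, the pattern would still match part of strings like `200` (the first two digits), which is why the full-line assumption matters.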
This explanation of a Python regular expression built with re.compile comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/19696233/