python - PyParsing: different string lengths

Tags: python parsing firewall rules pyparsing

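I am writing a parser for a firewall configuration file. I am new to PyParsing and to Python in general.

The problem is how to parse lines that have more than 3 arguments: (xxxx,xxxx,xxxx) != (xxxx,xxxx,xxxx,xxxx). All the rules work and everything parses correctly as long as no line contains more than 3 strings. But as you can see, in firewall f1 one interface line has "nat" after the address field, and it is ignored no matter how I change the rules.

The tokens are printed with a parse action (def printTokens(s,loc,toks): # s=original string, loc=location, toks=matched tokens).

Please look at the 2 outputs below: one with the fourth argument ("nat") present and one with it removed. I need to parse everything, including "nat", and have the rules applied. Thanks in advance!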

from pyparsing import *

#===================================GRAMMAR==================================
zone = Literal("zone")    
zoneid = Word(alphanums)
host = Literal("host")
hostid = Word(alphanums)
interface = Literal("interface")
interfaceid = Word(alphanums)
firewall = Literal("firewall")
firewallid = Word(alphanums)
router = Literal("router")
routerid = Word(alphanums)

fstop = Literal(".")
comma = Suppress(",") #Converter for ignoring the results of a parsed expression.
slash = Literal("/")
ocbracket = Literal("{")
ccbracket = Literal("}")
sobracket = Literal("[")
scbracket = Literal("]")
hyphen = Literal("-")
underline = Literal("_") 
word = Word(alphas)


#===================================IP-TYPE=================================

ip=Combine(Word(nums)+            
        fstop+ Word(nums) + 
        fstop+ Word(nums) + 
        fstop + Word(nums))

subnet = Combine(slash +Word(nums))

address = ip + Optional(subnet)


#===================================RULES===================================

#adword = address + word

zoneRule = zone + zoneid + address
hostRule = host + hostid + ocbracket
interfaceRule = interface + interfaceid + address 
interfaceRule2 = interface + interfaceid + address + word
firewallRule = firewall + firewallid + ocbracket
routerRule = router + routerid + ocbracket

endRule = ccbracket


rule = zoneRule | hostRule | interfaceRule | interfaceRule2 | firewallRule | routerRule | endRule 
rules = OneOrMore(rule)

#===================================DATA=====================================
details = """zone zone1 10.1.0.0/24                   
         zone backbone 10.254.0.0/24
         zone zone 10.2.0.0/24
         host ha {
             interface iha 10.1.0.1
         }
         host hb {
            interface ihb 10.2.0.1
         }
         firewall f1 {
            interface ifla 10.1.0.254 
            interface iflback 10.254.0.101 nat
         }
         router r2 {
            interface ir2back 10.254.0.102
         }
         router r3 {
            interface ir3b 10.2.0.103
         }"""

#==================================METHODS==================================

def printTokens(s, loc, toks):   # s=orig string, loc=location, toks=matched tokens
    print(toks)

zoneRule.setParseAction(printTokens) 
hostRule.setParseAction(printTokens)
interfaceRule.setParseAction(printTokens)
interfaceRule2.setParseAction(printTokens) # expects 4 tokens, whereas interfaceRule expects 3
firewallRule.setParseAction(printTokens)
routerRule.setParseAction(printTokens)
endRule.setParseAction(printTokens)

rules.parseString(details)


#================================OUTPUT RESULT WITH NAT=================================
"""
['zone', 'zone1', '10.1.0.0', '/24']
['zone', 'backbone', '10.254.0.0', '/24']
['zone', 'zone', '10.2.0.0', '/24']
['host', 'ha', '{']
['interface', 'iha', '10.1.0.1']        
['}']
['host', 'hb', '{']
['interface', 'ihb', '10.2.0.1']
['}']
['firewall', 'f1', '{']
['interface', 'ifla', '10.1.0.254']
['interface', 'iflback', '10.254.0.101']"""
#================================OUTPUT RESULT WITHOUT NAT=================================
"""['zone', 'zone1', '10.1.0.0', '/24']
['zone', 'backbone', '10.254.0.0', '/24']
['zone', 'zone', '10.2.0.0', '/24']
['host', 'ha', '{']
['interface', 'iha', '10.1.0.1']
['}']
['host', 'hb', '{']
['interface', 'ihb', '10.2.0.1']
['}']
['firewall', 'f1', '{']
['interface', 'ifla', '10.1.0.254']
['interface', 'iflback', '10.254.0.101']
['}']
['router', 'r2', '{']
['interface', 'ir2back', '10.254.0.102']
['}']
['router', 'r3', '{']
['interface', 'ir3b', '10.2.0.103']
['}']"""

Best answer

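If you want to match any number of expressions separated by a specific delimiter, use PyParsing's delimitedList. By default it allows whitespace around the delimiters; add combine=True to disallow whitespace.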
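
As a minimal sketch of the two modes, the snippet below parses a comma-separated list of words. The `item` expression and the sample strings are made up for illustration and are not part of the firewall grammar; the snippet uses the same camelCase pyparsing names as the question.

```python
from pyparsing import Word, alphanums, delimitedList

# Hypothetical list element, for illustration only
item = Word(alphanums)

# Default: whitespace around the delimiter is skipped and the
# delimiter itself is dropped from the results
spaced = delimitedList(item)
spaced_result = spaced.parseString("aa, bb ,cc").asList()
print(spaced_result)   # ['aa', 'bb', 'cc']

# combine=True: items and delimiters are joined into a single token,
# and no whitespace is allowed between them
tight = delimitedList(item, combine=True)
tight_result = tight.parseString("aa,bb,cc").asList()
print(tight_result)    # ['aa,bb,cc']
```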

However, if you just want to allow an optional item in your grammar, simply wrap it in Optional. For your interface rule, you can replace:

interfaceRule = interface + interfaceid + address 
interfaceRule2 = interface + interfaceid + address + word

with:

interfaceRule = interface + interfaceid + address + Optional(word)
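
A self-contained sketch of the combined rule, trimmed down from the grammar in the question, might look like this. It parses single lines only; the variable names `with_flag` and `without_flag` are introduced here for illustration.

```python
from pyparsing import Combine, Literal, Optional, Word, alphanums, alphas, nums

# Address pieces, as in the question's grammar
ip = Combine(Word(nums) + "." + Word(nums) + "." + Word(nums) + "." + Word(nums))
subnet = Combine(Literal("/") + Word(nums))
address = ip + Optional(subnet)

# One rule covers both the 3-token and the 4-token form
interfaceRule = Literal("interface") + Word(alphanums) + address + Optional(Word(alphas))

with_flag = interfaceRule.parseString("interface iflback 10.254.0.101 nat").asList()
without_flag = interfaceRule.parseString("interface ifla 10.1.0.254").asList()
print(with_flag)      # ['interface', 'iflback', '10.254.0.101', 'nat']
print(without_flag)   # ['interface', 'ifla', '10.1.0.254']
```

One caveat when using this in the full multi-line grammar: pyparsing skips newlines as whitespace by default, so a bare Word(alphas) can also consume the keyword that starts the next line (for example the "interface" that follows an interface line without a flag). Restricting the optional token to the expected values, such as Optional(Literal("nat")), is one possible way around that.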

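Finally, the actual problem with the code you posted is that you are using the | operator, which is shorthand for MatchFirst. MatchFirst tries the given alternatives in order and returns the first one that matches. Because interfaceRule comes before interfaceRule2, the three-token rule always wins, "nat" is left unparsed, and OneOrMore stops there. If you use Or instead, whose shorthand is the ^ operator, it tries all the alternatives and returns the longest match.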
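
The difference between the two operators can be seen in a small sketch. The `three` and `four` expressions and the sample line are simplified stand-ins for interfaceRule and interfaceRule2, not the original rules.

```python
from pyparsing import Literal, Word, alphanums, alphas, nums

# Simplified stand-ins for the 3-token and 4-token interface rules
three = Literal("interface") + Word(alphanums) + Word(nums)
four = Literal("interface") + Word(alphanums) + Word(nums) + Word(alphas)

text = "interface eth0 42 nat"

# MatchFirst (|): the first alternative that matches wins
first_match = (three | four).parseString(text).asList()
print(first_match)     # ['interface', 'eth0', '42']

# Or (^): every alternative is tried and the longest match wins
longest_match = (three ^ four).parseString(text).asList()
print(longest_match)   # ['interface', 'eth0', '42', 'nat']

# Keeping | but listing the longer alternative first also works
reordered_match = (four | three).parseString(text).asList()
print(reordered_match) # ['interface', 'eth0', '42', 'nat']
```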

Regarding "python - PyParsing: different string lengths", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38478877/
