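I have a regular expression that captures nested groups, and I want to output nested XML that mirrors those groups, the way fn:analyze-string does. Here is a simple example:
Regular expression
((Luckenbach|Houston|Little Rock),\s(TX|AK))
Input
Let's go to Luckenbach, TX with Waylon and Willie and the boys.
Desired output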
<s:analyze-string-result xmlns:s="http://www.w3.org/2009/xpath-functions/analyze-string">
  <s:non-match>Let's go to </s:non-match>
  <s:match>
    <s:group nr="1">
      <s:group nr="2">Luckenbach</s:group>, <s:group nr="3">TX</s:group>
    </s:group>
  </s:match>
  <s:non-match> with Waylon and Willie and the boys.</s:non-match>
</s:analyze-string-result>
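The problem is that there seems to be no way to recursively process the regex-group() values inside the xsl:matching-substring of xsl:analyze-string (or to access them as XML, as with XQuery's fn:analyze-string()).
The solution needs to be generic enough to work with different regular expressions, many of which have several levels of nested capturing groups.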
Best answer
When the context node contains the sample text, the following produces the desired output:
<snip>
  <xsl:analyze-string
      select="."
      regex="((Luckenbach|Houston|Little Rock),\s(TX|AK))">
    <xsl:matching-substring>
      <location>
        <city><xsl:value-of select="regex-group(2)"/></city>
        <xsl:text>, </xsl:text>
        <state><xsl:value-of select="regex-group(3)"/></state>
      </location>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</snip>
If you only want to generate the <snip> element when the regex matches, you can adjust the regex and the handling of the groups slightly:
<xsl:analyze-string
    select="."
    regex="((.*)((Luckenbach|Houston|Little Rock),\s(TX|AK))(.*))">
  <xsl:matching-substring>
    <snip>
      <xsl:value-of select="regex-group(2)"/>
      <location>
        <city><xsl:value-of select="regex-group(4)"/></city>
        <xsl:text>, </xsl:text>
        <state><xsl:value-of select="regex-group(5)"/></state>
      </location>
      <xsl:value-of select="regex-group(6)"/>
    </snip>
  </xsl:matching-substring>
  <xsl:non-matching-substring>
    <xsl:value-of select="."/>
  </xsl:non-matching-substring>
</xsl:analyze-string>
If you want to reproduce the behavior of the XQuery function analyze-string(), you can define your own custom function:
<xsl:function name="my:analyze-string" as="item()*"
    xmlns:my="http://stackoverflow.com/questions/13187307/output-nested-regex-groups-as-nested-xml-using-xslanalyze-string">
  <xsl:param name="val" />
  <analyze-string-result xmlns="http://www.w3.org/2005/xpath-functions">
    <xsl:analyze-string select="$val" regex="((.*)((Luckenbach|Houston|Little Rock),\s(TX|AK))(.*))">
      <xsl:matching-substring>
        <!-- Emit one match/group pair for each non-empty capture group (1 to 6) -->
        <xsl:for-each select="1 to 6">
          <xsl:if test="regex-group(.)">
            <match>
              <group nr="{.}">
                <xsl:value-of select="regex-group(.)"/>
              </group>
            </match>
          </xsl:if>
        </xsl:for-each>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <non-match>
          <xsl:value-of select="."/>
        </non-match>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </analyze-string-result>
</xsl:function>
When called like this:
<xsl:variable name="value"
    select='"Let&apos;s go to Luckenbach, TX with Waylon and Willie and the boys."'/>
<xsl:copy-of select="my:analyze-string($value)"
    xmlns:my="http://stackoverflow.com/questions/13187307/output-nested-regex-groups-as-nested-xml-using-xslanalyze-string"/>
it produces the following output:
<analyze-string-result xmlns="http://www.w3.org/2005/xpath-functions"
    xmlns:my="http://stackoverflow.com/questions/13187307/output-nested-regex-groups-as-nested-xml-using-xslanalyze-string">
  <match>
    <group nr="1">Let's go to Luckenbach, TX with Waylon and Willie and the boys.</group>
  </match>
  <match>
    <group nr="2">Let's go to </group>
  </match>
  <match>
    <group nr="3">Luckenbach, TX</group>
  </match>
  <match>
    <group nr="4">Luckenbach</group>
  </match>
  <match>
    <group nr="5">TX</group>
  </match>
  <match>
    <group nr="6"> with Waylon and Willie and the boys.</group>
  </match>
</analyze-string-result>
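Note that the custom function emits each group in its own flat match element, so the nesting of the capture groups is lost. If an XSLT 3.0 processor is an option (an assumption here, since the stylesheets above are written for XSLT 2.0), the XPath 3.0 function fn:analyze-string() can be called directly from XSLT and returns the groups nested. The following is only a minimal sketch: the parameter name regex and the choice to analyze the string value of the document node are illustrative, not part of the original answer.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: the pattern is plain data, so any regex
       with any depth of capturing groups can be passed in. -->
  <xsl:param name="regex" as="xs:string"
      select="'((Luckenbach|Houston|Little Rock),\s(TX|AK))'"/>

  <xsl:template match="/">
    <!-- fn:analyze-string returns an analyze-string-result element in the
         namespace http://www.w3.org/2005/xpath-functions, with match,
         non-match and nested group elements. -->
    <xsl:copy-of select="analyze-string(string(.), $regex)"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a document whose text is the sample sentence, this should give a result shaped like the desired output in the question, with group 2 and group 3 nested inside group 1. The elements will be in the http://www.w3.org/2005/xpath-functions namespace rather than the 2009 namespace shown in the question.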
Regarding "xml - Output nested regex groups as nested XML using xsl:analyze-string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13187307/