html - Why doesn't XPath contains(text(),'substring') work as expected?

Suppose I have some HTML like this:

<a>Ask Question<other/>more text</a>

I can match it with this XPath:

//a[text() = 'Ask Question']

Or...

//a[text() = 'more text']

Or I can use a dot to match the whole thing:

//a[. = 'Ask Questionmore text']

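This post describes the difference between . (dot) and text(), but in short, the former returns a single element and the latter returns a list of elements. That seems a bit odd to me, though: text() can be used to match either element in the list, yet that is not the case once the XPath function contains() is involved. If I do this: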

//a[contains(text(), 'Ask Question')]

...I get the following error:

Error: Required cardinality of first argument of contains() is one or zero

Why does text() work with an exact match (equals) but not with a partial match (contains)?

Best answer

For this markup,

<a>Ask Question<other/>more text</a>

notice that the a element has a text node child ("Ask Question"), an empty element child (other), and a second text node child ("more text").
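To make that node structure concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The helper name child_nodes is hypothetical and only for illustration. ElementTree has no text-node objects; it stores the text before the first child in .text and the text after each child in that child's .tail, so the helper reassembles the children in document order.

```python
import xml.etree.ElementTree as ET

def child_nodes(elem):
    """Hypothetical helper: list elem's children in document order
    as ('text', value) or ('element', tag) pairs."""
    nodes = []
    if elem.text:                       # text before the first child element
        nodes.append(("text", elem.text))
    for child in elem:
        nodes.append(("element", child.tag))
        if child.tail:                  # text that follows this child element
            nodes.append(("text", child.tail))
    return nodes

a = ET.fromstring("<a>Ask Question<other/>more text</a>")
print(child_nodes(a))
# [('text', 'Ask Question'), ('element', 'other'), ('text', 'more text')]
```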

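Here is how to reason through what happens when //a[contains(text(),'Ask Question')] is evaluated against that markup:

1. contains(x,y) expects x to be a string, but text() matches two text nodes.
2. In XPath 1.0, the rule for converting multiple nodes to a string is this:

A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned. [Emphasis added]

3. In XPath 2.0+, it is an error to pass a sequence of more than one text node to a function expecting a string, so contains(text(),'substr') raises an error whenever text() matches more than one text node.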

In your case...

XPath 1.0 would treat contains(text(),'Ask Question') as
```
contains('Ask Question','Ask Question')
```
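which is true. On the other hand, note that contains(text(),'more text') evaluates to false in XPath 1.0, because only the first text node is considered. Without knowing the three points above, this can be counterintuitive.
XPath 2.0 would treat it as an error.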
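The two behaviours can be imitated in plain Python. This is only an emulation of the rules described above, not a real XPath engine, and the helper names (text_nodes, contains_text_xpath1, contains_text_xpath2) are hypothetical.

```python
import xml.etree.ElementTree as ET

def text_nodes(elem):
    """Direct text-node children of elem in document order (what text() selects)."""
    nodes = [elem.text] if elem.text else []
    nodes += [child.tail for child in elem if child.tail]
    return nodes

def contains_text_xpath1(elem, needle):
    """Emulates XPath 1.0: only the first node of the node-set becomes the string."""
    nodes = text_nodes(elem)
    first = nodes[0] if nodes else ""   # an empty node-set converts to ""
    return needle in first

def contains_text_xpath2(elem, needle):
    """Emulates XPath 2.0+: more than one item where one string is expected is an error."""
    nodes = text_nodes(elem)
    if len(nodes) > 1:
        raise TypeError("Required cardinality of first argument of contains() is one or zero")
    return needle in (nodes[0] if nodes else "")

a = ET.fromstring("<a>Ask Question<other/>more text</a>")
print(contains_text_xpath1(a, "Ask Question"))  # True
print(contains_text_xpath1(a, "more text"))     # False: the second text node is ignored
try:
    contains_text_xpath2(a, "Ask Question")
except TypeError as err:
    print("error:", err)
```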

Better alternatives

If the goal is to find all a elements whose string value contains the substring "Ask Question":
```
//a[contains(.,'Ask Question')]
```
This is the most common requirement.
If the goal is to find all a elements with an immediate text node child equal to "Ask Question":
```
//a[text()='Ask Question']
```
This is useful when you want to exclude strings that come from descendant elements of a, for example if you want this a,
```
<a>Ask Question<other/>more text</a>
```
but not this a:
```
<a>more text before <not>Ask Question</not> more text after</a>
```
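The difference between the two alternatives can be checked against both sample elements with a small Python emulation. Again this is a sketch rather than a real XPath evaluation, and the helper names are hypothetical. It relies on two XPath 1.0 rules: the string-value of an element is the concatenation of all its descendant text nodes, and comparing a node-set with a string using = is true if any node in the set matches, which is also why text() = 'more text' works in the question.

```python
import xml.etree.ElementTree as ET

def string_value(elem):
    """String-value of an element: all descendant text, concatenated in document order."""
    return "".join(elem.itertext())

def text_nodes(elem):
    """Direct text-node children of elem in document order (what text() selects)."""
    nodes = [elem.text] if elem.text else []
    nodes += [child.tail for child in elem if child.tail]
    return nodes

def contains_dot(elem, needle):
    """Emulates a[contains(., needle)]."""
    return needle in string_value(elem)

def any_text_equals(elem, value):
    """Emulates a[text() = value]: true if ANY direct text node equals value."""
    return any(t == value for t in text_nodes(elem))

first = ET.fromstring("<a>Ask Question<other/>more text</a>")
second = ET.fromstring("<a>more text before <not>Ask Question</not> more text after</a>")

for a in (first, second):
    print(contains_dot(a, "Ask Question"), any_text_equals(a, "Ask Question"))
# True True
# True False
```

Both elements pass the contains(., ...) test, but only the first has "Ask Question" as a direct text node child, so only it passes the text() = ... test.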

A similar question to "Why doesn't XPath contains(text(),'substring') work as expected?" can be found on Stack Overflow: https://stackoverflow.com/questions/69909751/
