php - XPath/DOMDocument: check for a child by class name

Tags: php xpath domdocument

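I am trying to find a child node by a specific class name (a div with class 'foo') while looping over DOMDocument nodes. If such a child exists, the foo value for that entry should be set to 1: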

My HTML ($document) looks like this:

...
<div class="posts">Div Posts 1</div>
<div class="posts">Div Posts 2<div class="foo"></div></div>
<div class="posts">Div Posts 3</div>
<div class="posts">Div Posts 4<div class="foo"></div></div>
<div class="posts">Div Posts 5</div>
...

DOMDocument/XPath ($document):

$html = array();
$document = new \DOMDocument();
$document->loadHTMLFile($url); // loads html from above
$xpath = new \DOMXPath($document);

$i=0;
foreach ($xpath->query(Parser::cssToXpath('.posts')) as $node) {
    $html['posts'][$i]['content'] = $node->nodeValue;  
    // check if child node with class name 'foo' exists => doesn't work :(
    $children = $node->getElementsByTagName('foo');
    if($children)
        $html['posts'][$i]['foo'] = '1';
    else
        $html['posts'][$i]['foo'] = '0';
    $i++;
}

Output:

[posts] => Array
    (
        [0] => Array
            (
                [content] => Div class Posts 1
                [foo] => 1
            )

        [1] => Array
            (
                [content] => Div class Posts 2
                [foo] => 1
            )

        [2] => Array
            (
                [content] => Div class Posts 3
                [foo] => 1
            )

        [3] => Array
            (
                [content] => Div class Posts 4
                [foo] => 1
            )

        [4] => Array
            (
                [content] => Div class Posts 5
                [foo] => 1
            )

    )

getElementsByTagName() is probably not the right method here, but I have tried several approaches and have not found one that works. :(

Best answer

Following up on your comment on the previous answer:

Hm yes but still doesn't work unfortunately. Eventually I need to know which .posts div has the child element 'foo' because I need to analyze the content of that parent and also need to replace it later

your predicate could be built up like this:

a) select div elements
b) that have the attribute class="posts"
c) and have a child element div
d) that has the attribute class="foo"

As XPath expressions:

a) //div
b) //div[@class="posts"]
c) //div[@class="posts" and div]
d) //div[@class="posts" and div[@class="foo"]]
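
To see how each step narrows the selection, the following sketch runs the four expressions against the sample markup from the question and counts the matches. The $html string and the $steps and $counts arrays are illustrative names, not part of the original answer.

```php
<?php
// Sample markup from the question, wrapped in a minimal document.
$html = <<<HTML
<html><body>
    <div class="posts">Div Posts 1</div>
    <div class="posts">Div Posts 2<div class="foo"></div></div>
    <div class="posts">Div Posts 3</div>
    <div class="posts">Div Posts 4<div class="foo"></div></div>
    <div class="posts">Div Posts 5</div>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// The four expressions a) to d), from least to most specific.
$steps = [
    'a' => '//div',
    'b' => '//div[@class="posts"]',
    'c' => '//div[@class="posts" and div]',
    'd' => '//div[@class="posts" and div[@class="foo"]]',
];

// Count how many nodes each expression selects.
$counts = [];
foreach ($steps as $label => $expression) {
    $counts[$label] = $xpath->query($expression)->length;
    echo $label, ') ', $expression, ' => ', $counts[$label], " match(es)\n";
}
```

With this markup, a) also picks up the nested foo divs, b) keeps only the five .posts divs, and c) and d) both leave the two posts that contain a foo child.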

For example:

<?php
$doc = new DOMDocument;
$doc->loadHTML( getData() );
$xpath = new DOMXPath($doc);   

/*
use something like
    //div[contains(concat(' ',normalize-space(@class),' '),' posts ')]
if the html element may have class="posts lalala"
*/
foreach( $xpath->query('//div[@class="posts" and div[@class="foo"]]') as $post) {
    while ( $post->firstChild ) {
        $post->removeChild( $post->firstChild );
    }   
    $post->appendChild( $doc->createElement('span', 'The quick fox....') );
}
echo $doc->saveHTML();


function getData() {
    return <<< eoh
<html><head><title>...</title></head><body>
    <div class="posts">Div Posts 1</div>
    <div class="posts">Div Posts 2<div class="foo"></div></div>
    <div class="posts">Div Posts 3</div>
    <div class="posts">Div Posts 4<div class="foo"></div></div>
    <div class="posts">Div Posts 5</div>
</body></html>
eoh;
}

prints

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><head><title>...</title></head><body>
    <div class="posts">Div Posts 1</div>
    <div class="posts"><span>The quick fox....</span></div>
    <div class="posts">Div Posts 3</div>
    <div class="posts"><span>The quick fox....</span></div>
    <div class="posts">Div Posts 5</div>
</body></html>
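
If the goal is the per-post flag from the question rather than a filtered list, one option is to keep looping over every .posts div and run a second, relative query with the current node as context. In the question's loop, getElementsByTagName('foo') looks for <foo> elements rather than a class, and the DOMNodeList it returns is an object, so if ($children) is always true; checking the match count avoids both problems. The following is a sketch under those assumptions; the variable names and the inline sample markup are illustrative.

```php
<?php
// Sample markup from the question, wrapped in a minimal document.
$html = <<<HTML
<html><body>
    <div class="posts">Div Posts 1</div>
    <div class="posts">Div Posts 2<div class="foo"></div></div>
    <div class="posts">Div Posts 3</div>
    <div class="posts">Div Posts 4<div class="foo"></div></div>
    <div class="posts">Div Posts 5</div>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$posts = [];
foreach ($xpath->query('//div[@class="posts"]') as $node) {
    // Relative expression (no leading //): it is evaluated against $node,
    // so it only matches direct child divs with class="foo".
    $hasFoo = $xpath->query('div[@class="foo"]', $node)->length > 0;

    $posts[] = [
        'content' => $node->nodeValue,
        'foo'     => $hasFoo ? '1' : '0',
    ];
}

print_r($posts);
```

Use .//div[@class="foo"] instead if the foo div may be nested deeper than a direct child. If elements can carry several classes, the contains(concat(...)) form from the comment in the answer's code can replace the exact @class comparison.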

Regarding php - XPath/DOMDocument: check for a child by class name, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9362165/
