php - XPath loop problem when parsing a simple HTML table into a PHP array

Tags: php html xpath

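My previous question, "PHP Convert html table to JSON", was quickly closed as a duplicate, and I am still struggling to get what I need. I think this is mainly a logic problem in my loops, and I need someone else to take a look.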

Take this table as an example:

<table id="Details" class="DATA_TABLE DATA_TABLE_WO_TOTAL">
  <tr>
    <th>Application</th>
    <th>Version number</th>
    <th>Virtual Administration Server</th>
    <th>Group</th>
    <th>Device</th>
    <th>Installed</th>
    <th>Last visible time</th>
    <th>Last connection to Administration Server</th>
    <th>IP address</th>
  </tr>
  <tr>
    <td class="sD">some text</td>
    <td class="sD">10.2.5.3201</td>
    <td class="sD"></td>
    <td class="sD">Thin PC</td>
    <td class="sD">PC#</td>
    <td class="sD">date</td>
    <td class="sD">date</td>
    <td class="sD">date</td>
    <td class="sD">ip address</td>
  </tr>
  <tr>
     <tr>
    <td class="sD">some more text</td>
    <td class="sD">10.2.5.3201</td>
    <td class="sD"></td>
    <td class="sD">Thin PC</td>
    <td class="sD">PC#</td>
    <td class="sD">date</td>
    <td class="sD">date</td>
    <td class="sD">date</td>
    <td class="sD">ip address</td>
  </tr>
</table>

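I need to build an array (which I can later convert to JSON) in which the text of the th tags provides the keys, and the td tags inside each tr provide the values for those keys. I have the following PHP code: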

<?php
$dom = new DOMDocument;
$dom->loadHTML($cleantable2); //this is the table above
$xpath = new DOMXPath($dom);

foreach($xpath->query('//table/tr') as $tr){
    $tmp = [];
    foreach($xpath->query('//table/tr/th', $tr) as $th){
        $key = $th->textContent;
        foreach($xpath->query('td', $tr) as $td){
            $tmp[$key] = trim($td->textContent);
        }
    }
    $result[]=$tmp;
}
var_dump($result);

?>

It does get the correct keys, but not the data. Sample output:

  [89]=>
  array(9) {
    ["Application"]=>
    string(13) "192.168.6.104"
    ["Version number"]=>
    string(13) "192.168.6.104"
    ["Virtual Administration Server"]=>
    string(13) "192.168.6.104"
    ["Group"]=>
    string(13) "192.168.6.104"
    ["Device"]=>
    string(13) "192.168.6.104"
    ["Installed"]=>
    string(13) "192.168.6.104"
    ["Last visible time"]=>
    string(13) "192.168.6.104"
    ["Last connection to Administration Server"]=>
    string(13) "192.168.6.104"
    ["IP address"]=>
    string(13) "192.168.6.104"
  }

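As you can see, every key gets only the IP address and none of the other data. What am I doing wrong? Could someone help rather than just marking this as a duplicate? I have been trying to solve this for more than a day, and I am fairly sure the problem is just that I am not looping correctly, but I cannot see it...

Thanks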

Best answer

$strhtml='
<table id="Details" class="DATA_TABLE DATA_TABLE_WO_TOTAL">
  <tr>
    <th>Application</th>
    <th>Version number</th>
    <th>Virtual Administration Server</th>
    <th>Group</th>
    <th>Device</th>
    <th>Installed</th>
    <th>Last visible time</th>
    <th>Last connection to Administration Server</th>
    <th>IP address</th>
  </tr>
  <tr>
    <td class="sD">some text</td>
    <td class="sD">10.2.5.202</td>
    <td class="sD">Plato</td>
    <td class="sD">Thin PC</td>
    <td class="sD">PC#</td>
    <td class="sD">date a</td>
    <td class="sD">date b</td>
    <td class="sD">date c</td>
    <td class="sD">10.25.100.1</td>
  </tr>
  <tr>
     <tr>
    <td class="sD">some more text</td>
    <td class="sD">10.2.5.321</td>
    <td class="sD">Socrates</td>
    <td class="sD">Thick PC</td>
    <td class="sD">PC#</td>
    <td class="sD">date x</td>
    <td class="sD">date y</td>
    <td class="sD">date z</td>
    <td class="sD">10.25.100.2</td>
  </tr>
</table>';

Given the HTML snippet above, perhaps the following does what you need? The comments should help explain what I did.

libxml_use_internal_errors( true );
$dom=new DOMDocument;
$dom->loadHTML( $strhtml );
libxml_clear_errors();

$xp=new DOMXPath( $dom );
/* find the `th` elements */
$col = $xp->query( '//tr/th' );

/* temp arrays */
$tmp=$out=$keys=array();


if( $col->length > 0 ){
    /* get all headers as keys */
    foreach( $col as $node )$keys[]=$node->nodeValue;

    /* get all table cell data - store in single array */
    $col=$xp->query( '//tr/td[ @class="sD" ]' );
    foreach( $col as $node )$tmp[]=$node->nodeValue;

    /* split data into chunks according to number of columns */
    $rows=array_chunk( $tmp, count( $keys ) );

    /* combine keys and chunks */
    foreach( $rows as $row ){
        $tmp=array();
        foreach( $row as $i => $value ) $tmp[ $keys[ $i ] ]=$value;
        $out[]=$tmp;
    }

    echo json_encode( $out );
}

Output:

[
    {
        "Application":"some text",
        "Version number":"10.2.5.202",
        "Virtual Administration Server":"Plato",
        "Group":"Thin PC",
        "Device":"PC#",
        "Installed":"date a",
        "Last visible time":"date b",
        "Last connection to Administration Server":"date c",
        "IP address":"10.25.100.1"
    },
    {
        "Application":"some more text",
        "Version number":"10.2.5.321",
        "Virtual Administration Server":"Socrates",
        "Group":"Thick PC",
        "Device":"PC#",
        "Installed":"date x",
        "Last visible time":"date y",
        "Last connection to Administration Server":"date z",
        "IP address":"10.25.100.2"
    }
]
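
As for why the loop in the question repeats the IP address: the innermost loop writes every td of the row into the same `$tmp[$key]`, so only the last cell survives for each header. In addition, `//table/tr/th` starts with `//`, so it is evaluated from the document root and the `$tr` context node has no effect on it. A minimal per-row sketch that pairs header i with cell i could look like the following. The function name `tableToRows` and the shortened sample table are only illustrative assumptions, and rows whose cell count does not match the header count are simply skipped.

```php
<?php
// Hypothetical helper (not part of the code above): turns each data row
// into an associative array keyed by the header texts.
function tableToRows(string $html): array
{
    libxml_use_internal_errors(true);   // tolerate the stray <tr> in the sample
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xp = new DOMXPath($dom);

    // Read the header texts once, outside the row loop.
    $keys = [];
    foreach ($xp->query('//table//tr/th') as $th) {
        $keys[] = trim($th->textContent);
    }

    $rows = [];
    // tr[td] selects only rows that contain data cells, which skips
    // the header row and any empty row.
    foreach ($xp->query('//table//tr[td]') as $tr) {
        // 'td' has no leading slash, so it is evaluated relative to $tr.
        $values = [];
        foreach ($xp->query('td', $tr) as $td) {
            $values[] = trim($td->textContent);
        }
        // Pair header i with cell i; skip rows whose cell count differs.
        if (count($values) === count($keys)) {
            $rows[] = array_combine($keys, $values);
        }
    }
    return $rows;
}

// Shortened two-column version of the table, including the stray <tr>.
$html = '<table id="Details">'
      . '<tr><th>Application</th><th>IP address</th></tr>'
      . '<tr><td class="sD">some text</td><td class="sD">10.25.100.1</td></tr>'
      . '<tr><tr><td class="sD">some more text</td><td class="sD">10.25.100.2</td></tr>'
      . '</table>';

echo json_encode(tableToRows($html));
```

Compared with the `array_chunk` approach, this keeps each row's cells together, so a row with a missing cell does not shift the values of the rows after it.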

Regarding "php - XPath loop problem when parsing a simple HTML table into a PHP array", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54900521/
