php - DOM parser keyword highlighting not working

Tags: php html dom highlighting

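This question is related to one I asked before, but since that topic is now closed and I need to ask some follow-up questions, I am starting a new question. I hope that is okay.

In my previous question I oversimplified the problem and ended up with a simple solution that does not fully work. I realized this over the last few days while integrating it into my code.

The problem with the solution from the previous post is that HTML tags get broken by the replace function. I have read in many posts on this site that I need to use a DOM parser. I am not familiar with it. I tried the code suggested by user "ircmaxell" in this post, but it does not work for me.

Here is an example of what I did: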

echo '<style type="text/css">
       .ht{
         background-color: yellow;
       }
     </style>'; 


/* taken from user ircmaxell at https://stackoverflow.com/questions/4081372/highlight-keywords-in-a-paragraph

I just modified line $highlight->setAttribute('class', 'highlight') to $highlight->setAttribute('class', 'ht') and commented the first 2 lines   */

function highlight_paragraph($string, $keyword) {
  //$string = '<p>foo<b>bar</b></p>';
  //$keyword = 'foo';
  $dom = new DomDocument();
  $dom->loadHtml($string);
  $xpath = new DomXpath($dom);
  $elements = $xpath->query('//*[contains(.,"'.$keyword.'")]');
  foreach ($elements as $element) {
   foreach ($element->childNodes as $child) {
     if (!$child instanceof DomText) continue;
     $fragment = $dom->createDocumentFragment();
     $text = $child->textContent;
     $stubs = array();
     while (($pos = stripos($text, $keyword)) !== false) {
       $fragment->appendChild(new DomText(substr($text, 0, $pos)));
       $word = substr($text, $pos, strlen($keyword));
       $highlight = $dom->createElement('span');
       $highlight->appendChild(new DomText($word));
       $highlight->setAttribute('class', 'ht');
       $fragment->appendChild($highlight);
       $text = substr($text, $pos + strlen($keyword));
     }
     if (!empty($text)) $fragment->appendChild(new DomText($text));
     $element->replaceChild($fragment, $child);
   }
 }
 $string = $dom->saveXml($dom->getElementsByTagName('body')->item(0)->firstChild);
 return $string;
}


$string = '<p>This book has been written against a background of both reckless optimism and reckless despair.</p>
<p>It holds that Progress and Doom are two sides of the same medal; that both are articles of superstition, not of faith. It was written out of the conviction that it should be possible to discover the hidden mechanics by which all traditional elements of our political and spiritual world were dissolved into a conglomeration where everything seems to have lost specific value, and has become unrecognizable for human comprehension, unusable for human purpose.</p>
<p> Hannah Arendt, The Origins of Totalitarianism (New York: Harcourt Brace Jovanovich, Inc., 1973 ed.), p.vii, Preface to the First Edition.</p>';

$keywords = array('This', 'book', 'has', 'been', 'written', 'background', 'reckless', 'optimism', 'despair.', 'holds', 'Progress', 'Doom ', 'two', 'sides', 'medal;', 'articles', 'superstition,', 'faith.', 'lost', 'Arendt,', 'Totalitarianism');

foreach ($keywords as $kw) {
  $string = highlight_paragraph($string, $kw);
}

echo $string;

echo $string returns only:

This book has been written against a background of both reckless optimism and reckless despair.

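and only the first two words, "This" and "book", are highlighted.

It should output the entire initial string with all the keywords highlighted.

I have searched Stack Overflow and Google extensively, but I have not found easy-to-use code that does what I need, even though many people have asked the same thing before.

I really need help here. Thanks in advance!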

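Best answer

You are lucky that I was very bored when I saw this question. ;)

The code you were given as an answer appears to be untested; I don't see how it could ever have worked properly. In any case, I fixed all the issues and here is a working version, tested with PHP 5.3 on my local Apache installation: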

function highlight_paragraph($string, $keyword) {
  $dom = new DOMDocument();
  $dom->loadHTML($string);

  // Select the text nodes of every element whose content contains the keyword.
  // Note: XPath contains() is case-sensitive, while stripos() below is not.
  $xpath = new DOMXPath($dom);
  $textNodes = $xpath->query('//*[contains(.,"'.$keyword.'")]/text()');

  foreach ($textNodes as $textNode) {
    $fragment = $dom->createDocumentFragment();
    $text = $textNode->nodeValue;

    // Split the text around each occurrence and wrap the match in a span
    while (($pos = stripos($text, $keyword)) !== false) {
      $fragment->appendChild(new DOMText(substr($text, 0, $pos)));
      $word = substr($text, $pos, strlen($keyword));

      $highlight = $dom->createElement('span');
      $highlight->appendChild(new DOMText($word));
      $highlight->setAttribute('class', 'ht');
      $fragment->appendChild($highlight);

      $text = substr($text, $pos + strlen($keyword));
    }

    // Append the text left after the last match; strlen() keeps a
    // remaining "0", which empty() would have dropped
    if (strlen($text) > 0)
      $fragment->appendChild(new DOMText($text));

    $textNode->parentNode->replaceChild($fragment, $textNode);
  }

  // Returns the whole document, including the wrappers added by loadHTML()
  return $dom->saveHTML();
}
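
As for the symptom in the question: the original code serialized only the `firstChild` of `<body>`, which is the first `<p>`, so every call threw away the remaining paragraphs. The function above avoids that by returning `saveHTML()`, but that output includes the DOCTYPE and the `<html>`/`<body>` wrappers that `loadHTML()` adds, and the document is parsed again for every keyword. Below is a minimal sketch of one possible variant; it is not part of the tested code above. The name `highlight_keywords` is hypothetical, the sketch assumes plain ASCII input with a non-empty body, and it relies on `saveHTML($node)`, which needs PHP 5.3.6 or later.

```php
<?php
// Hypothetical helper (not from the original answer): highlights several
// keywords after a single parse and returns only the markup inside <body>.
function highlight_keywords($html, array $keywords) {
    $dom = new DOMDocument();
    $dom->loadHTML($html); // wraps the fragment in <html><body>...</body></html>
    $xpath = new DOMXPath($dom);

    foreach ($keywords as $keyword) {
        if ($keyword === '') {
            continue; // an empty needle would never advance the loop below
        }

        // Query again for each keyword so that text nodes created by earlier
        // replacements are seen; skip text already inside a highlight span.
        $textNodes = $xpath->query('//body//text()[not(ancestor::span[@class="ht"])]');

        foreach ($textNodes as $textNode) {
            $text = $textNode->nodeValue;
            if (stripos($text, $keyword) === false) {
                continue;
            }

            $fragment = $dom->createDocumentFragment();
            while (($pos = stripos($text, $keyword)) !== false) {
                if ($pos > 0) {
                    $fragment->appendChild($dom->createTextNode(substr($text, 0, $pos)));
                }
                $span = $dom->createElement('span');
                $span->setAttribute('class', 'ht');
                $span->appendChild($dom->createTextNode(substr($text, $pos, strlen($keyword))));
                $fragment->appendChild($span);
                $text = substr($text, $pos + strlen($keyword));
            }
            if (strlen($text) > 0) {
                $fragment->appendChild($dom->createTextNode($text));
            }
            $textNode->parentNode->replaceChild($fragment, $textNode);
        }
    }

    // Serialize every child of <body>, not just the first one.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Usage with a shortened version of the sample text
$html = '<p>Progress and Doom</p><p>two sides of the same medal</p>';
echo highlight_keywords($html, array('Progress', 'Doom', 'medal'));
```

Skipping text that already sits inside a `span.ht` is a design choice of this sketch: it keeps a later keyword from being nested inside an earlier highlight. Text inside `<script>` or `<style>` elements is not treated specially here.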

This question, php - DOM parser keyword highlighting not working, corresponds to a similar question found on Stack Overflow: https://stackoverflow.com/questions/9335689/