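I have the function below, and I am struggling to output the DOMDocument without it wrapping the content in the XML declaration, html, body and p tags. The suggested fix: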
$postarray['post_content'] = $d->saveXML($d->getElementsByTagName('p')->item(0));
only works when the content contains no block-level elements. When it does, as in the example below with an h1 element, the output of saveXML is truncated to...
<p>If you like</p>
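Someone pointed to this post as a possible workaround, but I don't understand how to apply it to this solution (see the commented-out attempts below).
Any suggestions?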
function rseo_decorate_keyword($postarray) {
    global $post;
    $keyword = "Jasmine Tea";
    $content = "If you like <h1>jasmine tea</h1> you will really like it with Jasmine Tea flavors. This is the last ocurrence of the phrase jasmine tea within the content. If there are other instances of the keyword jasmine tea within the text what happens to jasmine tea.";
    $d = new DOMDocument();
    @$d->loadHTML($content);
    $x = new DOMXPath($d);
    // Bail out if the keyword is already inside a <b> or <strong> element
    $count = $x->evaluate("count(//text()[contains(translate(., 'ABCDEFGHJIKLMNOPQRSTUVWXYZ', 'abcdefghjiklmnopqrstuvwxyz'), '$keyword') and (ancestor::b or ancestor::strong)])");
    if ($count > 0) return $postarray;
    // Find text nodes containing the keyword outside headings and bold text
    $nodes = $x->query("//text()[contains(translate(., 'ABCDEFGHJIKLMNOPQRSTUVWXYZ', 'abcdefghjiklmnopqrstuvwxyz'), '$keyword') and not(ancestor::h1) and not(ancestor::h2) and not(ancestor::h3) and not(ancestor::h4) and not(ancestor::h5) and not(ancestor::h6) and not(ancestor::b) and not(ancestor::strong)]");
    if ($nodes && $nodes->length) {
        $node = $nodes->item(0);
        // Split just before the keyword
        $keynode = $node->splitText(strpos($node->textContent, $keyword));
        // Split after the keyword
        $node->nextSibling->splitText(strlen($keyword));
        // Replace keyword with <strong>keyword</strong>
        $replacement = $d->createElement('strong', $keynode->textContent);
        $keynode->parentNode->replaceChild($replacement, $keynode);
    }
    $postarray['post_content'] = $d->saveXML($d->getElementsByTagName('p')->item(0));
    // $postarray['post_content'] = $d->saveXML($d->getElementsByTagName('body')->item(1));
    // $postarray['post_content'] = $d->saveXML($d->getElementsByTagName('body')->childNodes);
    return $postarray;
}
Best answer
All of these answers are now wrong, because as of PHP 5.4 and Libxml 2.6, loadHTML has an $options parameter that tells Libxml how to parse the content.
Therefore, if we load the HTML with these options
$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
then when calling saveHTML() there will be no doctype, no <html>, and no <body>.
LIBXML_HTML_NOIMPLIED turns off the automatic adding of implied html/body elements.
LIBXML_HTML_NODEFDTD prevents a default doctype being added when one is not found.
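As a rough illustration of how the two flags might be applied to the content from the question, here is a minimal sketch. The helper name fragment_roundtrip and the temporary <div> wrapper are assumptions of this example, not part of the original answer: the wrapper gives the fragment a single root element, since content with several top-level nodes may otherwise be restructured by the parser.

```php
<?php
// Hypothetical helper (not from the original answer): parse an HTML fragment
// and serialize it back without doctype, <html> or <body> wrappers.
function fragment_roundtrip(string $content): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);

    // Assumption: wrap the fragment in one <div> so it has a single root.
    $doc->loadHTML(
        '<div>' . $content . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize only the children of the wrapper, dropping the <div> itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo fragment_roundtrip('If you like <h1>jasmine tea</h1> you will really like it.'), PHP_EOL;
```

In the function from the question, the same pattern would replace the saveXML call on the first p element, so block-level elements such as the h1 are no longer cut off.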
The full documentation of the Libxml parameters is in the PHP manual.
(Note that the loadHTML documentation says Libxml 2.6 is required, but LIBXML_HTML_NODEFDTD is only available in Libxml 2.7.8 and LIBXML_HTML_NOIMPLIED in Libxml 2.7.7.)
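Given those version requirements, one way to guard the call is to build the option mask from the Libxml version PHP was compiled against. This is only a sketch: the variable names are illustrative, and it relies on the LIBXML_DOTTED_VERSION constant from PHP's libxml extension.

```php
<?php
// Build the loadHTML option mask only from flags the linked Libxml supports.
$options = 0;

// LIBXML_HTML_NOIMPLIED needs Libxml 2.7.7 or later.
if (version_compare(LIBXML_DOTTED_VERSION, '2.7.7', '>=')) {
    $options |= LIBXML_HTML_NOIMPLIED;
}

// LIBXML_HTML_NODEFDTD needs Libxml 2.7.8 or later.
if (version_compare(LIBXML_DOTTED_VERSION, '2.7.8', '>=')) {
    $options |= LIBXML_HTML_NODEFDTD;
}

$doc = new DOMDocument();
$doc->loadHTML('<p>If you like jasmine tea</p>', $options);
$html = trim($doc->saveHTML());
echo $html, PHP_EOL;
```

On an older Libxml the mask stays 0, so the wrappers would still be added and would have to be stripped some other way.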
Regarding php - How to saveHTML of DOMDocument without HTML wrapper?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4879946/