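I am using the following code to strip JavaScript out of an HTML DOM string and put it into an array for later use.
What would be a good alternative?
My problem: I am running into Unicode issues with my files. Parsing a file that contains Unicode characters produces the following error: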
Warning: DOMDocument::saveHTML() [domdocument.savehtml]: output conversion failed due to conv error, bytes 0x97 0xC3 0xA0 0xC2 in
My code:
function loadJSCodeToLast( $strDOM ){
    //Find all the <script></script> code and add to $objApp
    global $objApp;
    $objDOM = new DOMDocument();
    //$x = new DOMImplementation();
    //$doc = $x->createDocument(NULL,"rootElementName");
    //$strDOM = '<kool>'.$strDOM.'</kool>';
    $objDOM->preserveWhiteSpace = false;
    //$objDOM->formatOutput = true;
    @$objDOM->loadHtml( $strDOM );
    $xpath = new DOMXPath($objDOM);
    $objScripts = $xpath->query('//script');
    $totCount = $objScripts->length;
    if ($totCount > 0) {
        //document contains script tags
        foreach($objScripts as $entries){
            $strSrc = $entries->getAttribute('src');
            if( $strSrc !== ''){
                $objApp->AddJSFile( $strSrc );
            }else{
                $objApp->AddJSScript( $entries->nodeValue );
            }
            $entries->parentNode->removeChild( $entries );
        }
    }
    //return $objDOM->saveHTML();
    //echo $GLOBALS['strTemplateDirAbs'];
    return preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $objDOM->saveHTML()));
}
Best answer
Try converting your string with utf8_encode() before loading it.
$txt = utf8_encode($txt);
var_dump(loadJSCodeToLast($txt));
The XML parser converts the text of an XML document into UTF-8, even if you have set the character encoding of the XML, for example as a second parameter of the DOMDocument constructor. After parsing the XML with the load() command all its texts have been converted to UTF-8.
If you append text nodes with special characters (e.g. umlauts) to your XML document, you should therefore use utf8_encode() on your text to convert it into UTF-8 before you append it to the document. Otherwise you will get an error message like "output conversion failed due to conv error" when save() is called.
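One caveat worth keeping in mind: utf8_encode() always treats its input as ISO-8859-1, so calling it on a string that is already UTF-8 double-encodes it, and the function is deprecated as of PHP 8.2. The following is a minimal sketch of a guarded conversion, not the answer's original method. The helper name toUtf8() and its $assumedSource parameter are hypothetical, and it assumes the mbstring and dom extensions are available and that any non-UTF-8 input really is in the assumed source encoding.

```php
<?php
// Hypothetical helper (not from the original post): returns $text as UTF-8.
// Assumption: input that is not valid UTF-8 is encoded as $assumedSource.
function toUtf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    // Already valid UTF-8: converting again would double-encode it.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

// Appending a text node: convert first, as the quoted note recommends.
$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('root'));
// "M\xFCller" is "Müller" encoded as ISO-8859-1.
$root->appendChild($doc->createTextNode(toUtf8("M\xFCller")));
echo $doc->saveXML();
```

In the question's function, the same helper could be applied as `$strDOM = toUtf8($strDOM);` just before the loadHtml() call. Whether that removes the warning depends on the actual encoding of the input files, which the question does not state.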
Regarding "php - What is an alternative way to load, strip JavaScript, and put it into an array for later use", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/11876904/