php - Scraping a large HTML file with Simple HTML DOM

Tags: php html parsing dom file-get-contents

I need to scrape a large HTML file (for example: http://www.indianrail.gov.in/mail_express_trn_list.html) using Simple HTML DOM. I started with a simple script:

<?php
require "simple_html_dom.php";
echo file_get_html('http://www.indianrail.gov.in/mail_express_trn_list.html')->plaintext;
?>

Nothing is displayed, only a blank page, and the Apache error.log file contains these messages:

 PHP Notice:  Trying to get property of non-object in /var/www/index.php on line 3
 PHP Notice:  Trying to get property of non-object in /var/www/index.php on line 3

Meanwhile, all other pages (for example: http://www.indianrail.gov.in/special_trn_list.html) work fine with the same script.

Best answer

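The problem appears to be the MAX_FILE_SIZE constant defined in simple_html_dom. When the downloaded content is larger than this limit, file_get_html() appears to return false rather than a DOM object, which would explain the "Trying to get property of non-object" notice raised by ->plaintext.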

You can adjust it by editing the define('MAX_FILE_SIZE', 600000); line in simple_html_dom.php and raising the value above the size of the page you need to parse.
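
To make this kind of failure visible instead of ending up with a blank page, it may help to check the result of file_get_html() before reading ->plaintext. The following is a minimal sketch, not the library's own code. It assumes that file_get_html() returns a non-object (false) when the download is empty or exceeds MAX_FILE_SIZE, that simple_html_dom.php sits next to the script, and that 600000 is the limit in your copy of the library. describe_size() and fetch_plaintext() are hypothetical helper names introduced only for illustration.

```php
<?php
// Hypothetical helper, not part of simple_html_dom: classifies a downloaded
// body against a byte limit, mirroring the size check assumed above.
function describe_size(string $contents, int $limit = 600000): string
{
    if ($contents === '') {
        return 'empty';
    }
    return strlen($contents) > $limit ? 'too large' : 'ok';
}

// Hypothetical wrapper: returns the page text, or null when no DOM was built.
// Assumption: simple_html_dom.php is in the same directory as this script.
function fetch_plaintext(string $url): ?string
{
    require_once __DIR__ . '/simple_html_dom.php';

    $dom = file_get_html($url);
    if (!is_object($dom)) {
        // Assumption: a non-object result means the download was empty
        // or larger than MAX_FILE_SIZE.
        error_log("file_get_html() returned no DOM for $url");
        return null;
    }
    return $dom->plaintext;
}
```

Calling fetch_plaintext() with the URL from the question would then log a readable message instead of a notice, and describe_size(file_get_contents($url)) can confirm whether the page really exceeds the limit before you edit the library file.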

Regarding "php - Scraping a large HTML file with Simple HTML DOM", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17939101/
