javascript - How do I get the string from this to that out of a web page's source?

Tags: javascript php

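How do I get the string that runs from "this" to "that" in a web page's source? I have searched PHP.net, but I can't work out whether PHP has a function, or a set of functions, that can grab the string between one marker and another.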

For example, this is what I have so far (I want to get everything from "wgCategories" to "wgMonthNamesShort" out of the web page stored in $html):

<?php
error_reporting(E_ALL);
$html = file_get_contents('http://en.wikipedia.org/wiki/Los_Angeles');
$string = ''; // placeholder: should end up holding everything from "wgCategories" to "wgMonthNamesShort"
?>

First, I put the page's source code into the $html variable. Now I need a function, or a set of functions, that gets everything from "wgCategories" to "wgMonthNamesShort" and stores it in $string.

Desired result:

$string = "wgCategories":["All articles with dead external links","Articles with dead external links from March 2013","Articles with dead external links from March 2014","Pages with broken reference names","Articles with dead external links from January 2014","Articles with dead external links from September 2011","Articles with dead external links from October 2011","CS1 errors: dates","Use mdy dates from May 2014","Wikipedia indefinitely semi-protected pages","Wikipedia indefinitely move-protected pages","Coordinates on Wikidata","Articles including recorded pronunciations","Articles containing Spanish-language text","All articles with unsourced statements","Articles with unsourced statements from December 2013","Spoken articles","Articles with hAudio microformats","Los Angeles, California","Cities in Los Angeles County, California","Communities on U.S. Route 66","County seats in California","Incorporated cities and towns in California","Populated coastal places in California","Populated places established in 1781","Port cities and towns of the United States Pacific coast","Butterfield Overland Mail in California","Stockton - Los Angeles Road"],"wgBreakFrames":false,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonthNamesShort";

Finally, note that everything from "wgCategories" to "wgMonthNamesShort" sits between <script> tags (I'm not sure whether that matters, but I was told it is worth mentioning).

Let me know if anything needs clarifying.

Best answer

You can use preg_match with the s flag (DOTALL) to grab the string between two keywords:

error_reporting(E_ALL);
$html = file_get_contents('http://en.wikipedia.org/wiki/Los_Angeles');
// s (DOTALL) lets . match newlines; the lazy .*? stops at the first "wgMonthNamesShort".
if (preg_match('/wgCategories.*?wgMonthNamesShort/is', $html, $matches)) {
    echo $matches[0];
}

You can also avoid regular expressions and do this with PHP string functions (stristr, for example).

The preg_match code above prints:

wgCategories":["All articles with dead external links","Articles with dead external links from March 2013","Articles with dead external links from March 2014","Pages with broken reference names","Articles with dead external links from January 2014","Articles with dead external links from September 2011","Articles with dead external links from October 2011","CS1 errors: dates","Use mdy dates from May 2014","Wikipedia indefinitely semi-protected pages","Wikipedia indefinitely move-protected pages","Coordinates on Wikidata","Articles including recorded pronunciations","Articles containing Spanish-language text","All articles with unsourced statements","Articles with unsourced statements from December 2013","Spoken articles","Articles with hAudio microformats","Los Angeles, California","Cities in Los Angeles County, California","Communities on U.S. Route 66","County seats in California","Incorporated cities and towns in California","Populated coastal places in California","Populated places established in 1781","Port cities and towns of the United States Pacific coast","Butterfield Overland Mail in California","Stockton - Los Angeles Road"],"wgBreakFrames":false,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonthNamesShort
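
The answer mentions string functions as an alternative but does not show one. The following is a minimal sketch of that idea, not code from the original answer. It uses stripos and substr rather than stristr, the helper name extractBetween is hypothetical, and the short $html string is a made-up stand-in for the real page source so the example runs without a network call. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper (not from the original answer): returns the part of
// $haystack from the first $start through the next $end, both inclusive,
// or null when either marker is missing. Matching is case-insensitive,
// mirroring the i flag in the regex version.
function extractBetween(string $haystack, string $start, string $end): ?string
{
    $from = stripos($haystack, $start);
    if ($from === false) {
        return null;
    }
    // Search for the end marker only after the start marker.
    $to = stripos($haystack, $end, $from + strlen($start));
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to + strlen($end) - $from);
}

// Made-up stand-in for the page source; it contains a newline to show that
// line breaks between the markers need no special handling here.
$html = "<script>var x = {\"wgCategories\":[\"Spoken articles\"],\n\"wgMonthNamesShort\":[\"\",\"Jan\"]};</script>";

$string = extractBetween($html, 'wgCategories', 'wgMonthNamesShort');
echo $string, PHP_EOL;
```

With the real page, $html would come from file_get_contents as in the question. Returning null when a marker is missing lets the caller distinguish "not found" from an empty match.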

Regarding "javascript - How do I get the string from this to that out of a web page's source?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25216541/
