I am trying to parse www.neiu.edu/~neiutemp/PhoneBook/alpha.htm with the TFHpple parser. I want the first TD (first column) of every TR (row) in the table. All of the TDs have identical attributes, so I cannot tell them apart that way.
I can fetch all of the HTML, but I cannot get the first TD from each TR. After // 3
(in the code), tutorialsNodes is empty. The output of
NSLog(@"Nodes are : %@",[tutorialsNodes description]);
is
Practice1[62351:c07] Nodes are : ().
I can't see what's wrong. Any help would be appreciated. My code to parse this URL:
NSURL *tutorialsUrl = [NSURL URLWithString:@"http://www.neiu.edu/~neiutemp/PhoneBook/alpha.htm"];
NSData *tutorialsHtmlData = [NSData dataWithContentsOfURL:tutorialsUrl];
// 2
TFHpple *tutorialsParser = [TFHpple hppleWithHTMLData:tutorialsHtmlData];
// 3
NSString *tutorialsXpathQueryString = @"//TR/TD";
NSArray *tutorialsNodes = [tutorialsParser searchWithXPathQuery:tutorialsXpathQueryString];
NSLog(@"Nodes are : %@",[tutorialsNodes description]);
// 4
NSMutableArray *newTutorials = [[NSMutableArray alloc] initWithCapacity:0];
for (TFHppleElement *element in tutorialsNodes) {
    // 5
    Tutorial *tutorial = [[Tutorial alloc] init];
    [newTutorials addObject:tutorial];
    // 6
    tutorial.title = [[element firstChild] content];
    // 7
    tutorial.url = [element objectForKey:@"href"];
    NSLog(@"title is: %@",[tutorial.title description]);
}
// 8
_objects = newTutorials;
[self.tableView reloadData];
Best answer
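If you use @"//tr/td" instead of @"//TR/TD", this should work. XPath matching is case-sensitive, and the libxml2 HTML parser that Hpple wraps reports element names in lowercase, so the uppercase query matches nothing.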
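Looking at your HTML, though, its author apparently never learned how to spell CSS, so you have font tags buried throughout the source. That matters for your next line of code, evidently taken from the excellent Hpple tutorial by Matt Galloway on Ray Wenderlich's site, which says: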
tutorial.title = [[element firstChild] content];
That won't work here, because for most of your entries the firstChild is not the text but a font tag. So you can check whether it is a font tag, like so:
TFHppleElement *subelement = [element firstChild];
if ([[subelement tagName] isEqualToString:@"font"])
    subelement = [subelement firstChild];
tutorial.title = [subelement content];
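Alternatively, you can just search for @"//tr/td/font" instead of @"//tr/td". There are many ways to approach this. The trick (as with all HTML parsing) is to make it reasonably robust, so that you aren't tripped up by minor cosmetic tweaks to the page.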
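Clearly, your HTML contains no URLs, so the code that reads href does not apply here.
Anyway, I hope this is enough to get you going.
You reported having problems, so I thought I should provide a more complete code sample: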
NSURL *tutorialsUrl = [NSURL URLWithString:@"http://www.neiu.edu/~neiutemp/PhoneBook/alpha.htm"];
NSData *tutorialsHtmlData = [NSData dataWithContentsOfURL:tutorialsUrl];
TFHpple *tutorialsParser = [TFHpple hppleWithHTMLData:tutorialsHtmlData];
NSString *tutorialsXpathQueryString = @"//tr/td";
NSArray *tutorialsNodes = [tutorialsParser searchWithXPathQuery:tutorialsXpathQueryString];
if ([tutorialsNodes count] == 0)
    NSLog(@"nothing there");
else
    // count returns NSUInteger, so cast it for a portable format specifier
    NSLog(@"There are %lu nodes", (unsigned long)[tutorialsNodes count]);
NSMutableArray *newTutorials = [[NSMutableArray alloc] initWithCapacity:0];
for (TFHppleElement *element in tutorialsNodes) {
    Tutorial *tutorial = [[Tutorial alloc] init];
    [newTutorials addObject:tutorial];
    // Most cells wrap their text in a font tag, so step down one level if needed
    TFHppleElement *subelement = [element firstChild];
    if ([[subelement tagName] isEqualToString:@"font"])
        subelement = [subelement firstChild];
    tutorial.title = [subelement content];
    NSLog(@"title is: %@", [tutorial.title description]);
}
This produces the following output:
2013-05-10 19:39:42.027 hpple-test[33881:c07] There are 10773 nodes
2013-05-10 19:39:42.028 hpple-test[33881:c07] title is: A
2013-05-10 19:39:46.027 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:46.698 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:47.333 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:47.827 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:48.358 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:49.133 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:49.775 hpple-test[33881:c07] title is: Abay, Hiwet B
2013-05-10 19:39:50.326 hpple-test[33881:c07] title is: H-Abay
2013-05-10 19:39:50.992 hpple-test[33881:c07] title is: 773-442-5140
2013-05-10 19:39:51.597 hpple-test[33881:c07] title is: (null)
2013-05-10 19:39:52.092 hpple-test[33881:c07] title is: Controller
2013-05-10 19:39:52.598 hpple-test[33881:c07] title is: E
2013-05-10 19:39:53.149 hpple-test[33881:c07] title is: 223
2013-05-10 19:39:55.040 hpple-test[33881:c07] title is: Abbruscato, Terence
2013-05-10 19:39:55.806 hpple-test[33881:c07] title is: T-Abbruscato
2013-05-10 19:39:56.525 hpple-test[33881:c07] title is: 773-442-5339
...
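The @"//tr/td" query returns every cell in the table, whereas the question asks only for the first cell of each row. One way to narrow it, sketched here and not tested against this page, is an XPath positional predicate: @"//tr/td[1]" selects only the td that is the first td child of its tr. The helper name FirstColumnTitles is hypothetical; the Hpple calls are the same ones used in the code above, and the font-tag check is carried over unchanged. Cells with no text are skipped rather than stored as nil.

```objective-c
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper (not part of Hpple): collects the text of the
// first <td> in every <tr> of the given HTML data.
static NSArray *FirstColumnTitles(NSData *htmlData) {
    TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];

    // td[1] keeps only the first td child of each tr.
    NSArray *cells = [parser searchWithXPathQuery:@"//tr/td[1]"];

    NSMutableArray *titles = [NSMutableArray array];
    for (TFHppleElement *cell in cells) {
        // Same check as above: step into a wrapping font tag if present.
        TFHppleElement *subelement = [cell firstChild];
        if ([[subelement tagName] isEqualToString:@"font"]) {
            subelement = [subelement firstChild];
        }

        // Skip cells that have no text instead of adding nil.
        NSString *title = [subelement content];
        if (title != nil) {
            [titles addObject:title];
        }
    }
    return titles;
}
```

Calling FirstColumnTitles(tutorialsHtmlData) would then return an array of strings that can be mapped to Tutorial objects as before. Whether the first column actually holds the value you want depends on the page layout, so check a few rows by hand.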
This question and answer, "ios - HTML table parsing in Xcode", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/16047090/