html - Objective-C: using NSScanner to get the <body> from HTML

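I'm trying to build an iOS app that extracts parts of a web page.

I already have code that connects to the URL and stores the HTML in an NSString.

I tried the following, but the result is always an empty string: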

    NSScanner* newScanner = [NSScanner scannerWithString:htmlData];
    // Create a new scanner and give it the html data to parse.

    while (![newScanner isAtEnd])
    {
        [newScanner scanUpToString:@"<body>" intoString:NULL];
        // Scan until the <body> tag is found

        [newScanner scanUpToString:@"</body>" intoString:&bodyText];
        // Everything up to the end tag will get placed into the memory address of the result string

    }

I also tried another approach:

    NSScanner* newScanner = [NSScanner scannerWithString:htmlData];
    // Create a new scanner and give it the html data to parse.

    while (![newScanner isAtEnd])
    {
        [newScanner scanUpToString:@"<body" intoString:NULL];
        // Scan until the start of the opening <body tag is found

        [newScanner scanUpToString:@">" intoString:NULL];
        // Go to end of opening <body> tag

        [newScanner scanUpToString:@"</body>" intoString:&bodyText];
        // Everything up to the end tag will get placed into the memory address of the result string

    }

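The second approach returns a string that starts with `><script...` and so on.

To be honest, I don't have a good URL to test this against, and I think some help with stripping the tags inside the body (such as <p></p>) might make things easier.

Any help is much appreciated.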

Best answer

I don't know why your first approach doesn't work. I'm assuming you declared bodyText before that snippet. This code works fine for me:

    - (void)viewDidLoad {
        [super viewDidLoad];
        NSString *htmlData = @"This is some stuff before <body> this is the body </body> with some more stuff";
        NSScanner* newScanner = [NSScanner scannerWithString:htmlData];
        NSString *bodyText;
        while (![newScanner isAtEnd]) {
            [newScanner scanUpToString:@"<body>" intoString:NULL];
            [newScanner scanString:@"<body>" intoString:NULL];
            [newScanner scanUpToString:@"</body>" intoString:&bodyText];
        }
        NSLog(@"%@",bodyText); // 2015-01-28 15:58:00.360 ScanningOfHTMLProblem[1373:661934] this is the body 
    }

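Note that I added a call to scanString:intoString: to move the scanner past the first "<body>". scanUpToString:intoString: stops just before the string it searches for, so without that extra call the next scan starts at the tag itself.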
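
The same idea can be applied to the second approach in the question, where the opening tag may carry attributes (for example `<body class="...">`). There, the result began with `>` because the scanner stopped just before the `>` and never consumed it. The following is only a minimal sketch under that assumption, not a full HTML parser: `BodyContentFromHTML` is a hypothetical helper name, and it takes a simple match on `<body` as the start of the body tag, so it would also be fooled by that text appearing inside a comment or script.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): returns the text
// between the opening <body ...> tag and </body>, or nil if either is missing.
static NSString *BodyContentFromHTML(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    // Do not skip whitespace, so the body text is returned exactly as written.
    scanner.charactersToBeSkipped = nil;

    NSString *body = nil;

    // Skip everything before the opening tag, then consume "<body" itself.
    [scanner scanUpToString:@"<body" intoString:NULL];
    if (![scanner scanString:@"<body" intoString:NULL]) {
        return nil; // no opening tag found
    }

    // Skip any attributes, then consume the ">" that ends the opening tag.
    [scanner scanUpToString:@">" intoString:NULL];
    if (![scanner scanString:@">" intoString:NULL]) {
        return nil; // opening tag was never closed
    }

    // Capture everything up to (but not including) the closing tag.
    [scanner scanUpToString:@"</body>" intoString:&body];
    if ([scanner isAtEnd]) {
        return nil; // ran off the end without finding </body>
    }

    // An empty body leaves `body` untouched, so fall back to an empty string.
    return body ?: @"";
}
```

Calling `BodyContentFromHTML(htmlData)` once replaces the `while` loop; for real-world pages with malformed markup, a proper HTML parser is likely a safer choice than NSScanner.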

Regarding "html - Objective-C: using NSScanner to get the <body> from HTML", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28204380/
