HTML - Find all child tags inside a given tag

Suppose I have an HTML page that contains something like this:

<ul class ="good">
    <li>1</li>
    <li>2</li>
    <li>3</li>
</ul>

<ul class ="bad">
    <li>a</li>
    <li>b</li>
    <li>c</li>
</ul>

I want to capture the <li> elements inside the first <ul>. Starting from here, I basically copied the following (note: code edited per @twotwotwo's comment):

page, _ := html.Parse(httpBody)
var f func(*html.Node)
f = func(n *html.Node) {
    //fmt.Println("Inside f")
    if n.Type == html.ElementNode && n.Data == "ul" {
        fmt.Println("ul found ->  ", n)
        for c := n.FirstChild; c != nil; c = c.NextSibling {
            f(c)
        }
    } else {
        fmt.Println(n.Data, "is not the correct one")
        for c := n.FirstChild; c != nil; c = c.NextSibling {
            f(c)
        }
    }
}
f(page)

But the only output I get is:

 is not the correct one
html is not the correct one
head is not the correct one
body is not the correct one

I would like to know why the recursion stops at body. I tried it on motherfuckingwebsite.com, whose body does contain tags.

P.S. I also tried:

page := html.NewTokenizer(httpBody)

for {
    tokenType := page.Next()
    if tokenType == html.ErrorToken {
        return links
    }
    token := page.Token()
    // ... inspect token here ...
}

But this seems to show every token, without any regard for the tree structure.

Best answer

I have used this package in the past: https://github.com/PuerkitoBio/goquery

It provides a "jQuery-like" interface for querying an HTML document. With that library, it is as simple as:

package main

import (
    "bytes"
    "fmt"
    "log"

    "github.com/PuerkitoBio/goquery"
)

var httpBody string = `
    <ul class ="good">
        <li>1</li>
        <li>2</li>
        <li>3</li>
    </ul>

    <ul class ="bad">
        <li>a</li>
        <li>b</li>
        <li>c</li>
    </ul>
`

func main() {
    b := bytes.NewBufferString(httpBody)
    doc, err := goquery.NewDocumentFromReader(b)
    if err != nil {
        log.Fatal(err)
    }

    doc.Find("ul.good").Each(func(i int, ul *goquery.Selection) {
        ul.Find("li").Each(func(i int, li *goquery.Selection) {
            fmt.Println(li.Text())
        })
    })
}

This prints:

1
2
3

Regarding "HTML - Find all child tags inside a given tag", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26133381/
