python - HTML to text, like Python's BeautifulSoup

I have a Python program that produces the following output:

from bs4 import BeautifulSoup

html = "<h1>This is heading</h1> <p>this is parah <strong>strong</strong> that's how it works</p>"

parsed_html = BeautifulSoup(html, 'html.parser')
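# find_all(string=True) returns every text node; it is the current spelling of findAll(text=True)
all_lines = parsed_html.find_all(string=True)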
print(all_lines)

# ['This is heading', ' ', 'this is parah ', 'strong', " that's how it works"]

I am trying to achieve the same thing in Golang, but I cannot get the desired output. What I have done so far:

import (
    "fmt"
    "strings"
    "github.com/PuerkitoBio/goquery"
)

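func parseHTML(body string) string {
    p := strings.NewReader(body)
    doc, err := goquery.NewDocumentFromReader(p)
    if err != nil {
        // Parsing failed; return an empty result instead of ignoring the error.
        return ""
    }

    // Text() concatenates all text nodes into a single string,
    // so the boundaries between the individual nodes are lost.
    text := doc.Text()
    fmt.Println(text)
    // output: This is heading this is parah strong thats how it works

    return text
}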

Best answer

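This looks simple if you are able to implement the function yourself.
Just remove every markup tag ("…") and keep appending the text found between the tags ("…").
That gives exactly the same output as the Python version.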

This post is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/56475788/