I have a Python program that produces the following output:
from bs4 import BeautifulSoup
html = "<h1>This is heading</h1> <p>this is parah <strong>strong</strong> that's how it works</p>"
parsed_html = BeautifulSoup(html, 'html.parser')
all_lines = parsed_html.findAll(text=True)
print(all_lines)
# ['This is heading', ' ', 'this is parah ', 'strong', " that's how it works"]
I am trying to achieve the same thing in Golang, but I can't get the desired output. This is what I have done so far:
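import (
	"fmt"
	"strings"

	"github.com/PuerkitoBio/goquery"
)

// parseHTML prints and returns all text in body as one concatenated string.
func parseHTML(body string) string {
	p := strings.NewReader(body)
	doc, err := goquery.NewDocumentFromReader(p)
	if err != nil {
		return ""
	}
	text := doc.Text()
	fmt.Println(text)
	// output: This is heading this is parah strong that's how it works
	return text
}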
Best answer
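If you are able to implement a function yourself, this looks quite simple.
Just strip out all the “…” tags and keep appending the “…” as you go.
That gives exactly the same output as the Python version.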
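One way to read that suggestion is the rough sketch below: walk the input once, skip everything between '<' and '>', and append each run of text found between tags to a slice. The helper name textNodes is hypothetical and is not part of goquery or any other library. The sketch uses only the standard library and assumes simple, well-formed markup; it is a naive scanner rather than a real HTML parser, so comments, script or style bodies, and a '>' inside an attribute value are not handled.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// textNodes is a hypothetical helper: it drops everything between '<' and '>'
// and collects each run of text between tags as a separate element.
// Assumption: the input is simple markup without comments, scripts,
// or '>' characters inside attribute values.
func textNodes(body string) []string {
	var nodes []string
	var cur strings.Builder
	inTag := false

	// flush appends the text collected so far (if any) as one node,
	// decoding entities such as &amp; along the way.
	flush := func() {
		if cur.Len() > 0 {
			nodes = append(nodes, html.UnescapeString(cur.String()))
			cur.Reset()
		}
	}

	for _, r := range body {
		switch {
		case r == '<' && !inTag:
			// A tag starts: whatever text came before it is one node.
			flush()
			inTag = true
		case r == '>' && inTag:
			inTag = false
		case !inTag:
			cur.WriteRune(r)
		}
	}
	flush() // trailing text after the last tag, if any
	return nodes
}

func main() {
	body := `<h1>This is heading</h1> <p>this is parah <strong>strong</strong> that's how it works</p>`
	fmt.Printf("%q\n", textNodes(body))
}
```

For the HTML from the question this yields the same five strings as the Python list, including the single-space node between the heading and the paragraph. For arbitrary real-world pages a proper HTML parser would be a safer choice than this scanner.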
Regarding "python - HTML to text, like Python's BeautifulSoup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56475788/