I have the following code:
f = open(path, 'r')
html = f.read() # no parameters => reads to eof and returns string
soup = BeautifulSoup(html)
schoolname = soup.findAll(attrs={'id':'ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel'})
print schoolname
which prints:
[<span id="ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel">A B Paterson College, Arundel, QLD</span>]
When I try to access the value (i.e. "A B Paterson College, Arundel, QLD") using schoolname['value'], I get the following error:
print schoolname['value']
TypeError: list indices must be integers, not str
What am I doing wrong in trying to get this value?
Best answer
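findAll returns a list of matching tags, which is why indexing it with a string raises a TypeError. You can use contents on each tag to move down the tree: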
>>> for x in schoolname:
...     print x.contents
...
[u'A B Paterson College, Arundel, QLD']
Note that the items in contents are not necessarily strings; in general they can also be further tags, or a mix of strings and tags.
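As a rough sketch of the points above, the snippet below shows indexing the result list with an integer, reading the text of a single tag, and what contents looks like for mixed content. It assumes Python 3 and the newer bs4 package (where findAll is spelled find_all), not the older setup the question appears to use. The inline HTML string stands in for the file read from path, and the <p> markup is a made-up example.

```python
from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package is installed

SPAN_ID = "ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel"
# Stand-in for the HTML that the question reads from a file
html = '<span id="%s">A B Paterson College, Arundel, QLD</span>' % SPAN_ID

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like result, so it is indexed with an integer
matches = soup.find_all(attrs={"id": SPAN_ID})
first = matches[0]

# find returns the first matching tag directly (or None if nothing matches)
tag = soup.find(id=SPAN_ID)

name_from_string = tag.string    # the tag's single text child
name_from_text = tag.get_text()  # all text under the tag, concatenated

# Hypothetical mixed markup: contents holds both strings and tags
mixed = BeautifulSoup("<p>Hello <b>bold</b> world</p>", "html.parser").p
mixed_text = mixed.get_text()

print(name_from_string)
print(mixed.contents)
print(mixed_text)
```

When a tag has exactly one text child, .string is usually the most direct way to get it; get_text() is the safer choice when the tag may contain nested tags, since .string is None in that case.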
Regarding python - extracting a value in BeautifulSoup, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2616659/