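I have markup like the one below, and I'm trying to get only the h1's own text, which here is "The Wire". Instead I get all of the text inside the h1, including the text of the spans.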
<h1 id="aiv-content-title" class="js-hide-on-play">
The Wire
<span class="num-of-seasons">5 Seasons</span>
<span class="release-year">2002</span>
</h1>
The output I get is The Wire5 Seasons2002
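heading = elm.find('h1', id='aiv-content-title')
# .text returns the text of every descendant, so the span text is included
print(heading.text)
seasons = elm.find('span', {'class': 'num-of-seasons'})
# find() returns the None object (not the string 'None') when nothing matches
if seasons is None:
    print('1')
else:
    print(seasons.text)
release_year = elm.find('span', {'class': 'release-year'})
print(release_year.text)
print('')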
When I run this code, I get:
The Wire5 Seasons2002
5 Seasons
2002
I'm expecting something like this:
The Wire
5 Seasons
2002
Best answer
You can do the following:
h1_element = elm.find('h1', {'id': 'aiv-content-title'})
# Read the span values first, because the spans are removed below
num_seasons = h1_element.find('span', {'class': 'num-of-seasons'}).getText().strip()
release_year = h1_element.find('span', {'class': 'release-year'}).getText().strip()
# Remove every span from the h1 so that only its own text remains.
# extract() modifies the parsed tree in place.
while h1_element.find('span'):
    h1_element.find('span').extract()
print(h1_element.getText().strip())
print(num_seasons)
print(release_year)
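If you would rather not modify the parsed tree, a possible alternative is to collect only the text nodes that are direct children of the h1. The following is a minimal sketch, assuming BeautifulSoup 4 (4.4.0 or later, for the `string` argument) and the built-in `html.parser`. The names `HTML` and `own_text` are made up for this illustration and do not come from the question.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (the HTML constant is illustrative).
HTML = """
<h1 id="aiv-content-title" class="js-hide-on-play">
The Wire
<span class="num-of-seasons">5 Seasons</span>
<span class="release-year">2002</span>
</h1>
"""


def own_text(tag):
    """Hypothetical helper: return only the text directly inside `tag`."""
    # string=True keeps text nodes only; recursive=False restricts the
    # search to direct children, so text inside the spans is skipped.
    return "".join(tag.find_all(string=True, recursive=False)).strip()


soup = BeautifulSoup(HTML, "html.parser")
h1 = soup.find("h1", id="aiv-content-title")

title = own_text(h1)
seasons = h1.find("span", class_="num-of-seasons")
release_year = h1.find("span", class_="release-year")

print(title)
# Fall back to '1' when the span is missing, as the question's code does.
print(seasons.get_text(strip=True) if seasons is not None else "1")
print(release_year.get_text(strip=True))
```

Because nothing is extracted, the spans are still in the tree afterwards, so the order in which you read the title and the span values no longer matters.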
Regarding "python - getting only the h1 text without the span text in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/24301605/