python - How to find a value using BeautifulSoup

Tags: python html beautifulsoup

I need to get the person's name (here, Alex Key) from markup like this:

<div class="link_container">
<a class="follow_card" data-uuid="e47443373cfa93d5341ab809f0700b82" 
data-type="person" data-name="Alex Key" data-permalink="/person/alex-acree-2" 
data-image="" data-follower-count="0" href="/person/alex-key-2">Alex Key</a></div>

I tried this code:

from django.shortcuts import render
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from django.http import HttpResponse
import os

def impTxt(request):
    abs_path = os.path.dirname(__file__)  # i.e. /path/to/dir/
    root_dir = os.path.split(abs_path)[0]  # i.e. /path/to/root_of_project/
    imp_file_path = "files/links.txt"
    abs_imp_file_path = os.path.join(root_dir, imp_file_path)  # abs_path to file

    with open(abs_imp_file_path, 'r') as inputfile:
        imp_txt = []
        # print(imp_txt)
        for line in inputfile:
            imp_txt.append(str(line).strip('[]'))
            print(line)
            # print(imp_txt)
        for link in imp_txt:
            # print(link)
            driver = webdriver.Chrome('/Volumes/Storage/downloads_storage/chromedriver')
            driver.get(link)
            driver.set_window_position(0, 0)
            driver.set_window_size(100000, 200000)
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(1)
            soup = BeautifulSoup(driver.page_source, "lxml")
            text = soup.find('a', {'class': 'follow_card'}).getText()
            print(text)
            # content = {
            # 'text':text,
            # }
    # Return once, after every link has been processed
    return render(request, "web/parser.html", {})

But I get nothing back. Please point out a way to find the value inside the tag.

UPDATED: added full code

Best answer

The getText() method can do this for you:

text = soup.find('a',{'class':'follow_card'}).getText()

Here the class name is follow_card.
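As a minimal sketch of the same lookup, the snippet below parses the markup from the question directly instead of a live page. The helper name extract_name and the choice of the built-in "html.parser" are assumptions made for illustration; they are not part of the original answer. Since the name also appears in the data-name attribute, the sketch reads that first and falls back to the link text. It also checks for None, because find() returns None when no matching tag exists, and calling getText() on None raises an AttributeError.

```python
from bs4 import BeautifulSoup

# Markup copied from the question.
HTML = """
<div class="link_container">
<a class="follow_card" data-uuid="e47443373cfa93d5341ab809f0700b82"
data-type="person" data-name="Alex Key" data-permalink="/person/alex-acree-2"
data-image="" data-follower-count="0" href="/person/alex-key-2">Alex Key</a></div>
"""


def extract_name(html):
    """Hypothetical helper: return the name from the first a.follow_card, or None."""
    # "html.parser" ships with Python; "lxml" (used in the question) also works if installed.
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", {"class": "follow_card"})
    if link is None:
        # No matching tag in this document.
        return None
    # Prefer the data-name attribute; fall back to the visible link text.
    return link.get("data-name") or link.get_text(strip=True)


name = extract_name(HTML)
print(name)
```

If this works on the static markup but not on driver.page_source, the element was probably not in the page yet when the source was captured, which would point at page loading rather than at BeautifulSoup.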

Regarding "python - How to find a value using BeautifulSoup", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36792179/
