python - BeautifulSoup returns an empty list when scraping a YouTube channel

Tags: python web-scraping beautifulsoup youtube

I am trying to use this code to get some public information about a YouTube channel (the API is not a good fit for this task).

Code example:

import re
import json
import requests
from bs4 import BeautifulSoup

URL = "https://www.youtube.com/c/Rozziofficial/about"
soup = BeautifulSoup(requests.get(URL).content, "html.parser")

# We locate the JSON data using a regular-expression pattern
data = re.search(r"var ytInitialData = ({.*});", str(soup)).group(1)

# Uncomment to view all the data
# print(json.dumps(data))

# This converts the JSON data to a python dictionary (dict)
json_data = json.loads(data)

# This is the info from the webpage on the right-side under "stats", it contains the data you want
stats = json_data["contents"]["twoColumnBrowseResultsRenderer"]["tabs"][5]["tabRenderer"]["content"]["sectionListRenderer"]["contents"][0]["itemSectionRenderer"]["contents"][0]["channelAboutFullMetadataRenderer"]

print("Channel Views:", stats["viewCountText"]["simpleText"])
print("Joined:", stats["joinedDateText"]["runs"][1]["text"])

Expected result (this worked fine 6 months ago):

Joined: Jun 30, 2007

...but now I get:

AttributeError: 'NoneType' object has no attribute 'group'

The traceback shows that the error occurs on this line:

data = re.search(r"var ytInitialData = ({.*});", str(soup)).group(1)

Can you help me fix this code so that it works again and returns the data?

Any help is appreciated, thanks.

Best answer

Your code works fine for me:

import re
import json
import requests
from bs4 import BeautifulSoup

URL = "https://www.youtube.com/c/Rozziofficial/about"
soup = BeautifulSoup(requests.get(URL).content, "html.parser")

# We locate the JSON data using a regular-expression pattern
data = re.search(r"var ytInitialData = ({.*});", str(soup)).group(1)

# Uncomment to view all the data
# print(json.dumps(data))

# This converts the JSON data to a python dictionary (dict)
json_data = json.loads(data)

# This is the info from the webpage on the right-side under "stats", it contains the data you want
stats = json_data["contents"]["twoColumnBrowseResultsRenderer"]["tabs"][5]["tabRenderer"]["content"]["sectionListRenderer"]["contents"][0]["itemSectionRenderer"]["contents"][0]["channelAboutFullMetadataRenderer"]

print("Channel Views:", stats["viewCountText"]["simpleText"])
print("Joined:", stats["joinedDateText"]["runs"][1]["text"])

Output:

Channel Views: 1,12,94,125টি ভিউ
Joined: 30 জুন, 2007
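
The AttributeError in the question is raised because re.search returns None when the pattern is not found in the page, and None has no .group method. Why the pattern would be missing for one client but not another is not established here; a different page being served (for example a consent or error page) is only an assumption. A minimal sketch that makes this case explicit instead of crashing could look like the following. The helper name extract_initial_data and the sample HTML strings are hypothetical, and no network request is made:

```python
import json
import re
from typing import Optional

# Same pattern as in the question, compiled once.
PATTERN = re.compile(r"var ytInitialData = ({.*});")


def extract_initial_data(html: str) -> Optional[dict]:
    """Return the parsed ytInitialData object, or None if the page lacks it."""
    match = PATTERN.search(html)
    if match is None:
        # re.search found nothing; calling .group(1) here would raise
        # the AttributeError shown in the question.
        return None
    return json.loads(match.group(1))


# Hypothetical sample pages, used instead of a live request.
sample_ok = '<script>var ytInitialData = {"contents": {"a": 1}};</script>'
sample_missing = "<html><body>Some other page</body></html>"

print(extract_initial_data(sample_ok))
print(extract_initial_data(sample_missing))
```

With real data, the html argument would be str(soup) or requests.get(URL).text. When the function returns None, printing the first few hundred characters of the response is a quick way to see which page was actually returned.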

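The stats lookup in the code above depends on a fixed position, ["tabs"][5], which would break if the order or number of tabs changed. As a sketch only, a recursive search by key name avoids hard-coding the position. The helper find_key and the sample dictionary are hypothetical, and whether the live page still contains "channelAboutFullMetadataRenderer" is an assumption that would have to be checked against the real data:

```python
from typing import Any, Optional


def find_key(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        # Scalars (str, int, None, ...) cannot contain the key.
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Hypothetical, heavily reduced stand-in for the parsed JSON.
sample = {
    "contents": {
        "tabs": [
            {"tabRenderer": {}},
            {
                "tabRenderer": {
                    "content": {
                        "channelAboutFullMetadataRenderer": {
                            "viewCountText": {"simpleText": "123 views"}
                        }
                    }
                }
            },
        ]
    }
}

stats = find_key(sample, "channelAboutFullMetadataRenderer")
print(stats["viewCountText"]["simpleText"])
```

The trade-off is that a name-based search returns the first match anywhere in the document, so it is only appropriate when the key is expected to occur once.
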
This Q&A is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/71593817/
