python - How to use BeautifulSoup4 to get frequently updated .php text from a website in Python?

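I want to create an automated script that downloads a .php text file from a frequently updated web page. My program uses requests to fetch the page.
Code: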

import os, pathlib, subprocess,requests, time, sys



url = 'http://metar.vatsim.net/metar.php?id=all'

current_dir = pathlib.Path(__file__).parent
os.chdir(current_dir)




icao = sys.argv[1]
fp = requests.get(url)
mybytes = fp.read()

mystr = mybytes.decode("utf8")
fp.close()

dict = {}

fls = str.splitlines(mystr)
for x in range(len(fls)):
    cur = str.split(fls[x])
    dict[cur[0]] = " ".join(cur)
    
try:
    print(dict[icao])
except:
    print('INCORRECT FORMAT OR AIRPORT ID\n')

When I try to read fp, it shows this error:

mybytes = fp.read()
AttributeError: 'Response' object has no attribute 'read'

Is there a better way to solve this? I'm a bit stuck.

Best answer

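What you are looking for is urllib.request, not requests. A requests Response object has no read() method, whereas the object returned by urllib.request.urlopen() does.
Maybe this will work: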

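import urllib.request

url = 'http://metar.vatsim.net/metar.php?id=all'

# urlopen() returns a file-like response object that supports read()
fp = urllib.request.urlopen(url)
mybytes = fp.read()

mystr = mybytes.decode("utf8")
fp.close()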

This will read the text at http://metar.vatsim.net/metar.php?id=all.
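
If you would rather keep requests, the following is a minimal sketch rather than a definitive fix. It assumes the page returns plain text with one report per line and the airport ID as the first token, which is what the question's parsing loop implies. The names parse_metars and fetch_metars are made up for illustration; Response.text and Response.raise_for_status() are the requests attributes used here in place of read().

```python
def parse_metars(text):
    """Map the first token of each line (the airport ID) to the full line."""
    reports = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines, which would otherwise raise IndexError
            reports[fields[0]] = " ".join(fields)
    return reports


def fetch_metars(url, timeout=10):
    """Download the page with requests and parse it (hypothetical helper)."""
    import requests  # third-party; imported here so parse_metars works without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on HTTP 4xx/5xx
    # .text is the decoded body; a Response has no .read() method
    return parse_metars(response.text)


# Example use (needs network access):
# reports = fetch_metars("http://metar.vatsim.net/metar.php?id=all")
# print(reports.get(sys.argv[1], "INCORRECT FORMAT OR AIRPORT ID"))
```

Using dict.get() with a default replaces the bare try/except from the question, and the result is stored in a name other than dict so the built-in type is not shadowed.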

A similar question about "python - How to use BeautifulSoup4 to get frequently updated .php text from a website in Python?" can be found on Stack Overflow: https://stackoverflow.com/questions/67326926/
