python - How to copy information from a web page with Python

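I am trying to write a Python script for a thesaurus. I'm a student and will use it when writing essays and the like, to save time when swapping out words. So far I have been able to open thesaurus.com with the search term I want, but I can't figure out how to copy the first 5 returned words, put them in a list, and print it.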

I have already checked YouTube and Google. I also tried searching Stack Overflow, but nothing helped, so I'm asking here. This is my code:

import webbrowser as wb
import antigravity

word = str(input()).lower()
returned_words_list = []
url = 'https://www.thesaurus.com/browse/{}'.format(word)

wb.open(url, new=2)

At this point I just want to print returned_words_list to the console. So far I can't even get it to fetch the words from the site automatically.

Best answer

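Looking at the network traffic, the page makes a request to a different URL, which returns the results. You can use that endpoint, together with a couple of headers, to get all the results as JSON. Then, following this answer by @Martijn Pieters (+1 to him), if you use a generator you can limit the iteration with islice from itertools. You could, of course, also slice the full list produced by a list comprehension. The results are returned in descending order of similarity, which is particularly useful because taking the first few gives you the words with the highest similarity scores.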
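To make the difference between the two approaches concrete, here is a small offline sketch. The `sample` payload is invented for illustration: its shape mirrors what the code in this answer reads (`data[0]['synonyms'][i]['term']`), and the `similarity` field is an assumption about the response, not something confirmed here.

```python
from itertools import islice

# Hypothetical payload, invented for illustration (not real API output).
# Entries are listed in descending order of the assumed "similarity" field.
sample = {
    "data": [
        {
            "synonyms": [
                {"term": "glad", "similarity": "100"},
                {"term": "cheerful", "similarity": "100"},
                {"term": "joyful", "similarity": "100"},
                {"term": "content", "similarity": "50"},
                {"term": "merry", "similarity": "50"},
                {"term": "upbeat", "similarity": "10"},
                {"term": "pleased", "similarity": "10"},
            ]
        }
    ]
}

entries = sample["data"][0]["synonyms"]

# Generator + islice: iteration stops after 5 items; the rest are never read.
first_five_lazy = list(islice((e["term"] for e in entries), 5))

# List comprehension + slice: builds the full list, then keeps the first 5.
first_five_eager = [e["term"] for e in entries][:5]

print(first_five_lazy)
print(first_five_eager)
```

Both lines print the same five terms. For a response this small the choice is mostly a matter of taste; `islice` only pays off when the iterable is large or expensive to produce.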


Generator

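import requests
from itertools import islice

# The Referer and User-Agent headers make the request look like it comes from a browser.
headers = {'Referer': 'https://www.thesaurus.com/browse/word', 'User-Agent': 'Mozilla/5.0'}
word = input().lower()
# Call the endpoint the page itself uses and decode the JSON response.
r = requests.get('https://tuna.thesaurus.com/relatedWords/{}?limit=6'.format(word), headers=headers).json()

if r['data']:
    # Lazily take the first 5 terms from the generator.
    synonyms = list(islice((i['term'] for i in r['data'][0]['synonyms']), 5))
    print(synonyms)
else:
    print('No synonyms found')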
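One detail worth noting: the snippet above puts the user's input straight into the URL path, so a phrase containing a space or a slash would produce a malformed request. A minimal sketch of one way to handle that follows. `build_url` is a hypothetical helper name, and the endpoint is simply the one used in this answer, which may have changed since it was written.

```python
from urllib.parse import quote


def build_url(word, limit=6):
    """Return the endpoint URL with the search term percent-encoded.

    The endpoint is the one quoted in this answer; treat it as an assumption.
    """
    # safe="" also encodes "/" so the term cannot add extra path segments.
    encoded = quote(word.strip().lower(), safe="")
    return "https://tuna.thesaurus.com/relatedWords/{}?limit={}".format(encoded, limit)


print(build_url("Ice Cream"))
```

The returned string can be passed to `requests.get` in place of the formatted URL; adding `timeout=10` to that call is also worth considering so the script does not hang if the server never responds.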


List comprehension

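import requests

# The Referer and User-Agent headers make the request look like it comes from a browser.
headers = {'Referer': 'https://www.thesaurus.com/browse/word', 'User-Agent': 'Mozilla/5.0'}
word = input().lower()
# Call the endpoint the page itself uses and decode the JSON response.
r = requests.get('https://tuna.thesaurus.com/relatedWords/{}?limit=6'.format(word), headers=headers).json()
if r['data']:
    # Build the full list of terms, then keep the first 5.
    synonyms = [i['term'] for i in r['data'][0]['synonyms']][:5]
    print(synonyms)
else:
    print('No synonyms found')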
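If you want to reuse the parsing step, it can be pulled into a function that takes the decoded JSON and returns a plain list. This is only a sketch: `top_synonyms` is a hypothetical name, and the payload shape is assumed to match what the code above reads.

```python
def top_synonyms(payload, n=5):
    """Return up to n synonym terms from a decoded response, or [] if there are none."""
    # "data" may be missing, None, or an empty list when nothing is found.
    data = payload.get("data") or []
    if not data:
        return []
    # Only the first entry is used, matching the code above.
    entries = data[0].get("synonyms") or []
    return [entry["term"] for entry in entries[:n]]


# Usage with a made-up payload (not real API output):
example = {"data": [{"synonyms": [{"term": "glad"}, {"term": "cheerful"}]}]}
print(top_synonyms(example))
print(top_synonyms({"data": None}))
```

Returning an empty list instead of printing a message lets the caller decide what to do when no synonyms come back, for example assigning the result to `returned_words_list` from the question.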

Regarding "python - How to copy information from a web page with Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57238440/
