python - Running URL requests in parallel with Flask

Tags: python google-cloud-functions python-asyncio aiohttp

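I'm still relatively new to asyncio.

I'm starting with the basics, a simple HTTP hello world: make roughly 40 parallel GET requests and return the first 400 characters of each HTTP response through Flask (the `parallel` function is invoked by the request).

It runs on Python 3.7.

The traceback shows an error I don't understand. Which "Constructor parameter should be str" does it refer to? How should I proceed?

Here is the complete code of the application: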

import aiohttp
import asyncio
import json

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    global urls
    tasks = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        htmls = await asyncio.gather(*tasks)
        returnstring = ""
        for html in htmls:
            returnstring += html + ","
            print(html[:400])
        return returnstring


def parallel(request):
    global urls
    urls = []
    request_json = request.get_json()
    if request_json and 'urls' in request_json:
        urls = request_json['urls']
        print(urls)

    loop = asyncio.get_event_loop()
    return loop.run_until_complete(main())

The traceback shows the error:

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 346, in run_http_function
    result = _function_handler.invoke_user_function(flask.request)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 217, in invoke_user_function
    return call_user_function(request_or_event)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 210, in call_user_function
    return self._user_function(request_or_event)
  File "/user_code/main.py", line 57, in parallel
    return loop.run_until_complete(main())
  File "/opt/python3.7/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "/user_code/main.py", line 15, in main
    htmls = await asyncio.gather(*tasks)
  File "/user_code/main.py", line 6, in fetch
    async with session.get(url) as response:
  File "/env/local/lib/python3.7/site-packages/aiohttp/client.py", line 1012, in __aenter__
    self._resp = await self._coro
  File "/env/local/lib/python3.7/site-packages/aiohttp/client.py", line 380, in _request
    url = URL(str_or_url)
  File "/env/local/lib/python3.7/site-packages/yarl/__init__.py", line 149, in __new__
    raise TypeError("Constructor parameter should be str")
TypeError: Constructor parameter should be str

Best answer

I tested it: if I pass something other than a string (e.g. a tuple or list), as in

session.get( (url, something) )

then I get your error. So your urls contain wrong data.
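
One way to confirm this before any request is made is to check the list for entries that are not strings. The sketch below is only an illustration: `find_non_string_urls` and the `sample` payload are hypothetical names and data, not part of the original code.

```python
def find_non_string_urls(urls):
    """Return (index, type name, value) for every entry that is not a str."""
    return [
        (i, type(u).__name__, u)
        for i, u in enumerate(urls)
        if not isinstance(u, str)
    ]


# Hypothetical payload: the second entry is a list instead of a plain string.
sample = ['https://stackoverflow.com/', ['https://httpbin.org/', 1]]
bad = find_non_string_urls(sample)
print(bad)  # [(1, 'list', ['https://httpbin.org/', 1])]
```

Calling such a helper on `urls` right after reading the request JSON, and logging the result, should show which entries trigger the TypeError.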


The code I used to test it:

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main(urls):
    tasks = []
    results = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        results = await asyncio.gather(*tasks)
    return results

def parallel(urls):
    loop = asyncio.get_event_loop()
    results = loop.run_until_complete(main(urls))
    return results

# --- main ---

urls = [
    #('https://stackoverflow.com/', 1), # TypeError: Constructor parameter should be str
    'https://stackoverflow.com/',
    'https://httpbin.org/',
    'http://toscrape.com/',
]

result = parallel(urls)

for item in result:
    print(item[:300])
    print('-----')

I don't know what you get in request_json['urls'], but you should get only the URLs:

urls = request_json['urls']
urls = [ ??? for x in urls]  # replace `???` with code that extracts only the URL from `x`
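
What goes in place of `???` depends on the shape of the incoming data, which the question does not show. As a rough sketch, the helper below handles three assumed shapes (plain strings, dicts with a `url` key, and lists or tuples whose first element is the URL). The name `extract_urls`, the `key` parameter, and the sample payload are all hypothetical.

```python
def extract_urls(items, key='url'):
    """Return a list of URL strings from a mixed list.

    Assumed (hypothetical) entry shapes:
      - 'https://...'            plain string, kept as is
      - {'url': 'https://...'}   dict, URL stored under `key`
      - ['https://...', 1]       list/tuple, URL is the first element
    """
    urls = []
    for x in items:
        if isinstance(x, str):
            urls.append(x)
        elif isinstance(x, dict) and isinstance(x.get(key), str):
            urls.append(x[key])
        elif isinstance(x, (list, tuple)) and x and isinstance(x[0], str):
            urls.append(x[0])
        else:
            # Fail early with a message that names the offending entry.
            raise TypeError(f'cannot extract a URL from entry {x!r}')
    return urls


# Hypothetical payload mixing the three assumed shapes.
payload = [
    'https://stackoverflow.com/',
    {'url': 'https://httpbin.org/'},
    ('http://toscrape.com/', 1),
]
print(extract_urls(payload))
# ['https://stackoverflow.com/', 'https://httpbin.org/', 'http://toscrape.com/']
```

Raising early with the offending entry in the message is a design choice here: it gives a clearer error than the one that surfaces later inside `session.get`.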

Regarding python - Running URL requests in parallel with Flask, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59439708/
