python - Flask 中的代码 503 与嵌入式 Bokeh 服务器应用程序通过 requests.get() 获取 json 化数据

标签 python flask python-requests bokeh

我正在参数化我的 Bokeh 应用程序,方法是让我的 Flask 应用程序通过专用于对通过查询字符串参数传递的请求数据进行 json 化的路由公开模型数据。我知道数据发送路由有效,因为当我将其用作 AjaxDataSource 的 url 时,我得到了绘制的预期数据。然而,当我尝试使用 requests.get api 进行等效操作时,我得到了 503 响应代码,这让我觉得我违反了一些基本的东西,但凭借我有限的 webdev 经验,我无法完全理解。我做错了什么或违反了什么?

实际上,我需要比 AjaxDataSource 提供的列限制更多的数据检索灵 active 。我希望依靠 requests 模块来传递任意类实例以及通过序列化和反序列化 Json 来传递的内容。

这是我展示来自 flask_embed.html 的失败的最小示例。 ...

import requests
from flask import Flask, jsonify, render_template
import pandas
from tornado.ioloop import IOLoop

from bokeh.application          import Application
from bokeh.application.handlers import FunctionHandler
from bokeh.embed                import server_document
from bokeh.layouts              import column
from bokeh.models               import AjaxDataSource,ColumnDataSource
from bokeh.plotting             import figure
from bokeh.server.server        import Server

flask_app = Flask(__name__)

# Populate some model maintained by the flask application
modelDf = pandas.DataFrame()
nData = 100
modelDf[ 'c1_x' ] = range(nData)
modelDf[ 'c1_y' ] = [ x*x for x in range(nData) ]
modelDf[ 'c2_x' ] = range(nData)
modelDf[ 'c2_y' ] = [ 2*x for x in range(nData) ]

def modify_doc1(doc):
    # get colum name from query string
    args      = doc.session_context.request.arguments
    paramName = str( args['colName'][0].decode('utf-8') )

    # get model data from Flask
    url    = "http://localhost:8080/sendModelData/%s" % paramName 
    source = AjaxDataSource( data             = dict( x=[] , y=[] ) ,
                            data_url         = url       ,
                            polling_interval = 5000      ,
                            mode             = 'replace' ,
                            method           = 'GET'     )
    # plot the model data
    plot = figure( )
    plot.circle( 'x' , 'y' , source=source , size=2 )
    doc.add_root(column(plot))

def modify_doc2(doc):
    # get column name from query string
    args    = doc.session_context.request.arguments
    colName = str( args['colName'][0].decode('utf-8') )

    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName
    #pdb.set_trace()
    res = requests.get( url , timeout=None , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()

    # plot the model data
    plot = figure()
    plot.circle( 'x' , 'y' , source=data , size=2 )
    doc.add_root(column(plot))


bokeh_app1 = Application(FunctionHandler(modify_doc1))
bokeh_app2 = Application(FunctionHandler(modify_doc2))

io_loop = IOLoop.current()

server = Server({'/bkapp1': bokeh_app1 , '/bkapp2' : bokeh_app2 }, io_loop=io_loop, allow_websocket_origin=["localhost:8080"])
server.start()

@flask_app.route('/', methods=['GET'] )
def index():
    res =  "<table>"
    res += "<tr><td><a href=\"http://localhost:8080/app1/c1\">APP1 C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/app1/c2\">APP1 C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/app2/c1\">APP2 C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/app2/c2\">APP2 C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c1\">DATA C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c2\">DATA C2</a></td></tr>"
    res += "</table>"
    return res

@flask_app.route( '/app1/<colName>' , methods=['GET'] )
def bkapp1_page( colName ) :
    script = server_document( url='http://localhost:5006/bkapp1' , arguments={'colName' : colName } )
    return render_template("embed.html", script=script)

@flask_app.route( '/app2/<colName>' , methods=['GET'] )
def bkapp2_page( colName ) :
    script = server_document( url='http://localhost:5006/bkapp2', arguments={'colName' : colName } )
    return render_template("embed.html", script=script)

@flask_app.route('/sendModelData/<colName>' , methods=['GET'] )
def sendModelData( colName ) :
    x = modelDf[ colName + "_x" ].tolist()
    y = modelDf[ colName + "_y" ].tolist()
    return jsonify( x=x , y=y )

if __name__ == '__main__':
    from tornado.httpserver import HTTPServer
    from tornado.wsgi import WSGIContainer
    from bokeh.util.browser import view

    print('Opening Flask app with embedded Bokeh application on http://localhost:8080/')

    # This uses Tornado to server the WSGI app that flask provides. Presumably the IOLoop
    # could also be started in a thread, and Flask could server its own app directly
    http_server = HTTPServer(WSGIContainer(flask_app))
    http_server.listen(8080)

    io_loop.add_callback(view, "http://localhost:8080/")
    io_loop.start()

这是呈现的页面... Comparison of working vs not working Flask Model json retrieval

这是一些调试输出...

C:\TestApp>python flask_embedJSONRoute.py
Opening Flask app with embedded Bokeh application on http://localhost:8080/
> C:\TestApp\flask_embedjsonroute.py(52)modify_doc2()
-> res = requests.get( url , timeout=None , verify=False )
(Pdb) n
> C:\TestApp\flask_embedjsonroute.py(53)modify_doc2()
-> print( "CODE %s" % res.status_code )
(Pdb) n
CODE 503
> C:\TestApp\flask_embedjsonroute.py(54)modify_doc2()
-> print( "ENCODING %s" % res.encoding )
(Pdb) n
ENCODING utf-8
> C:\TestApp\flask_embedjsonroute.py(55)modify_doc2()
-> print( "TEXT %s" % res.text )
(Pdb) n
TEXT
> C:\TestApp\flask_embedjsonroute.py(56)modify_doc2()
-> data = res.json()
(Pdb)

  File "C:\Anaconda3\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

最佳答案

这似乎不是 Bokeh 本身的问题,而是运行 Flask 应用程序的服务器中的线程和阻塞问题。

除了 Bokeh 之外,它完全可以重现......

import requests
from flask import Flask, jsonify, request
import pandas
import pdb

flask_app = Flask(__name__)

# Populate some model maintained by the flask application
modelDf = pandas.DataFrame()
nData = 100
modelDf[ 'c1_x' ] = range(nData)
modelDf[ 'c1_y' ] = [ x*x for x in range(nData) ]
modelDf[ 'c2_x' ] = range(nData)
modelDf[ 'c2_y' ] = [ 2*x for x in range(nData) ]

@flask_app.route('/', methods=['GET'] )
def index():
    res =  "<table>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c1\">SEND C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c2\">SEND C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlaskNoProxy?colName=c1\">REQUEST OVER FLASK NO PROXY C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlaskNoProxy?colName=c2\">REQUEST OVER FLASK NO PROXY C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlask?colName=c1\">REQUEST OVER FLASK C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlask?colName=c2\">REQUEST OVER FLASK C2</a></td></tr>"
    res += "</table>"   
    return res

@flask_app.route('/RequestsOverFlaskNoProxy')
def requestsOverFlaskNoProxy() :
    print("RequestsOverFlaskNoProxy")
    # get column name from query string
    colName = request.args.get('colName')

    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName

    print("Get data from %s" % url )
    session = requests.Session()
    session.trust_env = False
    res = session.get( url , timeout=5000 , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()
    return data

@flask_app.route('/RequestsOverFlask')
def requestsOverFlask() :
    # get column name from query string
    colName = request.args.get('colName')

    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName
    res = requests.get( url , timeout=None , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()
    return data

@flask_app.route('/sendModelData/<colName>' , methods=['GET'] )
def sendModelData( colName ) :
    x = modelDf[ colName + "_x" ].tolist()
    y = modelDf[ colName + "_y" ].tolist()
    return jsonify( x=x , y=y )

if __name__ == '__main__':
    print('Opening Flask app on http://localhost:8080/')

    # THIS DOES NOT WORK
    #flask_app.run( host='0.0.0.0' , port=8080 , debug=True )

    # THIS WORKS
    flask_app.run( host='0.0.0.0' , port=8080 , debug=True , threaded=True ) 

Different behavior serving the same data

从屏幕截图中可以看出,直接从 sendModelData 提供数据会适本地呈现 JSon,但是当通过 requests.get 方法获取时,会由于以下原因而产生异常: Python 控制台中报告的 503 代码。

如果我做同样的尝试试图消除 proxies 的影响我已经通过环境变量启用了它,但这种方法永远不会完成,并且请求使浏览器无限期地旋转。

想一想,甚至可能完全没有必要使用 requests 作为中间人,我应该能够获取 json 字符串并自己进行反序列化。好吧,这在我的实际代码中是可行的,Bokeh 渲染是在与 Flask 应用程序完全不同的 python 模块中完成的,因此这些功能甚至不可用,除非我打乱应用程序的分层。

编辑 事实证明我违反的根本是 Flask 的开发环境......

You are running your WSGI app with the Flask test server, which by default uses a single thread to handle requests. So when your one request thread tries to call back into the same server, it is still busy trying to handle that one request. https://stackoverflow.com/a/22878916/1330381

那么问题就变成了如何在原始 Bokeh 示例中应用这种 threaded=True 技术?由于flask_embed.py示例依赖于Tornado WSGI服务器,这可能是不可能的,从这个question表明 Tornado 在设计上是单线程的。 鉴于上述发现,一个更尖锐的问题是,AjaxDataSource 如何一起避免 requests 模块面临的这些线程问题?

<小时/>

更新 有关 Bokeh 和 Tornado 耦合的更多背景信息...

53:05 so they're actually are not very many, the question is about the dependencies for Bokeh and the Bokeh server. The new Bokeh server is built on tornado and that's pretty much the main dependency is it uses tornado. Beyond that there's not very many dependencies, runtime dependencies, for Bokeh. pandas is an optional dependency for Bokeh.charts. There's other dependencies, you know numpy is used. But there's only, the list of dependencies I think is six or seven. We've tried to pare it down greatly over the years and so, but the main dependency of the server is tornado. Intro to Data Visualization with Bokeh - Part 1 - Strata Hadoop San Jose 2016

关于python - Flask 中的代码 503 与嵌入式 Bokeh 服务器应用程序通过 requests.get() 获取 json 化数据,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/44401185/

相关文章:

python - 根据字典中的值创建 .csv

python - 如何在 PySimpleGUI GUI 中接收数据?

python - Flask-Migrate:alembic.util.CommandError:没有这样的修订

python-2.7 - 如何提取与模式匹配的 URL

python slice() 函数 vs slice notation - 如何处理 slice() 中的空值?

python - scipy.fft 挂起某些声音文件

使用 Bandit 进行安全分析的 Python 代码

python - 无法使用 docker-compose 连接到 MongoDB

Python 网络抓取 - 实时数据

Python:如何在不保存文件的情况下处理来自 Web 的 excel 数据