linux - Memory leak when serving a TensorFlow Inception model through Flask

Tags: linux flask memory-leaks tensorflow

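Below is the code I use to serve an Inception model through Flask. Unfortunately, Linux kills the program in the background because it uses too much memory.

From the kernel log, I can see that the server.py Python process is killed by the Linux OOM killer: the kernel cannot satisfy memory requests from other programs because too little memory is available, so it chooses to kill the Python process.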

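Please look at the memory consumed by the process (total_vm). It is about 1.6 to 1.7 million pages, which at 4 kB per page equals the total-vm values of roughly 6.3 to 6.9 million kB in the kill messages below. That seems very high to me.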

[ pid ]   uid  tgid  total_vm      rss  nr_ptes  swapents  oom_score_adj  name
[ 8640]     0  8640   1654607  1436423     3080     35564              0  python
[32139]     0 32139   1712754  1495071     3195     34153              0  python
[25121]     0 25121   1586597  1390072     2943      9795              0  python

Jun  8 19:15:32 incfs1002 kernel: [16448663.210440] Out of memory: Kill process 8640 (python) score 565 or sacrifice child
Jun  8 19:15:32 incfs1002 kernel: [16448663.211941] Killed process 8640 (python) total-vm:6618428kB, anon-rss:5745664kB, file-rss:28kB

Jun  8 18:21:16 incfs1002 kernel: [16445405.714834] Out of memory: Kill process 32139 (python) score 587 or sacrifice child
Jun  8 18:21:16 incfs1002 kernel: [16445405.714878] Killed process 32139 (python) total-vm:6851016kB, anon-rss:5980284kB, file-rss:0kB

Jun  7 17:40:55 incfs1002 kernel: [16356536.627117] Out of memory: Kill process 25121 (python) score 537 or sacrifice child
Jun  7 17:40:55 incfs1002 kernel: [16356536.627157] Killed process 25121 (python) total-vm:6346388kB, anon-rss:5560164kB, file-rss:124kB
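
As a rough cross-check of the figures above, the sketch below parses one of the "Killed process" lines and converts the page counts from the table into kB. It assumes a 4 kB page size, which is typical on x86-64 but is not stated in the post. The names `parse_kill_line` and `pages_to_kb` are illustrative, not part of any existing tool.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages (typical on x86-64, not stated in the post)

# Matches the "Killed process ..." line format shown in the kernel log above.
KILL_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_kill_line(line):
    """Return the fields of an OOM kill line as a dict, or None if no match."""
    match = KILL_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Every field except the process name is a number.
    return {k: (v if k == "name" else int(v)) for k, v in fields.items()}


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count from the OOM table (total_vm, rss) to kB."""
    return pages * page_kb


line = (
    "Jun  8 19:15:32 incfs1002 kernel: [16448663.211941] Killed process 8640 "
    "(python) total-vm:6618428kB, anon-rss:5745664kB, file-rss:28kB"
)
info = parse_kill_line(line)
print(info["pid"], info["total_vm"], pages_to_kb(1654607))
```

Under that assumption, the total_vm and rss columns for pid 8640 line up with the total-vm and anon-rss plus file-rss values in its kill message, so the process held several GB rather than 1.5 GB.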

Code:

import os
from flask import Flask, request, jsonify
from flask_cors import CORS, cross_origin
import tensorflow as tf

ALLOWED_EXTENSIONS = set(['jpg', 'jpeg'])

app = Flask(__name__)
CORS(app)
app.config['UPLOAD_FOLDER'] = 'uploads'


def allowed_file(filename):
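    # Compare the text after the last dot, so that 'jpeg' can match too
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS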


@app.route('/classify', methods=['GET'])
@cross_origin()
def classify_image():
    result = {}
    filename = request.args.get('file')
    # Check if filename matches
    if filename:
        image_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        image_data = tf.gfile.FastGFile(image_path, 'rb').read()

        label_lines = [line.strip() for line in tf.gfile.GFile("output_labels.txt")]

        with tf.gfile.FastGFile("output_graph.pb", 'rb') as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
            _ = tf.import_graph_def(graph_def, name='')

        with tf.Session() as sess:
            # Feed the image data as input to the graph and get first prediction
            softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
            predictions = sess.run(softmax_tensor,
                                   {'DecodeJpeg/contents:0': image_data})
            # Sort to show labels of first prediction in order of confidence
            top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

            low_confidence = 0
            for node_id in top_k:
                human_string = label_lines[node_id]
                score = predictions[0][node_id]
                # print('%s (score = %.2f)' % (human_string, score))
                if score < 0.90:
                    low_confidence += 1
                result[human_string] = str(score)

            if low_confidence >= 2:
                result['error'] = 'Unable to classify document type (Passport/Driving License)'

    return jsonify(result)


if __name__ == '__main__':
    app.run(debug=True)

Best answer

I had the same problem, but my code looked like this:

import tensorflow as tf


class Classifier:
  def __init__(self):
    # One session is created up front and reused for every prediction
    self.sess = tf.Session()

  def predict(self, image):
    return self.sess.run(image)

I can't say whether Flask with Caffe has the same problem.
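
For comparison, the handler in the question calls tf.import_graph_def on every request, which adds another copy of the model to the default graph each time and is a likely contributor to the growth. A minimal sketch of loading the graph once and reusing the session, in the spirit of the class above, might look like this. `LazyModel` and `load_inception` are hypothetical names. The file name and tensor name are taken from the question, and the TensorFlow calls are the 1.x API used there.

```python
import threading


class LazyModel:
    """Build an expensive object on first use, then hand out the same one."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._model = None

    def get(self):
        # The lock keeps two concurrent first requests from both loading.
        with self._lock:
            if self._model is None:
                self._model = self._loader()
            return self._model


def load_inception():
    """Load output_graph.pb into its own graph and open a single session."""
    import tensorflow as tf  # imported lazily; TF 1.x API as in the question

    graph = tf.Graph()
    with graph.as_default():
        graph_def = tf.GraphDef()
        with tf.gfile.FastGFile("output_graph.pb", "rb") as f:
            graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name="")
    sess = tf.Session(graph=graph)
    return sess, graph.get_tensor_by_name("final_result:0")


# Module-level holder; nothing is loaded until the first get() call.
MODEL = LazyModel(load_inception)
```

A request handler would then call `sess, softmax_tensor = MODEL.get()` and run `sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})` without importing the graph again. Whether this removes all of the growth is not something the original post confirms.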

I worked around the problem by using gunicorn:

gunicorn \
  --reuse-port \
  -b ${host}:${port} \
  --reload  \
  --log-level debug \
  --workers 2 \
  --max-requests 10000 \
  --max-requests-jitter 200 \
  server:app  # assumed module:app name; not given in the original answer

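Note that max-requests makes gunicorn restart each worker after it has handled about 10000 requests, which releases whatever memory that worker has accumulated. max-requests-jitter adds a random offset of up to 200 requests so that the workers do not all restart at the same moment.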
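
The same settings can also be kept in a gunicorn configuration file, which is itself a Python module. The sketch below mirrors the flags from the command above. The `HOST` and `PORT` environment variable names and their defaults are assumptions made for illustration, standing in for ${host} and ${port}.

```python
# gunicorn.conf.py: a sketch mirroring the command-line flags above
import os

# Assumption: host and port come from HOST/PORT environment variables.
bind = "{}:{}".format(
    os.environ.get("HOST", "127.0.0.1"),
    os.environ.get("PORT", "8000"),
)

workers = 2                # --workers 2
max_requests = 10000       # --max-requests: restart a worker after this many requests
max_requests_jitter = 200  # --max-requests-jitter: random extra, staggers restarts
loglevel = "debug"         # --log-level debug
reuse_port = True          # --reuse-port
reload = True              # --reload: restart on code change, meant for development
```

It would be started with something like `gunicorn -c gunicorn.conf.py server:app`, where `server:app` is again an assumed module and application name.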

Regarding linux - memory leak when serving a TensorFlow Inception model through Flask, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44444874/
