python - Selective re-memoization of DataFrames

Say I set up memoization with Joblib as follows (using the solution provided here):

from tempfile import mkdtemp
cachedir = mkdtemp()

from joblib import Memory
# Note: newer joblib releases name this argument `location` instead of `cachedir`.
memory = Memory(cachedir=cachedir, verbose=0)

@memory.cache
def run_my_query(my_query):
    ...
    return df

Say I define a couple of queries, query_1 and query_2, both of which take a long time to run.

I understand that, with the code as it is:

The second call with either query will use the memoized output, i.e.:

run_my_query(query_1)
run_my_query(query_1) # <- Uses cached output

run_my_query(query_2)
run_my_query(query_2) # <- Uses cached output

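I can use memory.clear() to delete the entire cache directory.

But what if I want to redo the memoization for only one of the queries (e.g. query_2) without forcing a delete of the other query's cached result?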

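Best answer

The library does not seem to support partial clearing of the cache.

You can split the cache and the functions into two pairs: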

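from tempfile import mkdtemp
from joblib import Memory

# One Memory (and one cache directory) per query, so each can be cleared on its own.
# Note: newer joblib releases name this argument `location` instead of `cachedir`.
memory1 = Memory(cachedir=mkdtemp(), verbose=0)
memory2 = Memory(cachedir=mkdtemp(), verbose=0)

@memory1.cache
def run_my_query1():
    # run query_1
    return df

@memory2.cache
def run_my_query2():
    # run query_2
    return df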

Now you can clear the caches selectively:

memory2.clear()
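
To check that clearing one Memory really leaves the other untouched, here is a minimal sketch. It assumes a recent joblib release where the constructor argument is named location (the snippet above uses the older cachedir spelling). The executions counter and the list return values are hypothetical stand-ins for the real queries and DataFrames.

```python
from tempfile import mkdtemp
from joblib import Memory

# Assumption: recent joblib, where the cache directory is passed as `location`.
memory1 = Memory(location=mkdtemp(), verbose=0)
memory2 = Memory(location=mkdtemp(), verbose=0)

# Hypothetical counter: records how often each function body really executes.
executions = {"query_1": 0, "query_2": 0}

@memory1.cache
def run_my_query1():
    executions["query_1"] += 1
    return [1, 2, 3]  # placeholder for the DataFrame returned by query_1

@memory2.cache
def run_my_query2():
    executions["query_2"] += 1
    return [4, 5, 6]  # placeholder for the DataFrame returned by query_2

# First call executes each query, second call is served from the cache.
run_my_query1()
run_my_query1()
run_my_query2()
run_my_query2()

# Drop only the cache that backs query_2.
memory2.clear(warn=False)

run_my_query1()  # still cached, body is not executed again
run_my_query2()  # cache was cleared, body is executed again

print(executions)
```

If the caches behave as described, query_1 should have executed once and query_2 twice. The cost of this design is one Memory object and one cache directory per query, which is manageable for a handful of queries but awkward for many.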

UPDATE after seeing behzad.nouri's comment:

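You can use the call method of the decorated function. It forces the original function to run again for the given arguments. But as you can see in the following example, the return value differs from that of a normal call, so you have to handle that yourself.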

>>> import tempfile
>>> import joblib
>>> memory = joblib.Memory(cachedir=tempfile.mkdtemp(), verbose=0)
>>> @memory.cache
... def run(x):
...     print('called with {}'.format(x))  # for debug
...     return x
...
>>> run(1)
called with 1
1
>>> run(2)
called with 2
2
>>> run(3)
called with 3
3
>>> run(2)  # Cached
2
>>> run.call(2)  # Force call of the original function
called with 2
(2, {'duration': 0.0011069774627685547, 'input_args': {'x': '2'}})
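
One way to handle the different return value is to hide it behind a small helper. The sketch below is not part of joblib: refresh is a hypothetical name, and the calls list exists only to show when the function body runs. It assumes a recent joblib (location instead of cachedir) and relies on call writing the fresh output back to the cache, so the helper ignores whatever call returns and reads the value back with a normal call.

```python
import tempfile
from joblib import Memory

# Assumption: recent joblib, where the cache directory is passed as `location`.
memory = Memory(location=tempfile.mkdtemp(), verbose=0)

# Hypothetical log of real executions, for demonstration only.
calls = []

@memory.cache
def run(x):
    calls.append(x)
    return x * 10

def refresh(cached_func, *args, **kwargs):
    """Re-run cached_func for these arguments only and return the new value.

    The return value of call() is ignored on purpose, so the helper does
    not depend on its exact shape.
    """
    cached_func.call(*args, **kwargs)    # force execution, overwrite cache entry
    return cached_func(*args, **kwargs)  # read the refreshed entry back

run(1)
run(2)
run(2)                   # served from the cache, `calls` does not grow
value = refresh(run, 2)  # re-executes run(2) only; run(1) stays cached

print(value, calls)
```

Reading the value back costs one extra load from disk, which may matter for large DataFrames. If that is a concern, unpacking the pair shown in the session above (output first, metadata second) avoids the second read, at the price of depending on that return shape.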

Regarding python - Selective re-memoization of DataFrames, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25998334/