python - SQLAlchemy: Scanning huge tables using the ORM?

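I am currently playing around a bit with SQLAlchemy, which is really quite neat.

For testing, I created a huge table containing my picture archive, indexed by SHA1 hashes (to remove duplicates :-)). That was impressively fast...

For fun, I did the equivalent of a select * over the resulting SQLite database: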

session = Session()
for p in session.query(Picture):
    print(p)

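I expected to see hashes scrolling by, but instead it just kept scanning the disk. At the same time, memory usage skyrocketed, reaching 1GB after a few seconds. This seems to come from SQLAlchemy's identity map feature, which I thought only kept weak references.

Can somebody explain this to me? I thought each Picture p would be garbage-collected after its hash had been written out!?

Best answer

Okay, I just found a way to do this myself. Changing the code to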

session = Session()
for p in session.query(Picture).yield_per(5):
    print(p)

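loads only 5 pictures at a time. It seems that by default the query loads all rows at once. However, I don't yet understand the disclaimer on that method. Quoting from the SQLAlchemy docs: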

WARNING: use this method with caution; if the same instance is present in more than one batch of rows, end-user changes to attributes will be overwritten. In particular, it’s usually impossible to use this setting with eagerly loaded collections (i.e. any lazy=False) since those collections will be cleared for a new load when encountered in a subsequent result batch.
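To picture what "only 5 rows at a time" means underneath the ORM, here is a rough sketch using only the standard-library sqlite3 module. It is an analogy for the batching idea, not SQLAlchemy's actual implementation; the helper iter_in_batches and the pictures table are made up for illustration.

```python
import sqlite3


def iter_in_batches(cursor, batch_size=5):
    """Yield rows from a DB-API cursor, fetching batch_size rows at a time.

    Hypothetical helper: only one batch of rows is held in memory at once,
    instead of the whole result set that fetchall() would return.
    """
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield from rows


# Small in-memory table standing in for the picture archive (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pictures (sha1 TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO pictures (sha1) VALUES (?)",
    [(f"{i:040x}",) for i in range(12)],
)

cursor = conn.execute("SELECT sha1 FROM pictures ORDER BY sha1")
hashes = [sha1 for (sha1,) in iter_in_batches(cursor, batch_size=5)]
conn.close()
```

The list comprehension is only there to make the result easy to check; in a real scan you would process each row inside the loop so that nothing accumulates.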

If using yield_per really is the right way (tm) to scan large amounts of SQL data through the ORM, when is it safe to use it?
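Going by the wording of the warning, one case that should be safe is a read-only scan that selects plain columns and involves no eagerly loaded collections. Such a query returns rows rather than Picture instances, so there are no attribute changes to overwrite. A minimal sketch, assuming SQLAlchemy 1.4 or newer; the Picture model below is a guessed stand-in, not the one from the question.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Picture(Base):
    # Hypothetical stand-in for the Picture model in the question.
    __tablename__ = "pictures"
    id = Column(Integer, primary_key=True)
    sha1 = Column(String(40), unique=True, nullable=False)


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
session.add_all([Picture(sha1=f"{i:040x}") for i in range(1000)])
session.commit()

# Read-only scan: only the sha1 column is selected, in batches of 100 rows.
seen = 0
for (sha1,) in session.query(Picture.sha1).yield_per(100):
    seen += 1
session.close()
```

If you do need full Picture objects, the same warning suggests keeping the loop read-only and avoiding relationships configured with lazy=False.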

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/1145905/
