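sql - Multiple string matching performance

I have an artists table with over 100,000 records, which I use to match against an array of artists submitted by a user (anywhere from 1 to several thousand). My current query looks like this: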

SELECT id from artists WHERE lower(name) IN(downcase_artists)

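This gets the job done, but I wonder whether it could be faster. When matching several thousand artists, query time ranges from a few hundred milliseconds to, at times, a full 10 seconds. The name column is indexed. (Does that even make a difference on a string column?)

I was thinking that something like Redis might speed this up, for example a key-value store holding artist names and their corresponding ids?

Are there any other options I am missing that could speed this up?

Edit: As James suggested, I tried implementing an all_artists cache method (using the memcache add-on on Heroku) and used it to match my strings: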

artist_ids = self.all_cached.select{|a| downcase_array.include?(a.name)}.collect(&:id)

I get a very small gain in database query time, but the total request time increases drastically:

Before: Completed 200 OK in 1853ms (Views: 164.2ms | ActiveRecord: 1476.3ms)  
After: Completed 200 OK in 12262ms (Views: 169.2ms | ActiveRecord: 1200.6ms)

I get similar results when I run it locally:

Before: Completed 200 OK in 405ms (Views: 75.6ms | ActiveRecord: 135.4ms)
After: Completed 200 OK in 3205ms (Views: 245.1ms | ActiveRecord: 126.5ms)

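Leaving the ActiveRecord time aside, it looks like moving the string matching out of the query makes my problem worse (and that is with only 100 strings). Or am I missing something?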
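
One possible reason the cached version is slower: `downcase_array.include?(a.name)` scans the submitted array once for every cached artist, so the work grows with (number of artists) x (number of submitted names), and that comes on top of loading every cached record on each request. Below is a minimal sketch of the same match using a Set, where each membership test is a hash lookup. The `Artist` struct and the sample data are hypothetical stand-ins, not the real model.

```ruby
require 'set'

# Hypothetical stand-in for the cached ActiveRecord objects.
Artist = Struct.new(:id, :name)

all_cached = [
  Artist.new(1, "joe schmo"),
  Artist.new(2, "sally sue"),
  Artist.new(3, "someone else")
]

downcase_array = ["sally sue", "joe schmo", "unknown artist"]

# Build the set once; Set#include? is a hash lookup, not a linear scan.
wanted = downcase_array.to_set

artist_ids = all_cached.select { |a| wanted.include?(a.name) }.collect(&:id)
```

This only removes the repeated array scans; whether it helps in practice depends on how much of the request time is spent fetching and deserializing the cached records.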

I also looked at full-text search engines such as Sphinx, but they sound like overkill, since I am only searching a single column...

Edit 2: I finally managed to bring the request time down:

Before: Completed 200 OK in 1853ms (Views: 164.2ms | ActiveRecord: 1476.3ms)  
Now: Completed 200 OK in 226ms (Views: 127.2ms | ActiveRecord: 48.7ms)

by using a Redis store of JSON strings (see full answer)
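
The full answer is not reproduced here, so the following is only a guess at the general shape of such an approach: keep a single name-to-id map, serialized as JSON, in the key-value store, and look up each submitted name directly. The sketch uses only Ruby's standard `json` library. The Redis reads and writes appear as comments because the key name `artists:by_name` and the exact client calls are assumptions.

```ruby
require 'json'

# Hypothetical data; in the app this would be built from the artists table.
name_to_id = { "joe schmo" => 1, "sally sue" => 2 }

# Serialize once, whenever the artists change.
# Assumption: stored with something like redis.set("artists:by_name", payload)
payload = JSON.generate(name_to_id)

# Per request: fetch the string and parse it back into a Hash.
# Assumption: fetched with something like payload = redis.get("artists:by_name")
lookup = JSON.parse(payload)

submitted = ["Sally Sue", "Unknown Artist", "joe schmo"]

# One hash lookup per submitted name; unknown names give nil and are dropped.
matched_ids = submitted.map { |n| lookup[n.downcase] }.compact
```

With this layout the cost per request depends on the number of submitted names rather than on a scan of all 100,000 artists, apart from parsing the JSON payload.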

Best answer

If I am not mistaken, using IN can be quite expensive. How about this:

caches_action :gather_artist_ids

def gather_artist_ids
  # Load only the columns needed for matching
  @all_artists = Artist.all(:select => "id,name")
end

Then, wherever you want to run the lookup:

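# Downcased names submitted by the user
@downcase_artists = ["joe schmo", "sally sue"] # ...
# Compare on the downcased artist name, not on the Artist object itself
@requested_artists = @all_artists.select{|a| @downcase_artists.include?(a.name.downcase)}.collect(&:id)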

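You can apply caches_action to :gather_artist_ids and have your sweeper fire only on after_create, after_update, and after_destroy.

MongoDB: I use MongoDB via Mongoid with 1.51 million records in it, and a regex search of /usersinput/i takes less than 100 ms, with indexes where required. It is very fast.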

This question on sql - multiple string matching performance comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/7522564/
