ruby-on-rails - How to move sorting to the database level

Tags: ruby-on-rails performance postgresql

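I have a Rails application that uses PostgreSQL as its database. It sorts different types of users by location, and then by the reputation points they have earned from various events on the site. Here is an example query: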

 @lawyersbylocation = User.lawyers_by_province(province).sort_by{ |u| -u.total_votes }

The query calls the scope lawyers_by_province on the User.rb model:

 scope :lawyers_by_province, lambda {|province|
  joins(:contact).
  where( contacts: {province_id: province},
         users: {lawyer: true})

  }

Then, still on the User.rb model, it calculates the reputation points each user has.

 def total_votes
    answerkarma = AnswerVote.joins(:answer).where(answers: {user_id: self.id}).sum('value') 
    contributionkarma = Contribution.where(user_id: self.id).sum('value')
    bestanswer = BestAnswer.joins(:answer).where(answers: {user_id: self.id}).sum('value') 
    answerkarma + contributionkarma + bestanswer
 end

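I've been told that once the site reaches a certain number of users this will become very slow, because the sorting happens in Ruby rather than at the database level. I understand the comment refers to the total_votes method, but I'm not sure whether lawyers_by_province happens at the database level or in Ruby, since it uses Rails helpers to query the database. It looks to me like a mix of the two, but I'm not sure what that means for efficiency.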

Can you show me how to write this so that the query happens at the database level, and therefore more efficiently, without breaking my site?
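To make the cost concrete: the lawyers_by_province scope is turned into a single SQL query, while sort_by runs in Ruby and calls total_votes once for every loaded user, and each call issues three SUM queries. The following is a minimal pure-Ruby sketch of that query count. FakeDb and its sample data are hypothetical stand-ins, not part of the application.

```ruby
# Hypothetical in-memory stand-in for the database; it only counts "queries".
class FakeDb
  TABLES = %i[answer_votes contributions best_answers].freeze

  attr_reader :query_count

  def initialize(values_by_table)
    @values_by_table = values_by_table # { table => { user_id => [values] } }
    @query_count = 0
  end

  # Stands in for one "SELECT SUM(value) ... WHERE user_id = ?" round trip.
  def sum(table, user_id)
    @query_count += 1
    @values_by_table.fetch(table, {}).fetch(user_id, []).sum
  end
end

# Mirrors the shape of User#total_votes: three aggregate queries per user.
def total_votes(db, user_id)
  FakeDb::TABLES.sum { |table| db.sum(table, user_id) }
end

db = FakeDb.new({
  answer_votes:  { 1 => [1, 1], 2 => [5] },
  contributions: { 1 => [2], 3 => [1] },
  best_answers:  { 2 => [10] }
})

user_ids = [1, 2, 3] # pretend these came back from the scope's single query
ranked = user_ids.sort_by { |id| -total_votes(db, id) }

puts ranked.inspect  # ids ordered by descending score
puts db.query_count  # three queries per user, on top of the scope's one
```

With N lawyers in a province this pattern costs roughly 1 + 3N round trips, which is why the sort is worth moving into SQL.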

Update: here are the schemas for the three models used in the total_votes method.

 create_table "answer_votes", force: true do |t|
    t.integer  "answer_id"
    t.integer  "user_id"
    t.integer  "value"
    t.boolean  "lawyervote"
    t.boolean  "studentvote"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  add_index "answer_votes", ["answer_id"], name: "index_answer_votes_on_answer_id", using: :btree
  add_index "answer_votes", ["lawyervote"], name: "index_answer_votes_on_lawyervote", using: :btree
  add_index "answer_votes", ["studentvote"], name: "index_answer_votes_on_studentvote", using: :btree
  add_index "answer_votes", ["user_id"], name: "index_answer_votes_on_user_id", using: :btree



create_table "best_answers", force: true do |t|
    t.integer  "answer_id"
    t.integer  "user_id"
    t.integer  "value"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.integer  "question_id"
  end

  add_index "best_answers", ["answer_id"], name: "index_best_answers_on_answer_id", using: :btree
  add_index "best_answers", ["user_id"], name: "index_best_answers_on_user_id", using: :btree



create_table "contributions", force: true do |t|
    t.integer  "user_id"
    t.integer  "answer_id"
    t.integer  "value"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  add_index "contributions", ["answer_id"], name: "index_contributions_on_answer_id", using: :btree
  add_index "contributions", ["user_id"], name: "index_contributions_on_user_id", using: :btree

In addition, here is the contacts schema, which contains the province_id used in the lawyers_by_province scope of the user.rb model.

  create_table "contacts", force: true do |t|
    t.string   "firm"
    t.string   "address"
    t.integer  "province_id"
    t.string   "city"
    t.string   "postalcode"
    t.string   "mobile"
    t.string   "office"
    t.integer  "user_id"
    t.string   "website"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

Update: trying to apply @Shawn's answer, I put this method in the user.rb model

 def self.total_vote_sql
    "( " +
    [
     AnswerVote.joins(:answer).select("user_id, value"),
     Contribution.select("user_id, value"),
     BestAnswer.joins(:answer).select("user_id, value")
    ].map(&:to_sql) * " UNION ALL " + 
    ") as total_votes "
  end

Then, in the controller, I did this (prefixing total_vote_sql with User)

@lawyersbyprovince = User.select("users.*, sum(total_votes.value) as total_votes").joins("left outer join #{User.total_vote_sql} on users.id = total_votes.user_id").
                            order("total_votes desc").lawyers_by_province(province)

It gives me this error

ActiveRecord::StatementInvalid in LawyerProfilesController#index

PG::Error: ERROR: column reference "user_id" is ambiguous LINE 1: ..."user_id" = "users"."id" left outer join ( SELECT user_id, v... ^ : SELECT users.*, sum(total_votes.value) as total_votes FROM "users" INNER JOIN "contacts" ON "contacts"."user_id" = "users"."id" left outer join ( SELECT user_id, value FROM "answer_votes" INNER JOIN "answers" ON "answers"."id" = "answer_votes"."answer_id" UNION ALL SELECT user_id, value FROM "contributions" UNION ALL SELECT user_id, value FROM "best_answers" INNER JOIN "answers" ON "answers"."id" = "best_answers"."answer_id") as total_votes on users.id = total_votes.user_id WHERE "contacts"."province_id" = 6 AND "users"."lawyer" = 't' ORDER BY total_votes desc

Update: after applying the edits from Shawn's post, the error message is now this:

PG::Error: ERROR: column reference "user_id" is ambiguous LINE 1: ..."user_id" = "users"."id" left outer join ( SELECT user_id as... ^ : SELECT users.*, sum(total_votes.value) as total_votes FROM "users" INNER JOIN "contacts" ON "contacts"."user_id" = "users"."id" left outer join ( SELECT user_id as tv_user_id, value FROM "answer_votes" INNER JOIN "answers" ON "answers"."id" = "answer_votes"."answer_id" UNION ALL SELECT user_id as tv_user_id, value FROM "contributions" UNION ALL SELECT user_id as tv_user_id, value FROM "best_answers" INNER JOIN "answers" ON "answers"."id" = "best_answers"."answer_id") as total_votes on users.id = total_votes.tv_user_id WHERE "contacts"."province_id" = 6 AND "users"."lawyer" = 't' ORDER BY total_votes desc
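The error arises because answer_votes and answers both have a user_id column (as do best_answers and answers), so an unqualified user_id in the joined SELECTs is ambiguous. Renaming the output column with "as tv_user_id" does not help, because the source column is still unqualified. One possible fix, sketched here with plain SQL strings instead of to_sql, is to qualify every column. Choosing answers.user_id (the author of the answer) is an assumption based on how total_votes is written; the vote_total alias and the group/COALESCE in the usage comment are also assumptions.

```ruby
# Builds the derived table used in the LEFT OUTER JOIN.
# Every column names its owning table, so PostgreSQL has nothing to guess.
def total_vote_sql
  branches = [
    "SELECT answers.user_id AS user_id, answer_votes.value AS value " \
    "FROM answer_votes INNER JOIN answers ON answers.id = answer_votes.answer_id",
    "SELECT contributions.user_id, contributions.value FROM contributions",
    "SELECT answers.user_id, best_answers.value " \
    "FROM best_answers INNER JOIN answers ON answers.id = best_answers.answer_id"
  ]
  "(#{branches.join(' UNION ALL ')}) AS total_votes"
end

puts total_vote_sql

# Possible controller usage (a sketch, untested against the real schema).
# In the model this method would be defined as `def self.total_vote_sql`.
#
#   User.lawyers_by_province(province)
#       .select("users.*, COALESCE(SUM(total_votes.value), 0) AS vote_total")
#       .joins("LEFT OUTER JOIN #{User.total_vote_sql} ON users.id = total_votes.user_id")
#       .group("users.id")          # SUM needs a GROUP BY; assumes users.id is the primary key
#       .order("vote_total DESC")
```

The COALESCE matters because users without any votes get a NULL sum from the outer join, and PostgreSQL places NULLs first in a descending sort by default.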

Best answer

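Caveat: I'm new to Rails, but this is the technique I use to stay sane while constantly having to go straight to the database for performance reasons. I need to do that all the time, because you can only have two of the following:

  1. Bulk data processing
  2. Pure Rails techniques
  3. Good performance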

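Anyway, once you need one of these hybrid approaches (part Ruby, part SQL), I feel you might as well go all the way and choose a pure SQL solution.

  1. It is easier to debug, because you can isolate the two layers of code more effectively.
  2. It is easier to optimize the SQL, because if SQL isn't your strong suit you are more likely to be able to have a dedicated SQL person look at it for you.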

I think the SQL you are looking for here is something like this:

with cte_scoring as (
  select
    users.id,
    (select Coalesce(sum(value),0) from answer_votes  where answer_votes.user_id  = users.id) +
    (select Coalesce(sum(value),0) from best_answers  where best_answers.user_id  = users.id) +
    (select Coalesce(sum(value),0) from contributions where contributions.user_id = users.id) total_score
  from
    users join
    contacts on (contacts.user_id = users.id)
  where
    users.lawyer         = 'true'          and
    contacts.province_id = #{province.id})
select   id,
         total_score
from     cte_scoring
order by total_score desc
limit    #{limit_number}

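This should give you the best performance. Subqueries in the SELECT are not ideal, but the technique does restrict the scoring to the user_ids you are actually checking.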
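Since province and limit values are interpolated straight into the SQL text, it is worth coercing them to integers first. The helper below is a hedged sketch of building such a string; the name scoring_sql and its parameters are hypothetical. Note one deliberate difference from the SQL above: the question's total_votes credits votes to the answer's author through answers.user_id, whereas the SQL above filters on answer_votes.user_id and best_answers.user_id. The sketch follows the question's definition, which is an assumption about the intended scoring.

```ruby
# Hypothetical helper that returns the scoring query as a string.
# Integer() raises on anything that is not a whole number, so no
# user-supplied text can reach the SQL.
def scoring_sql(province_id, limit_number)
  province_id  = Integer(province_id)
  limit_number = Integer(limit_number)

  <<~SQL
    WITH cte_scoring AS (
      SELECT
        users.id,
        (SELECT COALESCE(SUM(answer_votes.value), 0)
           FROM answer_votes
           JOIN answers ON answers.id = answer_votes.answer_id
          WHERE answers.user_id = users.id) +
        (SELECT COALESCE(SUM(best_answers.value), 0)
           FROM best_answers
           JOIN answers ON answers.id = best_answers.answer_id
          WHERE answers.user_id = users.id) +
        (SELECT COALESCE(SUM(contributions.value), 0)
           FROM contributions
          WHERE contributions.user_id = users.id) AS total_score
      FROM users
      JOIN contacts ON contacts.user_id = users.id
      WHERE users.lawyer = TRUE
        AND contacts.province_id = #{province_id}
    )
    SELECT id, total_score
    FROM cte_scoring
    ORDER BY total_score DESC
    LIMIT #{limit_number}
  SQL
end

sql_string = scoring_sql("6", 10)
puts sql_string
```

A string such as "6; DROP TABLE users" makes Integer() raise ArgumentError before any SQL is built.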

Integrating into Rails: if you define sql_string as the SQL code:

scoring = ActiveRecord::Base.connection.execute sql_string

...then you get back a set of rows that behaves like an array of hashes, which you can use like this:

scoring.each do |lawyer_score|
  lawyer = User.find(lawyer_score["id"])
  score  = lawyer_score["total_score"]
  ...
end
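The loop above issues one User.find per row. A possible refinement, sketched below, loads all the users in a single lookup and keeps the score order from the query. Here attach_users and load_users are hypothetical names; in Rails the callable might wrap something like User.where(id: ids). The to_i calls assume the adapter may hand values back as strings.

```ruby
# rows:       hashes with "id" and "total_score" keys, already sorted by score
# load_users: hypothetical callable taking an array of ids and returning
#             objects that respond to #id, in any order
def attach_users(rows, load_users)
  ids = rows.map { |row| row["id"].to_i }
  users_by_id = load_users.call(ids).map { |user| [user.id, user] }.to_h
  # Walk the rows, not the users, so the database's ordering is preserved.
  rows.map { |row| [users_by_id.fetch(row["id"].to_i), row["total_score"].to_i] }
end

# Demo with stand-in data instead of a real User model.
DemoUser = Struct.new(:id, :name)
calls = 0
loader = lambda do |ids|
  calls += 1
  # Deliberately returned in a different order than requested.
  ids.sort.map { |id| DemoUser.new(id, "lawyer-#{id}") }
end

rows = [
  { "id" => "7", "total_score" => "42" },
  { "id" => "3", "total_score" => "17" }
]

ranked = attach_users(rows, loader)
ranked.each { |user, score| puts "#{user.name}: #{score}" }
```

The loader is invoked once regardless of how many rows there are, which is the point of the change.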

Regarding "ruby-on-rails - How to move sorting to the database level", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16613005/
