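Rails 3.1, Ruby 1.9.2, AR/MySQL.
I'm looking for advice on how to keep only 1 result per time period (day) when there are many results of the same type within that period. An example might be tracking stock prices. Initially we save the price every 15 minutes, but each individual price point only needs to be stored for 1 week. After the first week, we only need one price per day (the last one recorded, the closing price).
Here is a simple first attempt; it does work, but it is extremely inefficient: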
# stock has many prices, each price belongs to one stock
# get all prices for a single stock older than 1 week
prices = stock.prices.where("created_at < ?", Time.now - 1.week)
prices.group_by { |price| price.created_at.to_date }.each do |day, day_prices| # group by day
  next unless day_prices.size > 1 # only days with many price points
  # delete all but the last record of the day
  day_prices.sort_by(&:created_at)[0..-2].each(&:delete)
end
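The "all but the last record per day" selection can be checked without a database. Below is a minimal sketch in plain Ruby, assuming a hypothetical `Tick` struct as a stand-in for a price record and a hypothetical helper `surplus_per_day`; neither is part of the original code.

```ruby
require 'date' # provides Time#to_date outside of Rails

# Hypothetical stand-in for a price row (not the real ActiveRecord model).
Tick = Struct.new(:id, :created_at)

# Returns every record except the last one recorded on each calendar day.
def surplus_per_day(records)
  records.group_by { |r| r.created_at.to_date }.flat_map do |_day, day_records|
    day_records.sort_by(&:created_at)[0..-2] # everything but the day's last entry
  end
end

ticks = [
  Tick.new(1, Time.utc(2012, 4, 2, 9, 30)),
  Tick.new(2, Time.utc(2012, 4, 2, 15, 45)),
  Tick.new(3, Time.utc(2012, 4, 2, 16, 0)), # last of April 2, kept
  Tick.new(4, Time.utc(2012, 4, 3, 16, 0))  # only entry of April 3, kept
]

p surplus_per_day(ticks).map(&:id) # => [1, 2]
```

Slicing with `[0..-2]` returns an empty array for a day with a single record, so no separate size check is needed in this version.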
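Thanks in advance for any help or advice. I'll try to post updates as I work through this, in the hope that it helps someone later.
Best answer
You can make this more efficient by doing everything in SQL and by limiting the scope to the time since the last run. Also, if you add a column that flags the older end-of-day entries as "archived", the queries become much simpler. An archived price is one that you will not delete after a week.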
rails generate migration add_archived_to_prices archived:boolean
Before running the migration, modify it to add an index on the created_at column.
class AddArchivedToPrices < ActiveRecord::Migration
  def self.up
    # Default to false so existing rows are not left as NULL
    add_column :prices, :archived, :boolean, :default => false, :null => false
    add_index :prices, :created_at
  end

  def self.down
    remove_index :prices, :created_at
    remove_column :prices, :archived
  end
end
The workflow looks like this:
# Find the last entry for each day for each stock using SQL (more efficient than finding these in Ruby)
# MAX(id) stands in for "latest row of the day"; this assumes ids grow with created_at
# Keep track of the last run time to speed up subsequent runs
keepers =
  Price.where('created_at > ?', last_run).
    group('stock_id, DATE(created_at)').
    select('MAX(id) AS id')
# Mark them as archived
Price.where('id IN (?)', keepers.map(&:id)).update_all(:archived => true)
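# Delete everything but archived prices that are older than a week
# The archived check is NULL-safe: in SQL, archived != true never matches rows where archived is NULL
# Only rows that crossed the one-week mark since the last run need to be scanned
Price.where('archived IS NULL OR archived = ?', false).
  where('created_at < ?', Time.now - 1.week).
  where('created_at > ?', last_run - 1.week).
  delete_all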
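The two steps (flag the keepers, then prune) can be traced on in-memory data. This is a sketch under stated assumptions rather than the answer's actual code: `Row`, `archive_keepers!` and `prune` are hypothetical names, the structs only mimic the prices table, and the `last_run` lower bound on the delete step is left out because it is purely an optimization.

```ruby
require 'date' # provides Time#to_date outside of Rails

# Hypothetical in-memory stand-in for a row of the prices table.
Row = Struct.new(:id, :stock_id, :created_at, :archived)

WEEK = 7 * 24 * 60 * 60 # one week in seconds

# Step 1: flag the last row per (stock, day) among rows newer than last_run.
def archive_keepers!(rows, last_run)
  recent = rows.select { |r| r.created_at > last_run }
  groups = recent.group_by { |r| [r.stock_id, r.created_at.to_date] }
  groups.each_value { |group| group.max_by(&:created_at).archived = true }
end

# Step 2: drop rows that are both unarchived and older than one week.
def prune(rows, now)
  rows.reject { |r| !r.archived && r.created_at < now - WEEK }
end

rows = [
  Row.new(1, 1, Time.utc(2012, 4, 2, 9, 30), false),
  Row.new(2, 1, Time.utc(2012, 4, 2, 16, 0), false),  # last of April 2
  Row.new(3, 1, Time.utc(2012, 4, 19, 9, 30), false), # younger than a week
  Row.new(4, 1, Time.utc(2012, 4, 19, 9, 45), false)  # last of April 19 so far
]

archive_keepers!(rows, Time.utc(2012, 4, 1))
p prune(rows, Time.utc(2012, 4, 20, 12, 0)).map(&:id) # => [2, 3, 4]
```

Row 1 is removed because it is older than a week and not flagged; row 3 survives only because it is still inside the one-week window.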
One last thing to note: be sure not to chain group() and update_all() together. group() is ignored by update_all().
Regarding ruby-on-rails - Rails: keep only 1 of many records per day (keep the last, delete the rest), a similar question was found on Stack Overflow: https://stackoverflow.com/questions/10377784/