I have two tables.

User journeys

id  timestamp     bus 
1       00:10      12
1       16:10      12
2       14:00      23

Bus prices

id   timestamp    price
12   00:00        1.3
12   00:10        1.5
12   00:20        1.7
12   18:00        2.0
13   00:00        3.0

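My goal is to find out how much each user spent on travel today.

In our case, user 1 takes bus 12 at 00:10 and pays 1.5, then rides again at 16:10, by which time the price has risen to 1.7. This person paid 3.2 in total today. We always use the most recently updated price at the time of the ride.

I got this working with a lot of subqueries, but it looks very inefficient. Does anyone have a neat solution?

To reproduce the sample data:

See http://sqlfiddle.com/#!17/10ad6/2

Or build the schema: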

drop table if exists journeys;
create table journeys(
id numeric,
timestamp timestamp without time zone,
bus numeric
);

truncate table journeys;
insert into journeys
values
(1, '2018-08-22 00:10:00', 12),
(1, '2018-08-22 16:10:00', 12),
(2, '2018-08-22 14:00:00', 23);

-- Bus Prices

drop table if exists bus;
create table bus (
bus_id int,
timestamp timestamp without time zone,
price numeric
);

truncate table bus;
insert into bus
values

(12, '2018-08-22 00:00:00', 1.3),
(12, '2018-08-22 00:10:00', 1.5),
(12, '2018-08-22 00:20:00', 1.7),
(12, '2018-08-22 18:00:00', 2.0),
(13, '2018-08-22 00:00:00', 3.0);

Best answer

I don't know that this is any faster than your solution (which you haven't shown). A correlated subquery seems like a reasonable solution.
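For reference, a correlated subquery for this lookup could look roughly like the sketch below. It is not the asker's query (which was not posted). It only assumes the `journeys` and `bus` tables from the question and picks, for each journey, the newest price row at or before the time of the ride.

```sql
-- Sketch only: one price lookup per journey via a correlated subquery.
-- Assumes the journeys(id, timestamp, bus) and bus(bus_id, timestamp, price)
-- tables defined in the question.
SELECT j.id,
       j.timestamp,
       j.bus,
       (SELECT b.price
        FROM bus b
        WHERE b.bus_id = j.bus              -- same bus line
          AND b.timestamp <= j.timestamp    -- price already in effect at ride time
        ORDER BY b.timestamp DESC           -- newest such price first
        LIMIT 1) AS price                   -- NULL when the bus has no price row
FROM journeys j;
```

If `bus` grows large, an index on `(bus_id, timestamp)` would usually help each lookup, though that is an assumption about the workload and not something measured here.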

But another approach is:

SELECT j.*, b.price
FROM journeys j LEFT JOIN
     (SELECT b.*, LEAD(timestamp) OVER (PARTITION BY bus_id ORDER BY timestamp) as next_timestamp
      FROM bus b
     ) b
     ON b.bus_id = j.bus AND
        j.timestamp >= b.timestamp AND
        (j.timestamp < b.next_timestamp OR b.next_timestamp IS NULL);
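The query above returns one price per journey. To get the per-user daily total that the question asks for, the same join can be aggregated. A minimal sketch, assuming "today" means 2018-08-22 as in the sample data (the two date literals are placeholders to adjust):

```sql
-- Sketch only: total spent per user for one day, built on the LEAD() join.
-- The date range is an assumption taken from the sample data.
SELECT j.id,
       SUM(b.price) AS total_spent
FROM journeys j
LEFT JOIN (
    -- Each price row is valid from its own timestamp until the next one
    -- recorded for the same bus.
    SELECT bus_id,
           timestamp,
           price,
           LEAD(timestamp) OVER (PARTITION BY bus_id ORDER BY timestamp) AS next_timestamp
    FROM bus
) b
  ON b.bus_id = j.bus
 AND j.timestamp >= b.timestamp
 AND (j.timestamp < b.next_timestamp OR b.next_timestamp IS NULL)
WHERE j.timestamp >= DATE '2018-08-22'   -- start of the day (inclusive)
  AND j.timestamp <  DATE '2018-08-23'   -- start of the next day (exclusive)
GROUP BY j.id
ORDER BY j.id;
```

Going by the price table shown in the question, user 1 should come out at 1.5 + 1.7 = 3.2. User 2 rides bus 23, which has no rows in `bus`, so the `LEFT JOIN` keeps the user and `SUM` yields NULL. Wrap it in `COALESCE(..., 0)` if a zero is preferred.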

Source: the Stack Overflow question "SQL: join 2 tables on first matching row condition", https://stackoverflow.com/questions/51975537/
