sql - How do I fill in missing values using analytic functions?

Tags: sql oracle oracle11g null oracle12c

I want to fill in the missing (null) values in my data set. The data looks like this:

+---------------------+------+-------------+
| ORDER_DATE          | SHOP | SALESPERSON |
+---------------------+------+-------------+
| 14/04/2017 04:44:27 | A    | MIKE        |
+---------------------+------+-------------+
| 14/04/2017 04:44:55 | A    |             |
+---------------------+------+-------------+
| 14/04/2017 04:45:07 | A    | TIM         |
+---------------------+------+-------------+
| 14/04/2017 04:45:30 | A    |             |
+---------------------+------+-------------+
| 14/04/2017 04:45:43 | B    |             |
+---------------------+------+-------------+
| 14/04/2017 04:46:13 | B    | JOHN        |
+---------------------+------+-------------+
| 14/04/2017 04:46:28 | B    |             |
+---------------------+------+-------------+
| 14/04/2017 04:58:32 | C    |             |
+---------------------+------+-------------+
| 14/04/2017 04:58:41 | C    | MELINDA     |
+---------------------+------+-------------+

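I would like to fill in the salesperson per shop, using the most recent non-null value that precedes each null within the same shop. I tried the query below, but it does not produce the correct result. How can I fix this?

CREATE TABLE SALES (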
ORDER_DATE DATE, 
SHOP VARCHAR2(30 CHAR), 
SALESPERSON VARCHAR2(30 CHAR)
)
;

REM INSERTING INTO SALES
SET DEFINE OFF;
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:44:27','DD/MM/YYYY HH24:MI:SS'),'A','MIKE');
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:44:55','DD/MM/YYYY HH24:MI:SS'),'A',NULL);
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:45:07','DD/MM/YYYY HH24:MI:SS'),'A','TIM');
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:45:30','DD/MM/YYYY HH24:MI:SS'),'A',NULL);
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:45:43','DD/MM/YYYY HH24:MI:SS'),'B',NULL);
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:46:13','DD/MM/YYYY HH24:MI:SS'),'B','JOHN');
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:46:28','DD/MM/YYYY HH24:MI:SS'),'B',NULL);
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:58:32','DD/MM/YYYY HH24:MI:SS'),'C',NULL);
INSERT INTO SALES (ORDER_DATE,SHOP,SALESPERSON) VALUES (TO_DATE('14/04/2017 04:58:41','DD/MM/YYYY HH24:MI:SS'),'C','MELINDA');
COMMIT;

SELECT * FROM SALES ORDER BY SHOP, ORDER_DATE;

SELECT ORDER_DATE,
       SHOP,
       SALESPERSON,
       /*tried two approaches*/
       /*does not produce a correct result set*/
       LAST_VALUE(SALESPERSON) IGNORE NULLS OVER (PARTITION BY SHOP
                   ORDER BY ORDER_DATE RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LAST_VALUE_1,
       /*this also does not solve this*/            
       LAST_VALUE(SALESPERSON) IGNORE NULLS OVER(PARTITION BY SHOP
                  ORDER BY ORDER_DATE ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS LAST_VALUE_2
FROM SALES ;

The correct result set is:
+---------------------+------+-------------+--------------------+
| ORDER_DATE          | SHOP | SALESPERSON | SALESPERSON_FILLED |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:44:27 | A    | MIKE        |  MIKE              |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:44:55 | A    |             |  MIKE              |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:45:07 | A    | TIM         |  TIM               |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:45:30 | A    |             |  TIM               |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:45:43 | B    |             |                    |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:46:13 | B    | JOHN        |  JOHN              |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:46:28 | B    |             |  JOHN              |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:58:32 | C    |             |                    |
+---------------------+------+-------------+--------------------+
| 14/04/2017 04:58:41 | C    | MELINDA     |  MELINDA           |
+---------------------+------+-------------+--------------------+

Best answer

You are very close.
Try this:

SELECT ORDER_DATE,
       SHOP,
       SALESPERSON,

       LAST_VALUE(SALESPERSON) IGNORE NULLS OVER 
            (PARTITION BY SHOP ORDER BY ORDER_DATE ) AS LAST_VALUE_1

FROM SALES
ORDER BY SHOP, ORDER_DATE;

ORDER_DA SHOP                           SALESPERSON                    LAST_VALUE_1                  
-------- ------------------------------ ------------------------------ ------------------------------
17/04/14 A                              MIKE                           MIKE                          
17/04/14 A                                                             MIKE                          
17/04/14 A                              TIM                            TIM                           
17/04/14 A                                                             TIM                           
17/04/14 B                                                                                           
17/04/14 B                              JOHN                           JOHN                          
17/04/14 B                                                             JOHN                          
17/04/14 C                                                                                           
17/04/14 C                              MELINDA                        MELINDA                       

9 rows selected. 
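
The answer does not say why dropping the windowing clause helps, so the following is a hedged reading rather than the answerer's own explanation. When an analytic function has an ORDER BY but no explicit frame, Oracle applies the default frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE ... IGNORE NULLS sees only the rows up to and including the current one. The first attempt in the question extends the frame to UNBOUNDED FOLLOWING, so every row receives the last non-null salesperson of the whole shop. The second attempt stops at 1 PRECEDING, which excludes the current row, so a row that already has a salesperson shows the previous one instead. The sketch below writes the default frame out and also repairs the second attempt with NVL; the column aliases SALESPERSON_FILLED and SALESPERSON_FILLED_2 are illustrative names, not from the original answer.

```sql
SELECT ORDER_DATE,
       SHOP,
       SALESPERSON,
       -- Same window as the answer, with the default frame spelled out.
       LAST_VALUE(SALESPERSON) IGNORE NULLS OVER (
           PARTITION BY SHOP
           ORDER BY ORDER_DATE
           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS SALESPERSON_FILLED,
       -- The question's second attempt, repaired: keep the row's own value
       -- when it has one, otherwise take the last non-null value before it.
       NVL(SALESPERSON,
           LAST_VALUE(SALESPERSON) IGNORE NULLS OVER (
               PARTITION BY SHOP
               ORDER BY ORDER_DATE
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           )) AS SALESPERSON_FILLED_2
FROM SALES
ORDER BY SHOP, ORDER_DATE;
```

On the sample data both columns should match the expected result set in the question. One caveat, assuming real data may contain duplicate timestamps within a shop: a RANGE frame treats rows with the same ORDER_DATE as peers and includes all of them, whereas a ROWS frame counts physical rows, so the two variants can differ when there are ties.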

Regarding "sql - How do I fill in missing values using analytic functions?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43405697/
