sql - hive/sql: how to check for repeated column values and aggregate them or show a column range

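This question is different from the one I asked before; please read through the logic, which is not the same.

Suppose I have the following table with 6 columns: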

name    orderno productcategory amount  description code
KJFSFKS 1   1   40  D1  x1
KJFSFKS 2   2   50  D2  y1
KJFSFKS 3   2   67  D3  b1
KJFSFKS 4   2   10  D4  a1
KJFSFKS 5   3   2   D5  ws1
KJFSFKS 6   3   5   D6  ks1
KJFSFKS 7   3   6   D7  pw3
KJFSFKS 8   4   8   D8  ju7
KJFSFKS 9   5   8   D9  87y
KJFSFKS 10  5   10  D10 ky9

The productcategory column has the repeated values "2", "3" and "5".

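My logic is: if a productcategory value is repeated, sum the amount column, take the description that belongs to the highest orderno,

and join the code of the lowest orderno and the code of the highest orderno for that productcategory with "-" (basically a range).

Otherwise, if a productcategory value is not repeated (values "1" and "4"), carry all values over unchanged, without any aggregation.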

So the output should look like this:

name    orderno productcategory amount  description code
KJFSFKS 1   1   40  D1  x1
KJFSFKS 2   2   127 D4  y1-a1
KJFSFKS 5   3   13  D7  ws1-pw3
KJFSFKS 8   4   8   D8  ju7
KJFSFKS 9   5   18  D10 87y-ky9
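
The rules above can be sketched in plain Python to make the expected grouping explicit. This is only an illustration, not part of the original question or answer: the function name `collapse` and the tuple layout are hypothetical, and it assumes that rows are grouped by (name, productcategory) regardless of whether the repeated values are adjacent, which is the case in the sample data.

```python
from itertools import groupby

# Sample rows copied from the question:
# (name, orderno, productcategory, amount, description, code)
rows = [
    ("KJFSFKS", 1, 1, 40, "D1", "x1"),
    ("KJFSFKS", 2, 2, 50, "D2", "y1"),
    ("KJFSFKS", 3, 2, 67, "D3", "b1"),
    ("KJFSFKS", 4, 2, 10, "D4", "a1"),
    ("KJFSFKS", 5, 3, 2, "D5", "ws1"),
    ("KJFSFKS", 6, 3, 5, "D6", "ks1"),
    ("KJFSFKS", 7, 3, 6, "D7", "pw3"),
    ("KJFSFKS", 8, 4, 8, "D8", "ju7"),
    ("KJFSFKS", 9, 5, 8, "D9", "87y"),
    ("KJFSFKS", 10, 5, 10, "D10", "ky9"),
]


def collapse(rows):
    """Collapse each (name, productcategory) group into a single row."""
    # Sort by name, productcategory, orderno so that groupby sees each
    # group contiguously and in ascending orderno.
    ordered = sorted(rows, key=lambda r: (r[0], r[2], r[1]))
    result = []
    for (name, category), group in groupby(ordered, key=lambda r: (r[0], r[2])):
        group = list(group)
        low, high = group[0], group[-1]  # lowest and highest orderno
        # A single-row group keeps its code; otherwise build "low-high".
        code = low[5] if len(group) == 1 else f"{low[5]}-{high[5]}"
        total = sum(r[3] for r in group)
        # Keep the lowest orderno and the description of the highest orderno.
        result.append((name, low[1], category, total, high[4], code))
    return result


for row in collapse(rows):
    print(*row, sep="\t")
```

Run against the sample rows, this should yield the same five rows as the expected output listed above.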

Best answer

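    -- Keep one row per (name, productcategory): the row with the lowest orderno.
    select name, orderno, productcategory, amount, description,
           -- a single-row group has codelow = codehigh, so its code stays unchanged
           case when codelow = codehigh then codelow else concat(codelow, '-', codehigh) end as code
    from
    (
        select name, orderno, productcategory,
               -- total amount over the whole group
               sum(amount) over (partition by name, productcategory) as amount,
               -- description and code of the row with the highest orderno
               first_value(description) over (partition by name, productcategory order by orderno desc) as description,
               first_value(code) over (partition by name, productcategory order by orderno desc) as codehigh,
               -- code of the row with the lowest orderno
               first_value(code) over (partition by name, productcategory order by orderno asc) as codelow,
               -- rn = 1 marks the lowest orderno in each group
               row_number() over (partition by name, productcategory order by orderno) as rn
        from your_table
    ) s
    where rn = 1;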

OK
KJFSFKS 1       1       40      D1      x1
KJFSFKS 2       2       127     D4      y1-a1
KJFSFKS 5       3       13      D7      ws1-pw3
KJFSFKS 8       4       8       D8      ju7
KJFSFKS 9       5       18      D10     87y-ky9
Time taken: 12.638 seconds, Fetched: 5 row(s)

This question and answer come from a similar question found on Stack Overflow: https://stackoverflow.com/questions/45332008/
