hadoop - Hive: multiple GROUP BY with a subtraction operation

Tags: hadoop hive

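I am trying to join these two SELECT statements on id = playerID and year = yearID (schemas below), and then subtract one result from the other under the alias HAB-EG. I also want to group by year and id in both SELECT statements, so that the values are summed before the division and subtraction further up the hierarchy. When I try this, Hive tells me to group by G, which seems strange. I should not need to group by G as well as id and year: a player can have multiple entries in the table, so G, E, H and AB need to be summed before the calculation.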

This is what I tried:

SELECT
    a.playerID AS ID,
    a.yearID AS yearID,
    (b.HAB - a.EG) AS `HAB-EG`
FROM 
    (SELECT
        SUM(playerID),
        SUM(yearID),
        (E/G) AS EG
    FROM fielding
    WHERE (
            yearID > 2005
            AND yearID < 2009
            AND G > 20 
            )GROUP BY playerID,yearID
    ) AS a
JOIN
    (SELECT
        SUM(id),
        SUM(year),
        (hits/ab) AS HAB
    FROM batting
    WHERE( 
            year > 2005
            AND year < 2009 
            AND ab > 40 
            ) GROUP BY id,year

    ) AS b ON a.playerID = b.id AND a.yearID = b.year;

Schema (fielding):
CREATE EXTERNAL TABLE IF NOT EXISTS fielding
(playerID STRING, yearID INT, teamID STRING, lgID STRING,
POS STRING, G INT, GS INT, InnOuts INT, PO INT, A INT, E INT,
DP INT, PB INT, WP INT, SB INT, CS INT, ZR INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/home/hduser/hivetest/fielding';

Schema (batting):
CREATE EXTERNAL TABLE IF NOT EXISTS batting
(id STRING, year INT, team STRING, league STRING, games INT, ab INT,
runs INT, hits INT, doubles INT, triples INT, homeruns INT, rbi INT,
sb INT, cs INT, walks INT, strikeouts INT, ibb INT, hbp INT, sh INT,
sf INT, gidp INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/home/hduser/hivetest/batting';

Best answer

Try this:

SELECT
    a.playerID AS ID,
    a.yearID AS yearID,
    (b.HAB - a.EG) AS `HAB-EG`
FROM
    -- Fielding: errors per game, summed per player and year.
    -- playerID and yearID are the GROUP BY keys, so they are selected as-is.
    (SELECT
        playerID,
        yearID,
        (SUM(E)/SUM(G)) AS EG
    FROM fielding
    WHERE (
            yearID > 2005
            AND yearID < 2009
            AND G > 20
            )
    GROUP BY playerID, yearID
    ) AS a
JOIN
    -- Batting: hits per at-bat, summed per player and year.
    (SELECT
        id,
        year,
        (SUM(hits)/SUM(ab)) AS HAB
    FROM batting
    WHERE (
            year > 2005
            AND year < 2009
            AND ab > 40
            )
    GROUP BY id, year
    ) AS b ON a.playerID = b.id AND a.yearID = b.year;
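
Why this works: in a grouped query, every selected column has to be either a GROUP BY key or wrapped in an aggregate. In the question's version, E/G used bare columns that were neither, which is why Hive asked for G in the GROUP BY; SUM(E)/SUM(G) avoids that. Note that the G > 20 and ab > 40 conditions in the WHERE clause are still checked against individual rows before anything is summed. If those thresholds are meant to apply to a player's totals for the year (an assumption; the question does not say), a variant using HAVING might look like the sketch below. It uses the same tables and columns as above and is untested against the asker's data.

```sql
SELECT
    a.playerID AS ID,
    a.yearID AS yearID,
    (b.HAB - a.EG) AS `HAB-EG`
FROM
    (SELECT
        playerID,
        yearID,
        SUM(E) / SUM(G) AS EG
    FROM fielding
    WHERE yearID > 2005 AND yearID < 2009
    GROUP BY playerID, yearID
    -- Assumption: the games threshold applies to the yearly total,
    -- so it moves from WHERE (per row) to HAVING (per group).
    HAVING SUM(G) > 20
    ) AS a
JOIN
    (SELECT
        id,
        year,
        SUM(hits) / SUM(ab) AS HAB
    FROM batting
    WHERE year > 2005 AND year < 2009
    GROUP BY id, year
    -- Same assumption for at-bats.
    HAVING SUM(ab) > 40
    ) AS b
ON a.playerID = b.id AND a.yearID = b.year;
```

Because HAVING keeps only groups whose summed G and ab are positive, the divisions cannot hit a zero denominator in this variant.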

Regarding "hadoop - Hive: multiple GROUP BY with a subtraction operation", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36834544/
