sql - GROUP BY error in Apache Hive

Tags: sql hadoop group-by hive

I have a table named ratings with the following columns:

userid INT
movieid INT
rating FLOAT
timestmp STRING

select movieid, ROUND(AVG(rating),1) as Rating, COUNT(userid) as rtn_crt, ROUND(((Rating*rtn_cnt)+(100*3.5))/(rtn_cnt+100),1) as w_rating
from ratings 
GROUP BY movieid 
LIMIT 50;

Error message:

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10025]: Line 2:6 Expression not in GROUP BY key rtn_cnt



I tried using the CAST function, but it still does not work and I get the same error:
select movieid, CAST(AVG(rating) AS FLOAT) as Rating, COUNT(userid) as rtn_crt,
CAST((Rating*rtn_cnt) AS FLOAT) + CAST((100*$AVG_MEAN) AS FLOAT)
       /CAST((rtn_cnt+100) AS FLOAT) as w_rating
from ratings 
GROUP BY movieid 
LIMIT 50;

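Best answer

Hive does not let you reference a column alias (Rating, rtn_cnt) in the same SELECT list that defines it, so those names are unknown when w_rating is computed. Note also that your query names the count rtn_crt but then refers to it as rtn_cnt. I suggest rewriting the SQL query with a subquery, as follows: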

SELECT t.*, ROUND(((t.Rating*t.rtn_cnt)+(100*3.5))/(t.rtn_cnt+100),1) as w_rating 
FROM (
    SELECT movieid, ROUND(AVG(rating), 1) as Rating, COUNT(userid) as rtn_cnt
    FROM ratings 
    GROUP BY movieid 
    LIMIT 50
) t;
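
If you would rather avoid the subquery, another option is to repeat the aggregate expressions instead of their aliases. This is only a sketch and was not part of the original answer. It assumes the same ratings table and the same constants (100 and 3.5) as the query above:

```sql
-- Sketch: same result as the subquery form, without referencing aliases.
-- Every expression in the SELECT list is either the GROUP BY key (movieid)
-- or built from aggregates, so Hive accepts it.
SELECT movieid,
       ROUND(AVG(rating), 1) AS Rating,
       COUNT(userid)         AS rtn_cnt,
       ROUND(
           ((ROUND(AVG(rating), 1) * COUNT(userid)) + (100 * 3.5))
           / (COUNT(userid) + 100),
           1
       ) AS w_rating
FROM ratings
GROUP BY movieid
LIMIT 50;
```

The inner ROUND(AVG(rating), 1) is kept so the result matches the subquery version. If you want the weighted rating based on the unrounded average, use AVG(rating) there instead. Repeating the aggregates makes the query longer, so the subquery form is usually easier to maintain.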

Regarding sql - GROUP BY error in Apache Hive, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44006815/
