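I'm trying to take the average of a value per category, where the rows are first grouped by subcategory and summed. The parent table's primary key is the child table's grouping attribute. The parent table's grouping attribute is neither a primary key nor a column in the child table.
Simplified: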
select Category, avg(CalculatedSum)
from ParentTable pt
inner join (
    select Subcategory, sum(Quantity * Price) as 'CalculatedSum'
    from ChildTable
    group by Subcategory
) ct
on pt.ID = ct.Subcategory
group by Category
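To see what this query shape computes, here is a minimal sketch that runs it against SQLite through Python's `sqlite3` module. The rows are made up for illustration, SQLite stands in for whatever engine is actually in use, and `SumOfSums` is an extra column added only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ParentTable (ID INTEGER PRIMARY KEY, Category TEXT);
    CREATE TABLE ChildTable (Subcategory INTEGER, Quantity REAL, Price REAL);
    -- Hypothetical sample rows, not taken from the question
    INSERT INTO ParentTable VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO ChildTable VALUES
        (1, 2, 10), (1, 1, 5),   -- subcategory 1 sums to 25
        (2, 3, 25),              -- subcategory 2 sums to 75
        (3, 4, 10);              -- subcategory 3 sums to 40
""")

rows = conn.execute("""
    SELECT pt.Category,
           AVG(ct.CalculatedSum) AS AvgOfSums,
           SUM(ct.CalculatedSum) AS SumOfSums
    FROM ParentTable pt
    INNER JOIN (
        SELECT Subcategory, SUM(Quantity * Price) AS CalculatedSum
        FROM ChildTable
        GROUP BY Subcategory
    ) ct ON pt.ID = ct.Subcategory
    GROUP BY pt.Category
    ORDER BY pt.Category
""").fetchall()

for category, avg_of_sums, sum_of_sums in rows:
    print(category, avg_of_sums, sum_of_sums)
```

On this toy data, category A averages its two subcategory sums (25 and 75) rather than adding them, so the derived table is aggregated once and the outer `AVG` works on one row per subcategory.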
The actual SQL is:
select c.CU_AGE_RANGE,
       count(*) as '# of Customers',
       avg(SumSales) as 'Avg of SumSales',
       max([Max of SumSales]) as 'Max of SumSales',
       min([Min of SumSales]) as 'Min of SumSales'
from Customers c
inner join (
    select CUSTOMER_ID,
           sum(QTY_SOLD * SALES) as SumSales,
           max(QTY_SOLD * SALES) as 'Max of SumSales',
           min(QTY_SOLD * SALES) as 'Min of SumSales'
    from Sales
    where (SALES > 0) and (QTY_SOLD > 0) and (COST > 0)
    group by CUSTOMER_ID
) s
on c.CUSTOMER_ID = s.CUSTOMER_ID
group by c.CU_AGE_RANGE
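I have tried changing the group by clause to various orderings of the category (CU_AGE_RANGE) and the subcategory (CUSTOMER_ID), but I always get the same error.
The error is that the result always shows me the sum of the sums (I believe). I assume this is the error because typical averages in the child table are 250 to 1000, while Avg(Sum()) returns a value that is roughly the number of rows in each category multiplied by the expected Sum().
My reputation is too low to post images, so here is the result table as comma-separated values: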
CU_AGE_RANGE,#_of_Customers,Avg_of_SumSales,Max_of_SumSales,Min_of_SumSales
NULL,125,4261665.306,433460737.7,0.0017
20-29 ,1192,1154040.907,1374037708,0.00025
30-39 ,1902,25429.52329,29426212.64,0.00015
40-49 ,2118,2418.829874,2066725,0.0001
50-59 ,2204,114625.4111,248240261.3,0.00015
60+ ,2135,160156.4341,334617675,0.0005
patrickbig,1,65.5737,12,0.06
Under 19 ,484,1431.262112,92160,0.0001
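One way to sanity-check the posted numbers is to multiply each average by its row count, which recovers the total per age range, and compare that total with the Max column. Note that the Max column in the query is the largest single QTY_SOLD * SALES line item, not the largest SumSales. The sketch below is plain arithmetic on values copied from the table; it suggests, but does not prove, how much of each total comes from one line item.

```python
# (age range, row count, average of SumSales, largest single line item),
# copied from the posted results table
results = [
    ("NULL", 125, 4261665.306, 433460737.7),
    ("20-29", 1192, 1154040.907, 1374037708),
    ("30-39", 1902, 25429.52329, 29426212.64),
    ("40-49", 2118, 2418.829874, 2066725),
    ("50-59", 2204, 114625.4111, 248240261.3),
    ("60+", 2135, 160156.4341, 334617675),
    ("patrickbig", 1, 65.5737, 12),
    ("Under 19", 484, 1431.262112, 92160),
]

share = {}
for age_range, n, avg, largest in results:
    total = n * avg                      # AVG * COUNT recovers the group total
    share[age_range] = largest / total   # fraction owed to one line item
    print(f"{age_range:>10}  total={total:>18,.2f}  largest/total={share[age_range]:.3f}")
```

If the ratio is close to 1 for a group, a single line item accounts for nearly all of that group's total, and an average of per-customer sums would look inflated even when the query is averaging correctly.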
I'm trying to figure out why AVG(SUM()) returns what appears to be SUM(SUM()). My current hunch is that, because SUM() is a calculated value, it gets recalculated according to the grouping in the parent table. So this would be:
Expected:
x * y for each row in Child Table
sum(x * y) for each Subcategory
avg(sum(x * y)) for each Category of Subcategory
QTY_SOLD * SALES for each row in Sales
sum(QTY_SOLD * SALES) for each CUSTOMER_ID
avg(sum(QTY_SOLD * SALES)) for each CU_AGE_RANGE group of CUSTOMER_IDs
Actual:
x * y for each row in Child Table
sum(x * y) for each Subcategory
avg(sum(x * y)) for each Category
avg(sum(QTY_SOLD * SALES)) for each CU_AGE_RANGE
Which equals:
sum(QTY_SOLD * SALES) for each CU_AGE_RANGE
How do I get from the current result (sum per category) to the desired one (average, per category, of the subcategory sums)?
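One way to test the hypothesis above is to select SUM(SumSales) and SUM(SumSales) / COUNT(*) next to AVG(SumSales) and compare the three. The sketch below does this on SQLite via Python's `sqlite3`; the sample rows and the column aliases `n_rows`, `avg_of_sums`, `sum_of_sums` and `manual_avg` are assumptions made for illustration, and only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CUSTOMER_ID INTEGER PRIMARY KEY, CU_AGE_RANGE TEXT);
    CREATE TABLE Sales (CUSTOMER_ID INTEGER, QTY_SOLD REAL, SALES REAL, COST REAL);
    -- Hypothetical sample rows
    INSERT INTO Customers VALUES (1, '20-29'), (2, '20-29'), (3, '30-39');
    INSERT INTO Sales VALUES
        (1, 2, 100, 50), (1, 1, 300, 90),  -- customer 1: 200 + 300 = 500
        (2, 5, 100, 40),                   -- customer 2: 500
        (2, 1, 0, 10),                     -- dropped by the SALES > 0 filter
        (3, 1, 250, 80);                   -- customer 3: 250
""")

rows = conn.execute("""
    SELECT c.CU_AGE_RANGE,
           COUNT(*)                   AS n_rows,
           AVG(s.SumSales)            AS avg_of_sums,
           SUM(s.SumSales)            AS sum_of_sums,
           SUM(s.SumSales) / COUNT(*) AS manual_avg
    FROM Customers c
    INNER JOIN (
        SELECT CUSTOMER_ID, SUM(QTY_SOLD * SALES) AS SumSales
        FROM Sales
        WHERE SALES > 0 AND QTY_SOLD > 0 AND COST > 0
        GROUP BY CUSTOMER_ID
    ) s ON c.CUSTOMER_ID = s.CUSTOMER_ID
    GROUP BY c.CU_AGE_RANGE
    ORDER BY c.CU_AGE_RANGE
""").fetchall()

for row in rows:
    print(row)
```

If `avg_of_sums` matches `manual_avg` and differs from `sum_of_sums` for any group with more than one row, the average is being taken over the per-customer sums, and the surprising values would have to come from the data or the row count rather than from AVG itself.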
Best answer
Your customer count is wrong. You are counting sales, not customers. Changing it to count( DISTINCT c.CUSTOMER_ID )
should fix the problem.
select c.CU_AGE_RANGE,
       count( DISTINCT c.CUSTOMER_ID ) as '# of Customers',
       avg(SumSales) as 'Avg of SumSales',
       max([Max of SumSales]) as 'Max of SumSales',
       min([Min of SumSales]) as 'Min of SumSales'
from Customers c
inner join (
    select CUSTOMER_ID,
           sum(QTY_SOLD * SALES) as SumSales,
           max(QTY_SOLD * SALES) as 'Max of SumSales',
           min(QTY_SOLD * SALES) as 'Min of SumSales'
    from Sales
    where (SALES > 0) and (QTY_SOLD > 0) and (COST > 0)
    group by CUSTOMER_ID
) s
on c.CUSTOMER_ID = s.CUSTOMER_ID
group by c.CU_AGE_RANGE
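A small sketch of when count(*) and count(DISTINCT c.CUSTOMER_ID) can disagree in this query shape. It assumes, purely for illustration, that Customers may hold more than one row for the same CUSTOMER_ID; the question does not say whether that is the case. The sample rows and the aliases `row_count` and `customer_count` are hypothetical, and SQLite is used here only as a convenient engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No primary key here, so CUSTOMER_ID may repeat (an assumption)
    CREATE TABLE Customers (CUSTOMER_ID INTEGER, CU_AGE_RANGE TEXT);
    CREATE TABLE Sales (CUSTOMER_ID INTEGER, QTY_SOLD REAL, SALES REAL);
    INSERT INTO Customers VALUES (1, '20-29'), (1, '20-29'), (2, '20-29');
    INSERT INTO Sales VALUES (1, 1, 100), (2, 2, 50), (2, 1, 10);
""")

rows = conn.execute("""
    SELECT c.CU_AGE_RANGE,
           COUNT(*)                      AS row_count,
           COUNT(DISTINCT c.CUSTOMER_ID) AS customer_count
    FROM Customers c
    INNER JOIN (
        SELECT CUSTOMER_ID, SUM(QTY_SOLD * SALES) AS SumSales
        FROM Sales
        GROUP BY CUSTOMER_ID
    ) s ON c.CUSTOMER_ID = s.CUSTOMER_ID
    GROUP BY c.CU_AGE_RANGE
""").fetchall()

print(rows)
```

Here the joined result has three rows for two distinct customers, so the two counts differ. When CUSTOMER_ID is unique in Customers, the derived table already has one row per customer and both counts should agree.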
Regarding mysql - SQL average of a sum of calculated attributes, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29000035/