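I'm learning to use basic SQL subqueries in SAS. For the current problem, I use a subquery against the same table, an employee information table. Essentially, I want to write a query that lists the average salaries for current employees in two age groups (Under 50 and Over 50) who earn more than the average salary for their respective age group. There is a CASE WHEN branch, currently commented out, that seems to be the problem. If I use the code for only one age group it works fine, but when I try to accommodate both age groups, the subquery returns the overall average of the column rather than the conditional average based on the employee's age. Can anyone tell me where I went wrong? Would using a subquery in the FROM clause be a better approach?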
proc sql;
  SELECT
    avg(CASE WHEN ((intck('DAY',Birth_Date,'01JAN15'd) / 365.25) < 50) THEN Salary END)
      AS Under50_Age 'Under 50' format = dollar10.2,
    avg(CASE WHEN ((intck('DAY',Birth_Date,'01JAN15'd) / 365.25) >= 50) THEN Salary END)
      AS Over50_Age 'Over 50' format = dollar10.2
  FROM orion.employee_information
  WHERE Salary > (
      SELECT
        avg(CASE
              WHEN ((intck('DAY',Birth_Date,'01JAN15'd) / 365.25) < 50) THEN Salary
              /*WHEN ((intck('DAY',Birth_Date,'01JAN15'd) / 365.25) >= 50) THEN Salary*/
            END)
      FROM orion.employee_information
      WHERE
        Employee_Term_Date IS missing)
    AND Employee_Term_Date IS missing;
quit;
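Best answer
You are pivoting the data in your query. A natural SQL grouped aggregation produces one row per BY group, whereas your SELECT tries to create different columns from different BY groups.
A natural query result would not need a subquery. Use a CASE expression to create a categorical value to group by, then compute the average salary over each group.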
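As a minimal sketch of that grouped form, assuming the question's table orion.employee_information and its columns Birth_Date, Salary and Employee_Term_Date (the names age_group, avg_salary and emp_count are illustrative, not from the question), the per-group averages for current employees could be computed like this:

```sas
proc sql;
  select
    /* Classify each employee by age as of 01JAN2015 */
    case
      when intck('DAY', Birth_Date, '01JAN15'd) / 365.25 < 50 then 'under 50'
      else 'over 50'
    end as age_group
    , avg(Salary) as avg_salary format=dollar10.2
    , count(*) as emp_count
  from orion.employee_information
  /* Current employees only: no termination date recorded */
  where Employee_Term_Date is missing
  /* One output row per age group */
  group by age_group
  ;
quit;
```

This returns one row per age group rather than one column per group, and it does not yet apply the "earns more than the group average" filter.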
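The stated requirement is "average salaries for current employees in two age groups (Under 50 and Over 50) who earn more than the average salary".
A typical approach is to join two nested queries on the computed category (age_group).
Example:
The nested queries are similar; the first is aliased as group_computation and the second as all. Both nested queries compute age_group and restrict themselves to active employees.
The streaminit routine and the rand function are used in the hope that the generated data is the same regardless of the SAS installation.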
data have;
  call streaminit(123);
  do empid = 1 to 1000;
    birth_date = intnx('YEAR', '01JAN15'd, -18 - rand('unif', 51)) - rand('unif',365);
    if ranuni(123) < 0.15 then
      term_date = intnx("year", birth_date, 18 + rand('unif', 30));
    else
      term_date = .;
    salary = round(1000 * (30 + rand('unif', 75)),25);
    output;
  end;
  format birth_date term_date yymmdd10.;
run;
proc sql;
  select
    group_computation.age_group
    , mean(all.salary) as avg_of_above_average_salaries format=dollar10.
    , count(*) as cnt_of_above_average_salaries
    , group_computation.avg_salary
    , group_computation.emp_count as avg_salary_count
  from
    (
      select
        case
          when round(intck('DAY',Birth_Date,'01JAN15'd) / 365.25, 0.0001) < 50 then 'under 50'
          else 'over 50'
        end
        as age_group
        , mean (salary) as avg_salary format = dollar10.
        , count(*) as emp_count
      from
        have
      where
        term_date is missing
      group by
        age_group
    ) as group_computation
  join
    (
      select
        case
          when round(intck('DAY',Birth_Date,'01JAN15'd) / 365.25, 0.0001) < 50 then 'under 50'
          else 'over 50'
        end
        as age_group
        , salary
      from
        have
      where
        term_date is missing
    ) as all
  on
    group_computation.age_group = all.age_group
  where
    all.salary > group_computation.avg_salary
  group by
    group_computation.age_group, group_computation.avg_salary, group_computation.emp_count
  ;
quit;
This produces the result set:
age_group  avg_of_above_average_salaries  cnt_of_above_average_salaries  avg_salary  avg_salary_count
-----------------------------------------------------------------------------------------------------
over 50                          $88,201                            164     $68,984               328
under 50                         $85,851                            261     $66,519               546
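If the two-column layout from the question (Under50_Age, Over50_Age) is still wanted, the same join can be pivoted with conditional aggregates. The following is only a sketch under assumptions: it reuses the have data set and the age_group logic from the example, and the aliases e and g are hypothetical names introduced here for illustration.

```sas
proc sql;
  select
    /* Each CASE yields salary for one group and missing otherwise; */
    /* avg() ignores missing values, so each column covers one group */
    avg(case when e.age_group = 'under 50' then e.salary end)
      as Under50_Age label='Under 50' format=dollar10.2
    , avg(case when e.age_group = 'over 50' then e.salary end)
      as Over50_Age label='Over 50' format=dollar10.2
  from
    /* e: one row per active employee with its age group */
    (
      select
        case
          when round(intck('DAY', birth_date, '01JAN15'd) / 365.25, 0.0001) < 50 then 'under 50'
          else 'over 50'
        end as age_group
        , salary
      from have
      where term_date is missing
    ) as e
  join
    /* g: one row per age group with that group's average salary */
    (
      select
        case
          when round(intck('DAY', birth_date, '01JAN15'd) / 365.25, 0.0001) < 50 then 'under 50'
          else 'over 50'
        end as age_group
        , mean(salary) as avg_salary
      from have
      where term_date is missing
      group by age_group
    ) as g
  on e.age_group = g.age_group
  /* Keep only employees earning more than their own group's average */
  where e.salary > g.avg_salary
  ;
quit;
```

Because there is no outer GROUP BY, the query returns a single row. Its two columns should correspond to the per-group avg_of_above_average_salaries values from the joined query, since both use the same filters.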
Regarding sql - Subquery with conditional statement in SAS SQL, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58515319/