Performance of SASHELP views vs. SQL dictionary tables

Tags: performance, sas

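Why does SAS take longer to create a data set in a DATA step from a view such as sashelp.vcolumn than it does in PROC SQL from the equivalent table, dictionary.columns?

I ran a test with the fullstimer option, and it seems to confirm my suspicion about the performance difference.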

option fullstimer;

data test1;
    set sashelp.vcolumn;
    where libname = 'SASHELP' and
        memname = 'CLASS' and
        memtype = 'DATA';
run;

proc sql;
    create table test2 as
    select *
    from dictionary.columns
    where libname = 'SASHELP' and
        memname = 'CLASS' and
        memtype = 'DATA';
quit;

Log extract:
NOTE: There were 5 observations read from the data set SASHELP.VCOLUMN.
      WHERE (libname='SASHELP') and (memname='CLASS') and (memtype='DATA');
NOTE: The data set WORK.TEST1 has 5 observations and 18 variables.
NOTE: DATA statement used (Total process time):
      real time           0.67 seconds
      user cpu time       0.23 seconds
      system cpu time     0.23 seconds
      memory              3820.75k
      OS Memory           24300.00k
      Timestamp           04/13/2015 09:42:21 AM
      Step Count                        5  Switch Count  0


NOTE: Table WORK.TEST2 created, with 5 rows and 18 columns.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.03 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              3267.46k
      OS Memory           24300.00k
      Timestamp           04/13/2015 09:42:21 AM
      Step Count                        6  Switch Count  0

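The SASHELP version uses slightly more memory, but the difference is small. Note the time, though: the SASHELP version takes about 22 times as long as the SQL dictionary version (0.67 vs. 0.03 seconds of real time). Surely that is not just due to the relatively small difference in memory usage.

Following @Salva's suggestion, I resubmitted the code in a new SAS session, this time running the SQL step before the DATA step. The differences in memory and time are even more pronounced: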
                | sql       | sashelp
----------------+-----------+-----------
real time       | 0.28 sec  | 1.84 sec
user cpu time   | 0.00 sec  | 0.25 sec
system cpu time | 0.00 sec  | 0.24 sec
memory          | 3164.78k  | 4139.53k
OS Memory       | 10456.00k | 13292.00k
Step Count      | 1         | 2
Switch Count    | 0         | 0

Best answer

Some, if not all, of this is the difference in overhead between SQL and the DATA step. For example:

proc sql;
    create table test2 as
    select *
    from sashelp.vcolumn
    where libname = 'SASHELP' and
        memname = 'CLASS' and
        memtype = 'DATA';
quit;

is also very quick.

The SAS documentation page on DICTIONARY tables gives some information that is probably the main explanation:

When querying a DICTIONARY table, SAS launches a discovery process that gathers information that is pertinent to that table. Depending on the DICTIONARY table that is being queried, this discovery process can search libraries, open tables, and execute views. Unlike other SAS procedures and the DATA step, PROC SQL can mitigate this process by optimizing the query before the discovery process is launched. Therefore, although it is possible to access DICTIONARY table information with SAS procedures or the DATA step by using the SASHELP views, it is often more efficient to use PROC SQL instead.
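One way to apply this, sketched under assumptions: when the metadata is needed in a DATA step anyway, let PROC SQL do the filtering against dictionary.columns first, then run the DATA step on the small result. The data set names work.class_cols and work.class_cols_flagged and the variable is_char are hypothetical, chosen only for illustration; the timings on any given system may differ from those in the question.

```sas
/* Step 1: filter the dictionary table in PROC SQL, so the WHERE   */
/* clause can be applied before the discovery process runs.        */
proc sql;
    create table work.class_cols as   /* hypothetical data set name */
    select name, type, length, varnum
    from dictionary.columns
    where libname = 'SASHELP' and      /* libname and memname are stored in uppercase */
        memname = 'CLASS' and
        memtype = 'DATA'
    order by varnum;
quit;

/* Step 2: any DATA step logic now reads only the few rows kept above. */
data work.class_cols_flagged;          /* hypothetical data set name */
    set work.class_cols;
    is_char = (type = 'char');         /* type holds 'char' or 'num' */
run;
```

Running both steps with option fullstimer in effect, as in the question, would show whether this pattern helps on a particular installation.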

Source: a similar question on Stack Overflow about the performance of SASHELP views vs. SQL dictionary tables: https://stackoverflow.com/questions/29610861/
