ssis - How do I clean up SSISDB?

Tags: ssis sql-server-2012

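When I set this up, I overlooked the retention period. My database has grown very large, so I want to reduce its size. If I simply change the retention period (it is currently 365), SSIS has problems running my packages. I even changed it in small increments, but the delete statements take locks that block new jobs from running.

Any ideas on how to work around this? I have considered creating a new SSISDB.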

Best answer

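Phil Brammer ran into this, along with a host of other things related to the care and feeding of the SSIS catalog, which he covers in his post Catalog Indexing Recommendations.
Root problem
The root problem is that MS tried to design SSIS with referential integrity (RI) in mind, but they were lazy and let cascading deletes happen instead of handling the deletes explicitly.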

Out of the box, the new SSIS 2012 catalog database (SSISDB) has some basic indexing applied, with referential integrity set to do cascade deletes between most tables.


Enter the SQL Agent job, “SSIS Server Maintenance Job.” This job by default is set to run at midnight daily, and uses two catalog parameters to function: “Clean Logs Periodically” and “Retention Period (days).” When these are set, the maintenance job purges any data outside of the noted retention period.


This maintenance job deletes, 10 records at a time in a loop, from internal.operations and then cascades into many tables downstream. In our case, we have around 3000 operations records to delete daily (10 at a time!) that translates into 1.6 million rows from internal.operation_messages. That’s just one downstream table! This entire process completely, utterly locks up the SSISDB database from any SELECT/INSERT data
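The two catalog parameters named in the quote can be inspected and changed from T-SQL. The following is a minimal sketch, assuming the property names `OPERATION_CLEANUP_ENABLED` and `RETENTION_WINDOW` as exposed by `catalog.catalog_properties`; the value 360 is purely illustrative and not a recommendation.

```sql
USE SSISDB;

-- Show whether periodic cleanup is enabled and the retention window in days.
SELECT
    CP.property_name
,   CP.property_value
FROM
    catalog.catalog_properties AS CP
WHERE
    CP.property_name IN (N'OPERATION_CLEANUP_ENABLED', N'RETENTION_WINDOW');

-- Lower the retention window by a small step.
-- 360 is an illustrative value only; pick a step your maintenance window can absorb.
EXEC catalog.configure_catalog
    @property_name = N'RETENTION_WINDOW'
,   @property_value = 360;
```

Changing the window does not delete anything by itself; the rows are purged the next time the maintenance job runs, which is when the locking described above occurs.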


Resolution
Until MS changes how this works, the supported option is

move the maintenance job schedule to a more appropriate time for your environment


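At my current client, we only load data in the early hours of the morning, so SSISDB is quiet during business hours.
If running the maintenance job during a quiet period is not an option, then you are looking at crafting your own delete statements to reduce the impact of the cascading deletes.
At my current client, we have run about 200 packages nightly for the past 10 months and also keep 365 days of history. Our biggest tables, by an order of magnitude, are: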
Schema    Table                   RowCount
internal  event_message_context   1,869,028
internal  operation_messages      1,500,811
internal  event_messages          1,500,803
The driver of all that data, internal.operations, has only 3300 rows in it, which aligns with Phil's comment about how exponentially this data grows.
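Row counts like the ones in the table above can be pulled from the system catalog views rather than by counting each table. This is a rough sketch, not the query used for the figures above; `sys.partitions` row counts are approximate, and the column aliases are arbitrary.

```sql
USE SSISDB;

-- Approximate row count per table, largest first.
-- index_id 0 is a heap and 1 is the clustered index, so each table is counted once.
SELECT
    S.name AS SchemaName
,   T.name AS TableName
,   SUM(P.rows) AS TotalRows
FROM
    sys.tables AS T
    INNER JOIN
        sys.schemas AS S
        ON S.schema_id = T.schema_id
    INNER JOIN
        sys.partitions AS P
        ON P.object_id = T.object_id
WHERE
    P.index_id IN (0, 1)
GROUP BY
    S.name
,   T.name
ORDER BY
    TotalRows DESC;
```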
So, identify the operation_id values to be purged and delete from the leaf tables, working back to the core internal.operations table.
USE SSISDB;
SET NOCOUNT ON;
IF object_id('tempdb..#DELETE_CANDIDATES') IS NOT NULL
BEGIN
    DROP TABLE #DELETE_CANDIDATES;
END;

CREATE TABLE #DELETE_CANDIDATES
(
    operation_id bigint NOT NULL PRIMARY KEY
);

DECLARE @DaysRetention int = 100;
INSERT INTO
    #DELETE_CANDIDATES
(
    operation_id
)
SELECT
    IO.operation_id
FROM
    internal.operations AS IO
WHERE
    IO.start_time < DATEADD(day, -@DaysRetention, CURRENT_TIMESTAMP);

DELETE T
FROM
    internal.event_message_context AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

DELETE T
FROM
    internal.event_messages AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

DELETE T
FROM
    internal.operation_messages AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

-- etc
-- Finally, remove the entry from operations

DELETE T
FROM
    internal.operations AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;
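If a single delete against one of the large tables still holds locks for too long, it can be run in batches. The sketch below assumes the #DELETE_CANDIDATES temp table from the script above is already populated in the same session; the batch size of 5000 is a hypothetical starting point, not a tested value.

```sql
USE SSISDB;

-- Assumes #DELETE_CANDIDATES already exists and is populated (see the script above).
DECLARE @BatchSize int = 5000;  -- hypothetical value; tune for your environment
DECLARE @RowsDeleted int = 1;

-- Each DELETE commits on its own (autocommit), so locks are released between batches.
WHILE @RowsDeleted > 0
BEGIN
    DELETE TOP (@BatchSize) T
    FROM
        internal.event_message_context AS T
        INNER JOIN
            #DELETE_CANDIDATES AS DC
            ON DC.operation_id = T.operation_id;

    -- Capture the count immediately; the loop ends when nothing is left to delete.
    SET @RowsDeleted = @@ROWCOUNT;
END;
```

The same loop can be repeated for each leaf table before the final delete from internal.operations.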
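The usual caveats apply
  • Don't trust code from random people on the internet
  • Use the diagrams from ssistalk and/or the system tables to identify all the dependencies
  • You might only need to segment your deletes into smaller operations
  • You might benefit from dropping RI for the delete operations, but be certain to re-enable the constraints with the check option so that they are trusted
  • Consult your DBA if the operations last longer than 4 hours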
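For the dependency and RI caveats, the system tables can list the foreign keys involved. This is a sketch that assumes you only care about keys referencing tables in the internal schema; it reports the cascade behaviour and whether each constraint is currently disabled or untrusted.

```sql
USE SSISDB;

-- Foreign keys that reference tables in the internal schema.
-- delete_referential_action_desc shows which ones cascade;
-- is_not_trusted = 1 means the constraint was re-enabled without WITH CHECK.
SELECT
    OBJECT_SCHEMA_NAME(FK.parent_object_id) + N'.' + OBJECT_NAME(FK.parent_object_id) AS ChildTable
,   OBJECT_SCHEMA_NAME(FK.referenced_object_id) + N'.' + OBJECT_NAME(FK.referenced_object_id) AS ParentTable
,   FK.name AS ForeignKeyName
,   FK.delete_referential_action_desc
,   FK.is_disabled
,   FK.is_not_trusted
FROM
    sys.foreign_keys AS FK
WHERE
    OBJECT_SCHEMA_NAME(FK.referenced_object_id) = N'internal'
ORDER BY
    ParentTable
,   ChildTable;
```

Running it before and after any manual purge gives a quick check that no constraint was left disabled or untrusted.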

    July 2020 edit
    Tim Mitchell has a number of articles on the topic, including SSIS Catalog Automatic Cleanup and A better way to Clean up the SSIS Catalog Database, as well as his new book The SSIS Catalog: Install, Manage, Secure and Monitor Your Enterprise ETL Infrastructure.
    @Yong Jun Kim points out in the comments:

    There is a chance SSIS DB might have different table names with scaleout at the end now. Instead of internal.event_message_context it can be internal.event_message_context_scaleout. Instead of internal.operations_messages, it can be internal.operations_messages_scaleout. Just modify the table names in the code accordingly, and it should run fine


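    This is definitely the case if you are using the SSIS IR in Azure Data Factory. You will find that the "normal" tables still exist but are empty, with *_scaleout versions holding all the data.
    References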
  • Catalog Indexing Recommendations
  • Beware the SSIS Server Maintenance Job
  • Slow performance when you run the SSIS Server Maintenance Job to remove old data in SQL Server 2012
  • Original question on Stack Overflow, ssis - How do I clean up SSISDB?: https://stackoverflow.com/questions/21781351/
