SQLite query runs 10 times slower than MS Access query

Tags: sqlite query-optimization

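I have an 800MB MS Access database that I migrated to SQLite. The structure of the database is as follows (the SQLite database is around 330MB after migration):

Occurrence has 1,600,000 records. The table looks like this: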

CREATE TABLE Occurrence 
(
SimulationID  INTEGER,    SimRunID   INTEGER,    OccurrenceID   INTEGER,
OccurrenceTypeID    INTEGER,    Period    INTEGER,    HasSucceeded    BOOL, 
PRIMARY KEY (SimulationID,  SimRunID,   OccurrenceID)
)

It has the following indexes:

CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" ("HasSucceeded" ASC)

CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" ("OccurrenceID" ASC)

CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" ASC)

CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" ("SimulationID" ASC)

OccurrenceParticipant has 3,400,000 records. The table looks like this:

CREATE TABLE OccurrenceParticipant 
(
SimulationID    INTEGER,     SimRunID    INTEGER,    OccurrenceID     INTEGER,
RoleTypeID     INTEGER,     ParticipantID    INTEGER
)

It has the following indexes:

CREATE INDEX "OccurrenceParticipant_OccurrenceID_idx" ON "OccurrenceParticipant" ("OccurrenceID" ASC)

CREATE INDEX "OccurrenceParticipant_ParticipantID_idx" ON "OccurrenceParticipant" ("ParticipantID" ASC)

CREATE INDEX "OccurrenceParticipant_RoleType_idx" ON "OccurrenceParticipant" ("RoleTypeID" ASC)

CREATE INDEX "OccurrenceParticipant_SimRunID_idx" ON "OccurrenceParticipant" ("SimRunID" ASC)

CREATE INDEX "OccurrenceParticipant_SimulationID_idx" ON "OccurrenceParticipant" ("SimulationID" ASC)

InitialParticipant has 130 records. The structure of the table is:

CREATE TABLE InitialParticipant 
(
ParticipantID    INTEGER  PRIMARY KEY,     ParticipantTypeID    INTEGER,
ParticipantGroupID     INTEGER
)

The table has the following indexes:

CREATE INDEX "initialpart_participantTypeID_idx" ON "InitialParticipant" ("ParticipantGroupID" ASC)

CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" ("ParticipantID" ASC)

ParticipantGroup has 22 records. It looks like this:

CREATE TABLE ParticipantGroup   (
ParticipantGroupID    INTEGER,    ParticipantGroupTypeID     INTEGER,
Description    varchar (50),      PRIMARY KEY(  ParticipantGroupID  )
)

The table has the following index:

CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON "ParticipantGroup" ("ParticipantGroupID" ASC)

tmpSimArgs has 18 records. It has the following structure:

CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10))

And the following indexes:

CREATE INDEX tmpSimArgs_SimRunID_idx ON tmpSimArgs(SimRunID ASC)

CREATE INDEX tmpSimArgs_SimulationID_idx ON tmpSimArgs(SimulationID ASC)

The table tmpPartArgs has 80 records. It has the following structure:

CREATE TABLE tmpPartArgs(participantID INT)

And the following index:

CREATE INDEX tmpPartArgs_participantID_idx ON tmpPartArgs(participantID ASC)

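I have a query involving multiple INNER JOINs. The problem is that the Access version of the query takes about a second, whereas the SQLite version of the same query takes 10 seconds (about 10 times slower!). Migrating back to Access is impossible for me, so SQLite is my only option.

I am new to writing database queries, so these queries may look naive. Please point out anything you see as faulty or poorly written.

The query in Access is (the entire query takes 1 second to execute):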

SELECT ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period, Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID FROM 
( 
   ParticipantGroup INNER JOIN InitialParticipant ON ParticipantGroup.ParticipantGroupID =  InitialParticipant.ParticipantGroupID
) INNER JOIN 
(
tmpPartArgs INNER JOIN 
  (
     (
        tmpSimArgs INNER JOIN Occurrence ON (tmpSimArgs.SimRunID = Occurrence.SimRunID)   AND (tmpSimArgs.SimulationID = Occurrence.SimulationID)
     ) INNER JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID =    OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = OccurrenceParticipant.SimulationID)
  ) ON tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID
) ON InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID WHERE (((OccurrenceParticipant.RoleTypeID)=52 Or (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period;

The SQLite query is as follows (this query takes around 10 seconds):

SELECT ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period, Count(ij2.occpParticipantID) AS CountOfParticipantID FROM 
(
   SELECT ip.ParticipantGroupID AS ipParticipantGroupID, ip.ParticipantID AS ipParticipantID, ip.ParticipantTypeID, pg.ParticipantGroupID AS pgParticipantGroupID, pg.ParticipantGroupTypeID, pg.Description FROM ParticipantGroup as pg INNER JOIN InitialParticipant AS ip ON pg.ParticipantGroupID = ip.ParticipantGroupID
) AS ij1 INNER JOIN 
(
   SELECT tpa.participantID AS tpaParticipantID, ij3.* FROM tmpPartArgs AS tpa INNER JOIN 
     (
       SELECT ij4.*, occp.SimulationID as occpSimulationID, occp.SimRunID AS occpSimRunID, occp.OccurrenceID AS occpOccurrenceID, occp.ParticipantID AS occpParticipantID, occp.RoleTypeID FROM 
          (
              SELECT tsa.SimulationID AS tsaSimulationID, tsa.SimRunID AS tsaSimRunID, occ.SimulationID AS occSimulationID, occ.SimRunID AS occSimRunID, occ.OccurrenceID AS occOccurrenceID, occ.OccurrenceTypeID, occ.Period, occ.HasSucceeded FROM tmpSimArgs AS tsa INNER JOIN Occurrence AS occ ON (tsa.SimRunID = occ.SimRunID) AND (tsa.SimulationID = occ.SimulationID)
          ) AS ij4 INNER JOIN OccurrenceParticipant AS occp ON (occOccurrenceID =      occpOccurrenceID) AND (occSimRunID = occpSimRunID) AND (occSimulationID = occpSimulationID)
    ) AS ij3 ON tpa.participantID = ij3.occpParticipantID
) AS ij2 ON ij1.ipParticipantID = ij2.occpParticipantID WHERE (((ij2.RoleTypeID)=52 Or (ij2.RoleTypeID)=49)) AND ij2.HasSucceeded = 1 GROUP BY ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period;   

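I have no idea what I am doing wrong here. I have indexes on every table, but I suspect I have not declared some key index that would do the trick. The funny thing is that before migrating, my "research" on SQLite suggested that it is faster, smaller, and better than Access in all respects. Yet when it comes to queries, I cannot get SQLite to run faster than Access. To reiterate, I am new to SQLite and clearly lack experience with it, so if any knowledgeable person could help me with this, I would much appreciate it.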
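One way to test the suspicion about a missing index is to ask SQLite for its query plan. SQLite normally uses at most one index per table in each join loop, so several single-column indexes cannot be combined to satisfy a three-column join condition such as the one between Occurrence and OccurrenceParticipant. The sketch below, written in Python with the standard `sqlite3` module, builds an empty copy of OccurrenceParticipant, adds a composite index, and prints the plan for a three-column lookup. The index name `occp_sim_run_occ_idx` and the helper `explain` are made up for illustration, and whether this index actually closes the gap to Access is an assumption that would need to be measured on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OccurrenceParticipant (
    SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
    RoleTypeID INTEGER, ParticipantID INTEGER
);
-- Two of the single-column indexes from the question, for comparison.
CREATE INDEX OccurrenceParticipant_OccurrenceID_idx
    ON OccurrenceParticipant (OccurrenceID);
CREATE INDEX OccurrenceParticipant_SimRunID_idx
    ON OccurrenceParticipant (SimRunID);
-- Hypothetical composite index covering all three join columns
-- (the name is invented for this sketch).
CREATE INDEX occp_sim_run_occ_idx
    ON OccurrenceParticipant (SimulationID, SimRunID, OccurrenceID);
""")


def explain(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# The same three equality tests the join applies to each Occurrence row.
lookup = """
SELECT ParticipantID, RoleTypeID
FROM OccurrenceParticipant
WHERE SimulationID = ? AND SimRunID = ? AND OccurrenceID = ?
"""
plan = explain(conn, lookup, (1, 1, 1))
for line in plan:
    print(line)
```

The same `explain` helper can be pointed at the full query against the real database file to see which tables are scanned and which are searched through an index.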

Best answer

I have reformatted your code (using my home-brewed SQL formatter) to hopefully make it easier for others to read.

Reformatted query:

SELECT
    ij1.Description,
    ij2.occSimulationID,
    ij2.occSimRunID,
    ij2.Period,
    Count(ij2.occpParticipantID) AS CountOfParticipantID

FROM (

    SELECT
        ip.ParticipantGroupID AS ipParticipantGroupID,
        ip.ParticipantID AS ipParticipantID,
        ip.ParticipantTypeID,
        pg.ParticipantGroupID AS pgParticipantGroupID,
        pg.ParticipantGroupTypeID,
        pg.Description

    FROM ParticipantGroup AS pg

    INNER JOIN InitialParticipant AS ip
            ON pg.ParticipantGroupID = ip.ParticipantGroupID

) AS ij1

INNER JOIN (

    SELECT
        tpa.participantID AS tpaParticipantID,
        ij3.*

    FROM tmpPartArgs AS tpa

    INNER JOIN (

        SELECT
            ij4.*,
            occp.SimulationID AS occpSimulationID,
            occp.SimRunID AS occpSimRunID,
            occp.OccurrenceID AS occpOccurrenceID,
            occp.ParticipantID AS occpParticipantID,
            occp.RoleTypeID

        FROM (

            SELECT
                tsa.SimulationID AS tsaSimulationID,
                tsa.SimRunID AS tsaSimRunID,
                occ.SimulationID AS occSimulationID,
                occ.SimRunID AS occSimRunID,
                occ.OccurrenceID AS occOccurrenceID,
                occ.OccurrenceTypeID,
                occ.Period,
                occ.HasSucceeded

            FROM tmpSimArgs AS tsa

            INNER JOIN Occurrence AS occ
                    ON (tsa.SimRunID = occ.SimRunID)
                   AND (tsa.SimulationID = occ.SimulationID)

        ) AS ij4

        INNER JOIN OccurrenceParticipant AS occp
                ON (occOccurrenceID = occpOccurrenceID)
               AND (occSimRunID = occpSimRunID)
               AND (occSimulationID = occpSimulationID)

    ) AS ij3
      ON tpa.participantID = ij3.occpParticipantID

) AS ij2
  ON ij1.ipParticipantID = ij2.occpParticipantID

WHERE (

    (

        (ij2.RoleTypeID) = 52
        OR
        (ij2.RoleTypeID) = 49

    )

)
  AND ij2.HasSucceeded = 1

GROUP BY
    ij1.Description,
    ij2.occSimulationID,
    ij2.occSimRunID,
    ij2.Period;

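Like JohnFx (above), I was confused by the derived views. I don't think they are actually necessary, especially since they are all inner joins. So below I have attempted to reduce the complexity. Please review it and test its performance. I had to use a cross join with tmpSimArgs because it is only joined to Occurrence; I assume this is the desired behaviour.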

SELECT
    pg.Description,
    occ.SimulationID,
    occ.SimRunID,
    occ.Period,
    COUNT(occp.ParticipantID) AS CountOfParticipantID

FROM ParticipantGroup AS pg

INNER JOIN InitialParticipant AS ip
        ON pg.ParticipantGroupID = ip.ParticipantGroupID

CROSS JOIN tmpSimArgs AS tsa

INNER JOIN Occurrence AS occ
        ON tsa.SimRunID = occ.SimRunID
       AND tsa.SimulationID = occ.SimulationID

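INNER JOIN OccurrenceParticipant AS occp
        ON occ.OccurrenceID = occp.OccurrenceID
       AND occ.SimRunID = occp.SimRunID
       AND occ.SimulationID = occp.SimulationID
       -- keeps the InitialParticipant join from the original query
       AND ip.ParticipantID = occp.ParticipantID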

INNER JOIN tmpPartArgs AS tpa
        ON tpa.participantID = occp.ParticipantID

WHERE occ.HasSucceeded = 1
  AND (occp.RoleTypeID = 52 OR occp.RoleTypeID = 49 )

GROUP BY
    pg.Description,
    occ.SimulationID,
    occ.SimRunID,
    occ.Period;
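Before comparing timings, it may be worth confirming that a flattened query returns the same rows as the nested one. The following sketch in Python (standard `sqlite3` module) loads a handful of made-up rows into trimmed-down copies of the tables and runs a condensed nested query and a flat query side by side. The toy data and the names `NESTED` and `FLAT` are illustrative only. The flat version here includes the condition `ip.ParticipantID = occp.ParticipantID`, which the question's query applies between InitialParticipant and OccurrenceParticipant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ParticipantGroup (ParticipantGroupID INTEGER PRIMARY KEY,
                               Description varchar(50));
CREATE TABLE InitialParticipant (ParticipantID INTEGER PRIMARY KEY,
                                 ParticipantGroupID INTEGER);
CREATE TABLE Occurrence (SimulationID INTEGER, SimRunID INTEGER,
                         OccurrenceID INTEGER, Period INTEGER, HasSucceeded BOOL,
                         PRIMARY KEY (SimulationID, SimRunID, OccurrenceID));
CREATE TABLE OccurrenceParticipant (SimulationID INTEGER, SimRunID INTEGER,
                                    OccurrenceID INTEGER, RoleTypeID INTEGER,
                                    ParticipantID INTEGER);
CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10));
CREATE TABLE tmpPartArgs (participantID INT);

-- Made-up toy rows, not data from the question.
INSERT INTO ParticipantGroup VALUES (1, 'Group A'), (2, 'Group B');
INSERT INTO InitialParticipant VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO Occurrence VALUES (1, 1, 100, 5, 1), (1, 1, 101, 5, 0), (1, 2, 100, 6, 1);
INSERT INTO OccurrenceParticipant VALUES
    (1, 1, 100, 52, 10), (1, 1, 100, 49, 11), (1, 1, 100, 7, 12),
    (1, 1, 101, 52, 10), (1, 2, 100, 49, 12);
INSERT INTO tmpSimArgs VALUES (1, 1), (1, 2);
INSERT INTO tmpPartArgs VALUES (10), (11), (12);
""")

# Condensed form of the nested (derived-table) query.
NESTED = """
SELECT ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period,
       COUNT(ij2.occpParticipantID) AS CountOfParticipantID
FROM (
    SELECT ip.ParticipantID AS ipParticipantID, pg.Description
    FROM ParticipantGroup AS pg
    INNER JOIN InitialParticipant AS ip
            ON pg.ParticipantGroupID = ip.ParticipantGroupID
) AS ij1
INNER JOIN (
    SELECT ij3.* FROM tmpPartArgs AS tpa
    INNER JOIN (
        SELECT ij4.*, occp.ParticipantID AS occpParticipantID, occp.RoleTypeID
        FROM (
            SELECT occ.SimulationID AS occSimulationID, occ.SimRunID AS occSimRunID,
                   occ.OccurrenceID AS occOccurrenceID, occ.Period, occ.HasSucceeded
            FROM tmpSimArgs AS tsa
            INNER JOIN Occurrence AS occ
                    ON tsa.SimRunID = occ.SimRunID
                   AND tsa.SimulationID = occ.SimulationID
        ) AS ij4
        INNER JOIN OccurrenceParticipant AS occp
                ON ij4.occOccurrenceID = occp.OccurrenceID
               AND ij4.occSimRunID = occp.SimRunID
               AND ij4.occSimulationID = occp.SimulationID
    ) AS ij3 ON tpa.participantID = ij3.occpParticipantID
) AS ij2 ON ij1.ipParticipantID = ij2.occpParticipantID
WHERE ij2.RoleTypeID IN (52, 49) AND ij2.HasSucceeded = 1
GROUP BY ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period
"""

# Flat query without derived tables.
FLAT = """
SELECT pg.Description, occ.SimulationID, occ.SimRunID, occ.Period,
       COUNT(occp.ParticipantID) AS CountOfParticipantID
FROM ParticipantGroup AS pg
INNER JOIN InitialParticipant AS ip
        ON pg.ParticipantGroupID = ip.ParticipantGroupID
CROSS JOIN tmpSimArgs AS tsa
INNER JOIN Occurrence AS occ
        ON tsa.SimRunID = occ.SimRunID AND tsa.SimulationID = occ.SimulationID
INNER JOIN OccurrenceParticipant AS occp
        ON occ.OccurrenceID = occp.OccurrenceID
       AND occ.SimRunID = occp.SimRunID
       AND occ.SimulationID = occp.SimulationID
       AND ip.ParticipantID = occp.ParticipantID
INNER JOIN tmpPartArgs AS tpa ON tpa.participantID = occp.ParticipantID
WHERE occ.HasSucceeded = 1 AND occp.RoleTypeID IN (52, 49)
GROUP BY pg.Description, occ.SimulationID, occ.SimRunID, occ.Period
"""

# Sort both result sets so the comparison ignores row order.
nested_rows = sorted(conn.execute(NESTED).fetchall())
flat_rows = sorted(conn.execute(FLAT).fetchall())
print(nested_rows == flat_rows)
print(flat_rows)
```

One point to keep in mind when timing the flat query: SQLite's planner does not reorder tables across a CROSS JOIN, so the position of tmpSimArgs in the FROM clause fixes part of the join order.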

The original question, "SQLite query runs 10 times slower than MS Access query", is on Stack Overflow: https://stackoverflow.com/questions/2017114/
