sql - Optimizing a SELECT query that runs slowly on Oracle but quickly on SQL Server

I'm trying to run the following SQL statement in Oracle, and it takes a long time to run:

SELECT orderID FROM tasks WHERE orderID NOT IN 
(SELECT DISTINCT orderID FROM tasks WHERE
 engineer1 IS NOT NULL AND engineer2 IS NOT NULL)

If I run just the sub-part inside the IN clause, it runs very quickly in Oracle, i.e.

SELECT DISTINCT orderID FROM tasks WHERE
engineer1 IS NOT NULL AND engineer2 IS NOT NULL

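Why does the whole statement take such a long time in Oracle? In SQL Server the whole statement runs quickly.

Or is there a simpler/different/better SQL statement I should use?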

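Some more details about the problem:

Each order is made up of many tasks.
Each order is either assigned (one or more of its tasks has engineer1 and engineer2 set) or unassigned (all of its tasks have null values in the engineer fields).
I am trying to find all the orderIDs that are unassigned.

In case it makes any difference, there are about 120k rows in the table and 3 tasks per order, so about 40k distinct orders.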
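Given that description, one portable alternative is to aggregate per order instead of using a subquery. The following is only a sketch under the assumptions stated in the question (a `tasks` table with `orderID`, `engineer1` and `engineer2` columns); it has not been timed against the asker's data. Note that it returns each unassigned orderID once, whereas the original NOT IN query returns one row per task.

```sql
-- Sketch: count, per order, the tasks that have both engineers set,
-- and keep only the orders where that count is zero.
-- Uses only standard syntax, so it should run on both Oracle and SQL Server.
SELECT orderID
FROM tasks
GROUP BY orderID
HAVING COUNT(CASE
               WHEN engineer1 IS NOT NULL AND engineer2 IS NOT NULL THEN 1
             END) = 0;
```

Because COUNT ignores the NULL that the CASE expression yields for unassigned tasks, a count of zero means no task of that order has both engineers set.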

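Responses to answers:

I would prefer a SQL statement that works in both SQL Server and Oracle.
The tasks table only has indexes on orderID and taskID.
I tried the NOT EXISTS version of the statement, but it ran for over 3 minutes before I cancelled it. Perhaps a JOIN version of the statement is needed?
There is also an "orders" table with an orderID column, but I was trying to simplify the question by leaving it out of the original SQL statement.

I guess that in the original SQL statement the subquery is run once for every row in the first part of the statement, even though it is static and should only need to be run once?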
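For reference, the NOT EXISTS and JOIN formulations mentioned above might look like the following. This is a sketch based on the table and column names in the question; the aliases `t` and `a` are introduced here for illustration. Whether the subquery is actually re-evaluated per row depends on the plan the optimizer picks, and in some Oracle versions a NOT IN over a nullable column can keep the optimizer from choosing an anti-join, so these rewrites may or may not change the timing.

```sql
-- Variant 1: correlated NOT EXISTS (anti-join semantics, unaffected by NULL orderIDs).
SELECT DISTINCT t.orderID
FROM tasks t
WHERE NOT EXISTS (
    SELECT 1
    FROM tasks a
    WHERE a.orderID = t.orderID
      AND a.engineer1 IS NOT NULL
      AND a.engineer2 IS NOT NULL
);

-- Variant 2: LEFT JOIN to the assigned tasks and keep the orders with no match.
SELECT DISTINCT t.orderID
FROM tasks t
LEFT JOIN tasks a
       ON a.orderID = t.orderID
      AND a.engineer1 IS NOT NULL
      AND a.engineer2 IS NOT NULL
WHERE a.orderID IS NULL;
```

Both variants avoid the `AS` keyword for table aliases, which Oracle does not accept, so they should parse on both databases.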

Executing

ANALYZE TABLE tasks COMPUTE STATISTICS;

made my original SQL statement execute much faster.

Although I'm still curious why I have to do this, and if/when I would need to run it again?

The statistics give Oracle's cost-based optimizer the information it needs to estimate the efficiency of different execution plans: for example, the number of rows in a table, the average width of rows, the highest and lowest values per column, the number of distinct values per column, the clustering factor of indexes, etc.
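To see what the optimizer currently knows, the stored statistics can be read from Oracle's data dictionary views. The queries below are a sketch that assumes the table is owned by the current user and was created with an unquoted name, so that it is stored as `TASKS`.

```sql
-- Table-level statistics: row count, average row length, and when they were gathered.
SELECT table_name, num_rows, avg_row_len, last_analyzed
FROM user_tables
WHERE table_name = 'TASKS';

-- Column-level statistics: distinct values and NULL counts per column.
SELECT column_name, num_distinct, num_nulls, last_analyzed
FROM user_tab_col_statistics
WHERE table_name = 'TASKS';

-- Index-level statistics, including the clustering factor.
SELECT index_name, distinct_keys, clustering_factor, last_analyzed
FROM user_indexes
WHERE table_name = 'TASKS';
```

A NULL `last_analyzed` value indicates that no statistics have been gathered for that object yet.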

In a small database you can just set up a job to gather statistics every night and leave it alone. In fact, this is the default under 10g. For larger implementations you usually have to weigh the stability of the execution plans against the way that the data changes, which is a tricky balance.
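If statistics need to be refreshed by hand or from a scheduled job, Oracle's `DBMS_STATS` package is the usual tool for optimizer statistics, as an alternative to the `ANALYZE` command used in the question. A minimal sketch, assuming the table belongs to the current user:

```sql
-- Oracle-specific PL/SQL block: gather optimizer statistics for TASKS.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,     -- assumption: the table is in the current schema
    tabname => 'TASKS',
    cascade => TRUE      -- also gather statistics for the table's indexes
  );
END;
/
```

How often to run this depends on how quickly the data changes; the nightly schedule mentioned above is one common choice for small databases.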

Oracle also has a feature called "dynamic sampling" that samples tables to determine relevant statistics at execution time. It is used much more often with data warehouses, where the overhead of the sampling is outweighed by the potential performance increase for a long-running query.
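Dynamic sampling can be requested for a single statement with the `DYNAMIC_SAMPLING` hint. The sketch below applies it to the query from the question; the alias `t` and the sampling level 4 are arbitrary choices made here for illustration, not recommendations.

```sql
-- Oracle-specific hint: sample table t at parse time (level 4 is an assumed value).
-- SQL Server treats the hint as an ordinary comment and ignores it.
SELECT /*+ DYNAMIC_SAMPLING(t 4) */ t.orderID
FROM tasks t
WHERE t.orderID NOT IN (
    SELECT orderID
    FROM tasks
    WHERE engineer1 IS NOT NULL
      AND engineer2 IS NOT NULL
);
```

For a table of about 120k rows, gathering regular statistics is likely the simpler route; the hint is mainly useful when stored statistics are missing or unrepresentative.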

Best answer

Often this type of problem goes away if you analyze the tables involved (so Oracle has a better idea of the distribution of the data):

ANALYZE TABLE tasks COMPUTE STATISTICS;
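One way to confirm that the statistics changed the optimizer's choice is to compare the execution plan before and after analyzing the table. A sketch using Oracle's `EXPLAIN PLAN` and `DBMS_XPLAN`, assuming the default plan table is available:

```sql
-- Ask Oracle to record the plan it would use for the original statement.
EXPLAIN PLAN FOR
SELECT orderID FROM tasks WHERE orderID NOT IN
  (SELECT DISTINCT orderID FROM tasks
   WHERE engineer1 IS NOT NULL AND engineer2 IS NOT NULL);

-- Display the plan that was just recorded.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Running this pair once before and once after gathering statistics shows whether the operations and estimated row counts differ.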

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/120504/
