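Huge execution-time difference between a 1-minute query and the same query in a table-valued function
The strangest part is that running the UDF with another (valid) company_id argument gives me a result in about 40 seconds, but as soon as I change company_id to 12 (also valid), it never finishes. The execution plans of these two calls are completely different, and of course the long-running one is the more complex. Yet the execution plans of the batch version and the UDF version are the same, and the batch version is fast!
If I run the following query "by hand", it takes 1 minute 36 seconds and returns 306 rows: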
SELECT
dbo.date_only(Call.date) AS date,
count(DISTINCT customer_id) AS new_customers
FROM
Call
LEFT OUTER JOIN
dbo.company_new_customers(12, 2009, 2009) new_customers
ON dbo.date_only(new_customers.date) = dbo.date_only(Call.date)
WHERE
company_id = 12
AND year(Call.date) >= 2009
AND year(Call.date) <= 2009
GROUP BY
dbo.date_only(Call.date)
I stored this exact same query in a function and ran it like this:
SELECT * FROM company_new_customers_count(12, 2009, 2009)
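It has now been running for 13 minutes, and I am sure it will never give me a result.
Yesterday I saw exactly the same infinite-loop-like behavior for more than 4 hours (so I stopped it).
Here is the definition of the function: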
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION company_new_customers_count
(
@company_id int,
@start_year int,
@end_year int
)
RETURNS TABLE
AS
RETURN
(
SELECT
dbo.date_only(Call.date) AS date,
count(DISTINCT customer_id) AS new_customers
FROM
Call
LEFT OUTER JOIN
dbo.company_new_customers(@company_id, @start_year, @end_year) new_customers
ON dbo.date_only(new_customers.date) = dbo.date_only(Call.date)
WHERE
company_id = @company_id
AND year(Call.date) >= @start_year
AND year(Call.date) <= @end_year
GROUP BY
dbo.date_only(Call.date)
)
GO
I would be glad to understand what is going on.
Thanks
Additional information:
Definition of company_new_customers:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Description: Create the list of new customers of @company_id
-- in the given period.
-- =============================================
CREATE FUNCTION company_new_customers
(
@company_id int,
@start_year int,
@end_year int
)
RETURNS TABLE
AS
RETURN
(
SELECT
customer_id,
date
FROM
( -- select first-appearance dates of customers up to @end_year
SELECT
min(date) AS date,
customer_id
FROM
Call
JOIN
Call_Customer ON Call_Customer.call_id = Call.call_id
WHERE
company_id = @company_id
AND year(date) <= @end_year
GROUP BY
customer_id
) new_customers
WHERE
year(date) >= @start_year -- keep customers whose first appearance is in or after @start_year
)
GO
Definition of date_only:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Julio Guerra
-- Create date: 14/10/2010
-- Description: Return only the date part of a datetime value
-- Example: date_only('2010-10-25 13:00:12') returns 2010-10-25
-- =============================================
CREATE FUNCTION date_only
(
@datetime datetime
)
RETURNS datetime
AS
BEGIN
RETURN dateadd(dd, 0, datediff(dd, 0, @datetime))
END
GO
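Execution plan of SELECT * FROM company_new_customers_count(8, 2009, 2009)
Execution plan of SELECT * FROM company_new_customers_count(12, 2009, 2009)
Best answer
From these query plans, it looks like you could benefit from an index like this (if I inferred your database schema correctly):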
CREATE INDEX IX_call_company_date ON call (company_id, date)
Overall, this looks like a standard query-optimization problem, and the table-valued function does not actually make a difference.
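As a rough illustration of the optimization angle, here is a sketch of the date filter written as a plain range on the column instead of wrapping it in year(). This is an assumption on my part, not something read from the plans: it supposes Call.date is a datetime column and that an index on (company_id, date) like the one suggested here exists. The variables @from and @to are hypothetical names introduced only for this example.

```sql
-- Sketch only: assumes Call.date is a datetime column and that an index
-- on Call (company_id, date) exists.
DECLARE @company_id int, @start_year int, @end_year int;
SET @company_id = 12;
SET @start_year = 2009;
SET @end_year = 2009;

-- Half-open range: [Jan 1 of @start_year, Jan 1 of the year after @end_year).
-- Date 0 is 1900-01-01, so adding (year - 1900) years yields Jan 1 of that year.
DECLARE @from datetime, @to datetime;
SET @from = dateadd(yy, @start_year - 1900, 0);
SET @to   = dateadd(yy, @end_year + 1 - 1900, 0);

-- Comparing the bare column against constants lets the optimizer consider
-- a range seek on (company_id, date) instead of evaluating year() per row.
SELECT count(*) AS calls_in_range
FROM Call
WHERE company_id = @company_id
  AND Call.date >= @from
  AND Call.date <  @to;
```

Whether this actually changes the plan depends on the data and the existing indexes, so it is worth comparing the execution plans before and after.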
Source: a similar question about strange SQL Server 2005 table-valued function performance on Stack Overflow: https://stackoverflow.com/questions/4190506/