sql - Is this a good algorithm for listing prime numbers?

Tags: sql sql-server algorithm tsql

DECLARE @c int = 1000;
DECLARE @numbers  TABLE (n int NOT NULL PRIMARY KEY);
DECLARE @products TABLE (p int NOT NULL PRIMARY KEY);
DECLARE @primes   TABLE (p int NOT NULL PRIMARY KEY);

-- The 'composite exclusion' approach

-- 1. list all n = 2, 3, 4, ... c
WITH numbers AS
(
    SELECT  2 AS n
    UNION ALL
    SELECT n + 1 FROM numbers
    WHERE   n <= @c - 1
)
INSERT INTO @numbers SELECT n FROM numbers OPTION(MAXRECURSION 0);

-- 2. find all products m x n <= c (the join on 1 = 1 pairs every number with every number)
WITH products AS
(
    SELECT  DISTINCT m.n * n.n AS p
    FROM    @numbers m LEFT OUTER JOIN
            @numbers n ON 1 = 1
    WHERE   m.n * n.n <= @c
)
INSERT INTO @products SELECT p FROM products;

-- 3. numbers with no matching products are not composite, i.e., they're prime numbers.
INSERT INTO @primes
SELECT n.n FROM @numbers n LEFT JOIN @products p ON n.n = p.p WHERE p.p IS NULL;

This is a kind of one-pass Sieve of Eratosthenes.

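I have seen loops, stored procedures and the like, as well as pseudocode and implementations in other languages, but it seems to me that this simple, set-based approach, which follows directly from the definition of a prime number, should be enough.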

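Please bear in mind that at this point I am not concerned with performance, memory consumption or optimization, and I have not tested with larger numbers. I just want to post the algorithm and have people confirm (or challenge) that excluding the composite numbers from the list is sufficient.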
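One way to check the claim empirically is to compute the primes twice, once by composite exclusion and once by plain trial division, and compare the two sets. The following is a minimal sketch and not part of the original post; the CTE names `by_exclusion` and `by_division` and the output column names are made up for illustration. It assumes `@c` stays small enough that `a.n * b.n` does not overflow `int`.

```sql
DECLARE @c int = 1000;
DECLARE @numbers TABLE (n int NOT NULL PRIMARY KEY);

-- All n = 2 .. @c, built the same way as in the question.
WITH numbers AS
(
    SELECT 2 AS n
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < @c
)
INSERT INTO @numbers SELECT n FROM numbers OPTION (MAXRECURSION 0);

WITH by_exclusion AS
(
    -- Numbers that are not a product of two numbers >= 2.
    SELECT n FROM @numbers
    EXCEPT
    SELECT a.n * b.n
    FROM   @numbers a CROSS JOIN @numbers b
    WHERE  a.n * b.n <= @c
),
by_division AS
(
    -- Numbers with no divisor d in 2 .. n - 1.
    SELECT n.n
    FROM   @numbers n
    WHERE  NOT EXISTS (SELECT 1 FROM @numbers d
                       WHERE d.n < n.n AND n.n % d.n = 0)
)
SELECT
    (SELECT COUNT(*) FROM (SELECT n FROM by_exclusion
                           EXCEPT
                           SELECT n FROM by_division) x) AS only_in_exclusion,
    (SELECT COUNT(*) FROM (SELECT n FROM by_division
                           EXCEPT
                           SELECT n FROM by_exclusion) y) AS only_in_division;
```

Both counts should be 0. Every composite n <= c can be written as a x b with 2 <= a, b <= n, so both factors are already in the list and the product is always generated.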

Best answer

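A recursive CTE (rCTE) is rarely the best-performing solution. Below is an approach that uses a tally table; it is a slightly tweaked version of the method Hugo Kornelis posted here: https://sqlserverfast.com/blog/hugo/2006/09/the-prime-number-challenge-great-waste-of-time/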

Let's compare the tally table solution with the rCTE solution:

SET STATISTICS TIME ON;

PRINT 'tally table approach'+char(13)+char(10)+replicate('-',50);
DECLARE @primes   TABLE (p int NOT NULL PRIMARY KEY);
DECLARE @limit bigint = 10000;

WITH E(x) AS (SELECT * FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(x)),
iTally(N) AS (SELECT TOP(@limit) ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM E a, E b, E c, E d, E f)
INSERT @primes
SELECT      n1.N
FROM        iTally AS n1
WHERE       n1.N > 1
AND         n1.N < @limit
AND NOT EXISTS
 (SELECT    *
  FROM      iTally AS n2
  WHERE     n2.N < @limit
  AND       n2.N BETWEEN 2 AND n1.N-1
  AND       n1.N % n2.N = 0)
--ORDER BY N
GO

PRINT 'rCTE approach'+char(13)+char(10)+replicate('-',50);
DECLARE @c int = 10000;
DECLARE @numbers  TABLE (n int NOT NULL PRIMARY KEY);
DECLARE @products TABLE (p int NOT NULL PRIMARY KEY);
DECLARE @primes   TABLE (p int NOT NULL PRIMARY KEY);

WITH numbers AS
(
    SELECT  2 AS n
    UNION ALL
    SELECT n + 1 FROM numbers
    WHERE   n <= @c - 1
)
INSERT INTO @numbers SELECT n FROM numbers OPTION(MAXRECURSION 0);

-- 2. find all products m x n <= c
WITH products AS
(
    SELECT  DISTINCT m.n * n.n AS p
    FROM    @numbers m LEFT OUTER JOIN
            @numbers n ON 1 = 1
    WHERE   m.n * n.n <= @c
)
INSERT INTO @products SELECT p FROM products;

-- 3. numbers with no matching products are not composite, i.e., they're prime numbers.
INSERT INTO @primes
SELECT n.n FROM @numbers n LEFT JOIN @products p ON n.n = p.p WHERE p.p IS NULL;

SET STATISTICS TIME OFF;

Results:

tally table approach
--------------------------------------------------

 SQL Server Execution Times:
   CPU time = 3042 ms,  elapsed time = 3241 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 10 ms.

rCTE approach
--------------------------------------------------

 SQL Server Execution Times:
   CPU time = 14976 ms,  elapsed time = 15757 ms.

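As you can see, for 10,000 the tally table approach is roughly 5 times faster (3,042 ms versus 14,976 ms of CPU time), and it also generates no reads (the rCTE generates a lot!).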
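The tally table query tests every candidate divisor from 2 to N-1. As a possible refinement that is not part of the measurements in this answer, the divisor check can stop at the square root of the candidate, because any composite number has a divisor no larger than its square root. A minimal sketch, assuming the same `E` / `iTally` construction (which supplies at most 100,000 rows) and no timing claims:

```sql
DECLARE @limit int = 10000;

WITH E(x) AS (SELECT x FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS t(x)),
iTally(N) AS (SELECT TOP (@limit) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
              FROM E a, E b, E c, E d, E f)
SELECT   n1.N
FROM     iTally AS n1
WHERE    n1.N > 1
AND NOT EXISTS
 (SELECT 1
  FROM   iTally AS n2
  WHERE  n2.N >= 2
  AND    n2.N * n2.N <= n1.N   -- only divisors up to sqrt(n1.N)
  AND    n1.N % n2.N = 0)
ORDER BY n1.N;
```

`ROW_NUMBER()` returns `bigint`, so `n2.N * n2.N` does not overflow for the row counts this tally can produce.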

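If you are really working with prime numbers, the absolute fastest approach is to store them in a table so that you do not have to compute them every time you need them.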
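A minimal sketch of that idea follows. The table name `dbo.Primes` is hypothetical and not from the original answer. The table is filled once with the tally table query, and later lookups read from it instead of recomputing anything.

```sql
-- One-time setup: hypothetical permanent table holding the primes.
IF OBJECT_ID(N'dbo.Primes', N'U') IS NULL
    CREATE TABLE dbo.Primes (p int NOT NULL PRIMARY KEY);
GO

DECLARE @limit int = 10000;

WITH E(x) AS (SELECT x FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS t(x)),
iTally(N) AS (SELECT TOP (@limit) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
              FROM E a, E b, E c, E d, E f)
INSERT INTO dbo.Primes (p)
SELECT  n1.N
FROM    iTally AS n1
WHERE   n1.N > 1
AND NOT EXISTS (SELECT 1 FROM iTally AS n2
                WHERE n2.N BETWEEN 2 AND n1.N - 1
                AND   n1.N % n2.N = 0)
-- Skip values already stored so the script can be re-run safely.
AND NOT EXISTS (SELECT 1 FROM dbo.Primes AS x WHERE x.p = n1.N);
GO

-- Later use: a range lookup and a primality test, both index seeks.
SELECT p FROM dbo.Primes WHERE p BETWEEN 100 AND 200 ORDER BY p;

SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.Primes WHERE p = 97)
            THEN 1 ELSE 0 END AS is_prime;
```

The primary key on `p` gives the lookups a clustered index to seek on. The primality test is only meaningful for values up to the limit the table was populated with.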

Source: the original question on Stack Overflow, "Is this a good algorithm for listing prime numbers?": https://stackoverflow.com/questions/40642383/
