c# - SQL Server: best way to match a word phrase and order by relevance

Tags: c# sql-server sql-server-2008 sorting dataset

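What is the best way to rank a SQL varchar column by the number of words it shares with a parameter, using four distinct criteria? This is probably not a trivial question, but I am struggling to order the rows by "best match" according to my criteria.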

Column: description varchar(100)
Parameter: @MyParameter varchar(100)

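Output, in this order of preference:

  • Exact match (the entire string matches): always first
  • Begins with (descending by the length of the parameter that matches)
  • Count of matching words, with contiguous words ranked higher for the same count
  • Words matching anywhere (not contiguous)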

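  • Words might not match exactly: partial matches of a word are allowed and likely. A lesser value should apply to partial words in the ranking, but that is not critical (for instance, pot would match each of pot, potter, potholder, depot, depotting). A begins-with match that also has other words matching should rank higher than one without further matches, but this is not a deal-breaker.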
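A minimal sketch of how the partial-match tiers in the last bullet might be told apart with LIKE. The table variable, the extra word 'pan', and the MatchTier numbering are made up for illustration; only the term 'pot' and the five matching words come from the question.

```sql
-- Hypothetical sample data; MatchTier is an illustrative name, not part of the question.
DECLARE @Term varchar(100);
SET @Term = 'pot';

DECLARE @Words TABLE (Word varchar(100));
INSERT @Words VALUES ('pot');
INSERT @Words VALUES ('potter');
INSERT @Words VALUES ('potholder');
INSERT @Words VALUES ('depot');
INSERT @Words VALUES ('depotting');
INSERT @Words VALUES ('pan');        -- does not contain the term, so it is filtered out

SELECT Word,
       CASE
           WHEN Word = @Term          THEN 1  -- whole-word match
           WHEN Word LIKE @Term + '%' THEN 2  -- word begins with the term
           ELSE 3                             -- term appears inside the word
       END AS MatchTier
FROM @Words
WHERE Word LIKE '%' + @Term + '%'
ORDER BY MatchTier, Word;
```

If the term can contain LIKE wildcards (%, _ or [), they would need escaping before being concatenated into the pattern.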

I would like a way to rank rows where the column "begins with" the value in the parameter. Say I have the following string:
    'This is my value string as a test template to rank on.'
    

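In the first case, I would like to rank the column/row by the greatest number of matching words.

In the second, I would like to rank by the occurrence (best match) at the start, like this: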
    'This is my string as a test template to rank on.' - first
    'This is my string as a test template to rank on even though not exact.'-second
    'This is my string as a test template to rank' - third
    'This is my string as a test template to' - next
    'This is my string as a test template' - next etc.
    

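Secondly (possibly as a second set/group of data after the first "begins with" set, which is the desired behavior):

I want to rank (sort) the rows by the number of words from @MyParameter that occur in the description column, with contiguous words ranking higher than the same number of words separated.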

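So for the example string above, 'is my string as shown' would rank higher than 'is not my other string as', because the contiguous words make it a "better match" for the same word count. Rows with a higher match (more words that occur) rank first, in descending order of best match.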
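One way to express the "contiguous words rank higher" tie-breaker is to count how many adjacent word pairs of the search phrase also appear side by side in the description. The sketch below assumes the search phrase has already been split into words with their positions; the table variables and the names Pos, Pair and ContiguousPairs are hypothetical.

```sql
-- Search phrase from the question, pre-split into words with positions (assumed input shape).
DECLARE @SearchWords TABLE (Pos int PRIMARY KEY, Word varchar(100));
INSERT @SearchWords (Pos, Word) VALUES
    (1,'This'),(2,'is'),(3,'my'),(4,'value'),(5,'string'),(6,'as'),
    (7,'a'),(8,'test'),(9,'template'),(10,'to'),(11,'rank'),(12,'on.');

-- The two candidate strings compared in the question.
DECLARE @Candidates TABLE (Description varchar(100));
INSERT @Candidates (Description) VALUES
    ('is my string as shown'),
    ('is not my other string as');

SELECT c.Description, COUNT(p.Pos) AS ContiguousPairs
FROM @Candidates c
    LEFT OUTER JOIN
    (
        -- every pair of neighbouring words in the search phrase
        SELECT w1.Pos, w1.Word + ' ' + w2.Word AS Pair
        FROM @SearchWords w1
            INNER JOIN @SearchWords w2 ON w2.Pos = w1.Pos + 1
    ) p ON ' ' + c.Description + ' ' LIKE '% ' + p.Pair + ' %'  -- pair must appear as whole words
GROUP BY c.Description
ORDER BY ContiguousPairs DESC, c.Description;
```

With this data the first candidate contains two such pairs ('is my' and 'string as') and the second only one ('string as'), so the first sorts higher even though both share four words with the search phrase. ContiguousPairs would be used as a secondary sort key after the plain word count.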

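If possible, I would like to do this in a single query.

No row should appear twice in the result.

For performance considerations: the table will hold no more than 10,000 rows.

The values in the table are fairly static, though not completely.

I cannot change the structure at this time, but I would consider it later (for example, a word/phrase table).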

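To complicate this slightly, the word list lives in two tables, although I could create a view over them. Given the same match, results from one table (the smaller list) should appear before results from the second, larger one. There will be duplicates across these tables as well as within a table, and I only want distinct values. SELECT DISTINCT is not easy here, because I want to return one column (sourceTable) that may well make otherwise identical rows distinct; in that case I want only the row from the first (smaller) table. All other columns should be distinct (that is, sourceTable should not count in the "distinct" evaluation).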
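A possible way to get "distinct on everything except sourceTable, preferring the smaller table" is ROW_NUMBER() partitioned by the columns that define a duplicate. This is only a sketch: the sample rows are invented, and the sourceTable values are illustrative labels chosen so that the smaller table sorts first.

```sql
-- Hypothetical combined rows from both tables; sourceTable sorts the smaller table first.
DECLARE @Rows TABLE
(
    procedureCode varchar(50),
    description   varchar(100),
    category      varchar(50),
    sourceTable   varchar(50)
);
INSERT @Rows VALUES ('100', 'Gastric Intubation', 'GI', '0Small');
INSERT @Rows VALUES ('100', 'Gastric Intubation', 'GI', '2Large');  -- duplicate across tables
INSERT @Rows VALUES ('100', 'Gastric Intubation', 'GI', '2Large');  -- duplicate within a table
INSERT @Rows VALUES ('200', 'Lavage',             'GI', '2Large');

;WITH Ranked AS
(
    SELECT procedureCode, description, category, sourceTable,
           ROW_NUMBER() OVER (PARTITION BY procedureCode, description, category
                              ORDER BY sourceTable) AS rn   -- 1 = preferred source
    FROM @Rows
)
SELECT procedureCode, description, category, sourceTable
FROM Ranked
WHERE rn = 1                      -- one row per distinct code/description/category
ORDER BY procedureCode;
```

Every column that should take part in the "distinct" evaluation goes into PARTITION BY; sourceTable is left out of it and used only to pick the survivor.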

Pseudo-columns in the table:
    procedureCode   VARCHAR(50),
    description VARCHAR(100), -- this is the sort/evaluation column
    category    VARCHAR(50),
    relvu       VARCHAR(50),
    charge  VARCHAR(15),
    active  bit,
    sourceTable   VARCHAR(50) -- just shows which of the two tables the row comes from
    

There is no unique index such as an ID column.

Matches must not appear in a third table, which lists codes to exclude:

    SELECT * FROM tableone WHERE procedureCode NOT IN (SELECT procedureCode FROM tablethree)
    UNION ALL
    SELECT * FROM tabletwo WHERE procedureCode NOT IN (SELECT procedureCode FROM tablethree)
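As a hedged alternative, the same exclusion can be written with NOT EXISTS, which behaves predictably if the exclusion table ever contains a NULL code (with NOT IN, a single NULL in the subquery makes the predicate return no rows). The table variables below stand in for the question's placeholder tables, and their contents are invented.

```sql
-- Stand-ins for tableone, tabletwo and tablethree; the data is hypothetical.
DECLARE @tableone   TABLE (procedureCode varchar(50), description varchar(100));
DECLARE @tabletwo   TABLE (procedureCode varchar(50), description varchar(100));
DECLARE @tablethree TABLE (procedureCode varchar(50));

INSERT @tableone   VALUES ('100', 'Gastric Intubation');
INSERT @tableone   VALUES ('300', 'Excluded procedure');
INSERT @tabletwo   VALUES ('200', 'Lavage');
INSERT @tablethree VALUES ('300');
INSERT @tablethree VALUES (NULL);   -- would defeat a NOT IN filter

SELECT t.procedureCode, t.description
FROM @tableone t
WHERE NOT EXISTS (SELECT 1 FROM @tablethree x WHERE x.procedureCode = t.procedureCode)
UNION ALL
SELECT t.procedureCode, t.description
FROM @tabletwo t
WHERE NOT EXISTS (SELECT 1 FROM @tablethree x WHERE x.procedureCode = t.procedureCode);
```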
EDIT: To work on this, I created a table-valued parameter as follows:
    0       Gastric Intubation & Aspiration/Lavage, Treatmen
    1       Gastric%Intubation%Aspiration%Lavage%Treatmen
    2       Gastric%Intubation%Aspiration%Lavage
    3       Gastric%Intubation%Aspiration
    4       Gastric%Intubation
    5       Gastric
    6       Intubation%Aspiration%Lavage%Treatmen
    7       Intubation%Aspiration%Lavage
    8       Intubation%Aspiration
    9       Intubation
    10      Aspiration%Lavage%Treatmen
    11      Aspiration%Lavage
    12      Aspiration
    13      Lavage%Treatmen
    14      Lavage
    15      Treatmen
    

The actual phrase is in row 0.
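The pattern rows above are every run of consecutive words joined with %. A sketch of how such a list might be generated in T-SQL on SQL Server 2008, assuming the words are already available with their positions (the @Words table and the column names are hypothetical, and only three words are used to keep the output short):

```sql
-- Words of the phrase in order (assumed input shape).
DECLARE @Words TABLE (Pos int PRIMARY KEY, Word varchar(100));
INSERT @Words (Pos, Word) VALUES (1,'Gastric'),(2,'Intubation'),(3,'Aspiration');

SELECT ROW_NUMBER() OVER (ORDER BY s.Pos, e.Pos DESC) AS Code,
       e.Pos - s.Pos + 1                              AS WordCount,
       STUFF((SELECT '%' + w.Word                     -- join words s..e with '%'
              FROM @Words w
              WHERE w.Pos BETWEEN s.Pos AND e.Pos
              ORDER BY w.Pos
              FOR XML PATH('')), 1, 1, '')            AS Pattern   -- STUFF drops the leading '%'
FROM @Words s
    INNER JOIN @Words e ON e.Pos >= s.Pos             -- every start/end pair
ORDER BY s.Pos, e.Pos DESC;
```

This yields the patterns in the same order as the list above: the longest run starting at each word first, then progressively shorter ones. FOR XML PATH would entity-encode characters such as & or <, so words containing them need extra handling.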

Here is my current attempt:
    CREATE PROCEDURE [GetProcedureByDescription]
    (   
            @IncludeMaster  BIT,
            @ProcedureSearchPhrases CPTFavorite READONLY
    
    )
    AS
    
        DECLARE @myIncludeMaster    BIT;
    
        SET @myIncludeMaster = @IncludeMaster;
    
        CREATE TABLE #DistinctMatchingCpts
        (
        procedureCode   VARCHAR(50),
        description     VARCHAR(100),
        category        VARCHAR(50),
        rvu     VARCHAR(50),
        charge      VARCHAR(15),
        active      VARCHAR(15),
        sourceTable   VARCHAR(50),
        sequenceSet VARCHAR(2)
        )
    
        IF @myIncludeMaster = 0
            BEGIN -- Excluding master from search   
              INSERT INTO #DistinctMatchingCpts (sourceTable, procedureCode, description    ,   category  ,charge, active, rvu, sequenceSet
    ) 
          SELECT DISTINCT sourceTable, procedureCode, description, category ,charge, active, rvu, sequenceSet
              FROM (
                      SELECT TOP 1
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''01'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] = PP.[LEVEL]
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                      ORDER BY PP.CODE
    
              UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM([CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable, 
                          ''02'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
              UNION ALL
    
                SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''03'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                ) AS CPTS
                ORDER BY 
                     procedureCode, sourceTable, [description]
            END -- Excluded master from search
        ELSE
            BEGIN -- Including master in search, but present favorites before master for each code
                -- Get matching procedures, ordered by code, source (favorites first), and description.
                -- There probably will be procedures with duplicated code+description, so we will filter
                -- duplicates shortly.
          INSERT INTO #DistinctMatchingCpts (sourceTable, procedureCode, description    ,   category  ,charge, active, rvu, sequenceSet) 
          SELECT DISTINCT sourceTable, procedureCode, description, category ,charge, active, rvu, sequenceSet
              FROM (
                      SELECT TOP 1
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''00'' AS sequenceSet
                    FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] = PP.[LEVEL]
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                      ORDER BY PP.CODE
    
                      UNION ALL
    
                      SELECT TOP 1
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''2MasterCPT'' AS sourceTable,
                          ''00'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [MASTERCPT] AS CPT
                          ON CPT.[LEVEL] = PP.[LEVEL]
                      WHERE 
                          CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                      ORDER BY PP.CODE
    
                      UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''01'' AS sequenceSet
                    FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] = PP.[LEVEL]
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                      UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''2MasterCPT'' AS sourceTable,
                          ''01'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [MASTERCPT] AS CPT
                          ON CPT.[LEVEL] = PP.[LEVEL]
                      WHERE 
                          CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                      UNION ALL
    
                      SELECT TOP 1
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''02'' AS sequenceSet
                    FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                      ORDER BY PP.CODE
    
                      UNION ALL
    
                      SELECT TOP 1
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''2MasterCPT'' AS sourceTable,
                          ''02'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [MASTERCPT] AS CPT
                          ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                      WHERE 
                          CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                      ORDER BY PP.CODE
    
                      UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''03'' AS sequenceSet
                    FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                      UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''2MasterCPT'' AS sourceTable,
                          ''03'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [MASTERCPT] AS CPT
                          ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                      WHERE 
                          CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                      UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[COMBO])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          ''True'' AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''0CPTMore'' AS sourceTable,
                          ''04'' AS sequenceSet
                    FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [CPTMORE] AS CPT
                          ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
                      WHERE 
                          (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                          AND CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                      UNION ALL
    
                      SELECT 
                          LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                          LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                          LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                          LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                          COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                          LTRIM(RTRIM([RVU])) AS rvu,
                          ''2MasterCPT'' AS sourceTable,
                          ''04'' AS sequenceSet
                      FROM 
                        @ProcedureSearchPhrases PP
                        INNER JOIN  [MASTERCPT] AS CPT
                          ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
                      WHERE 
                          CPT.[CODE] IS NOT NULL
                          AND CPT.[CODE] NOT IN (''0'', '''')
                        AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
    
                 ) AS CPTS 
    
                ORDER BY 
                     sequenceSet, sourceTable, [description]
    
            END
    
            /* Final select - uses artificial ordering from the insertion ORDER BY */
            SELECT procedureCode, description,  category, rvu, charge, active FROM
            ( 
            SELECT TOP 500 *-- procedureCode, description,  category, rvu, charge, active
            FROM #DistinctMatchingCpts
            ORDER BY sequenceSet, sourceTable, description
    
            ) AS CPTROWS
    
            DROP TABLE #DistinctMatchingCpts
    

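However, this does not meet my best-match-by-word-count criterion (as with the row 1 value in the sample): each row should be ranked by the best (highest) word count found for it.

I have full control over the form/format of the table-valued parameter, if that makes a difference.

I am returning this result to a c# program, if that is useful.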
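Since the shape of the table-valued parameter is under the asker's control, one option is to carry a word count with each pattern and keep, per description, the highest count among the patterns that match. The sketch below is an assumption-laden illustration: the WordCount column, the table variables and the sample descriptions are all hypothetical.

```sql
-- A few of the pattern rows from the question, with a hypothetical WordCount column added.
DECLARE @Phrases TABLE (Code int PRIMARY KEY, Pattern varchar(100), WordCount int);
INSERT @Phrases VALUES (3,  'Gastric%Intubation%Aspiration', 3);
INSERT @Phrases VALUES (4,  'Gastric%Intubation',            2);
INSERT @Phrases VALUES (5,  'Gastric',                       1);
INSERT @Phrases VALUES (8,  'Intubation%Aspiration',         2);
INSERT @Phrases VALUES (9,  'Intubation',                    1);
INSERT @Phrases VALUES (12, 'Aspiration',                    1);

-- Invented descriptions to rank.
DECLARE @Descriptions TABLE (Description varchar(100));
INSERT @Descriptions VALUES ('Gastric Intubation & Aspiration');
INSERT @Descriptions VALUES ('Intubation with Aspiration');
INSERT @Descriptions VALUES ('Gastric bypass');
INSERT @Descriptions VALUES ('Appendectomy');   -- matches nothing, so it is not returned

SELECT d.Description,
       MAX(p.WordCount) AS BestWordCount       -- best (most words) pattern that matched
FROM @Descriptions d
    INNER JOIN @Phrases p ON d.Description LIKE '%' + p.Pattern + '%'
GROUP BY d.Description
ORDER BY BestWordCount DESC, d.Description;
```

Grouping by description also removes the repeated rows that arise when one description matches several patterns.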

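Best answer

You need to be able to split strings to solve this problem. I prefer the number table approach to splitting a string in T-SQL.

For my code below to work (as well as my split function), you need to do this one-time table setup: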

    SELECT TOP 10000 IDENTITY(int,1,1) AS Number
        INTO Numbers
        FROM sys.objects s1
        CROSS JOIN sys.objects s2
    ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
    

Once the Numbers table is set up, create this split function:
    CREATE FUNCTION [dbo].[FN_ListToTable]
    (
         @SplitOn  char(1)      --REQUIRED, the character to split the @List string on
        ,@List     varchar(8000)--REQUIRED, the list to split apart
    )
    RETURNS TABLE
    AS
    RETURN 
    (
    
        ----------------
        --SINGLE QUERY-- --this will not return empty rows
        ----------------
        SELECT
            ListValue
            FROM (SELECT
                      LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
                      FROM (
                               SELECT @SplitOn + @List + @SplitOn AS List2
                           ) AS dt
                          INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
                      WHERE SUBSTRING(List2, number, 1) = @SplitOn
                 ) dt2
            WHERE ListValue IS NOT NULL AND ListValue!=''
    
    );
    GO 
    

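Feel free to write your own split function, but you will still need the Numbers table for my solution to work.

You can now easily split a CSV string into a table and join on it: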
    select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')
    

Output:
    ListValue
    -----------------------
    1
    2
    3
    4
    5
    6777
    
    (6 row(s) affected)
    

Now try this:
    DECLARE @BaseTable table (RowID int primary key, RowValue varchar(100))
    set nocount on
    INSERT @BaseTable VALUES ( 1,'The cows came home empty handed')
    INSERT @BaseTable VALUES ( 2,'This is my string as a test template to rank')                           -- third
    INSERT @BaseTable VALUES ( 3,'pencil pen paperclip eraser')
    INSERT @BaseTable VALUES ( 4,'wow')
    INSERT @BaseTable VALUES ( 5,'no dice here')
    INSERT @BaseTable VALUES ( 6,'This is my string as a test template to rank on even though not exact.') -- second
    INSERT @BaseTable VALUES ( 7,'apple banana pear grape lemon orange kiwi strawberry peach watermellon')
    INSERT @BaseTable VALUES ( 8,'This is my string as a test template')                                   -- 5th
    INSERT @BaseTable VALUES ( 9,'rat cat bat mat sat fat hat pat ')
    INSERT @BaseTable VALUES (10,'house home pool roll')
    INSERT @BaseTable VALUES (11,'This is my string as a test template to')                                -- 4th
    INSERT @BaseTable VALUES (12,'talk wisper yell scream sing hum')
    INSERT @BaseTable VALUES (13,'This is my string as a test template to rank on.')                       -- first
    INSERT @BaseTable VALUES (14,'aaa bbb ccc ddd eee fff ggg hhh')
    INSERT @BaseTable VALUES (15,'three twice three once twice three')
    set nocount off
    
    DECLARE @SearchValue varchar(100)
    SET @SearchValue='This is my value string as a test template to rank on.'
    
    ;WITH SplitBaseTable AS --expand each @BaseTable row into one row per word
    (SELECT
         b.RowID, b.RowValue, s.ListValue
         FROM @BaseTable b
             CROSS APPLY  dbo.FN_ListToTable(' ',b.RowValue) AS s
    )
    , WordMatchCount AS --for each @BaseTable row that has a word in common with the search string, get the count of matching words
    (SELECT
         s.RowID,COUNT(*) AS CountOfWordMatch
         FROM dbo.FN_ListToTable(' ',@SearchValue) v
             INNER JOIN SplitBaseTable             s ON v.ListValue=s.ListValue
         GROUP BY s.RowID
         HAVING COUNT(*)>0
    )
    , SearchLen AS --get one row for each possible length of the search string
    (
    SELECT
        n.Number,SUBSTRING(@SearchValue,1,n.Number) AS PartialSearchValue
        FROM Numbers n
        WHERE n.Number<=LEN(@SearchValue)
    )
    , MatchLen AS --for each @BaseTable row, get the max starting length that matches the search string
    (
     SELECT
         b.RowID,MAX(l.Number) MatchStartLen
         FROM @BaseTable                 b
             LEFT OUTER JOIN SearchLen   l ON LEFT(b.RowValue,l.Number)=l.PartialSearchValue
         GROUP BY b.RowID
    )
    SELECT --return the final search results
        b.RowValue,w.CountOfWordMatch,m.MatchStartLen
        FROM @BaseTable                     b
            LEFT OUTER JOIN WordMatchCount  w ON b.RowID=w.RowID
            LEFT OUTER JOIN MatchLen        m ON b.RowID=m.RowID
        WHERE w.CountOfWordMatch>0
        ORDER BY w.CountOfWordMatch DESC,m.MatchStartLen DESC,LEN(b.RowValue) DESC,b.RowValue ASC
    

Output:
    RowValue                                                                CountOfWordMatch MatchStartLen
    ----------------------------------------------------------------------- ---------------- -------------
    This is my string as a test template to rank on.                        11               11
    This is my string as a test template to rank on even though not exact.  10               11
    This is my string as a test template to rank                            10               11
    This is my string as a test template to                                 9                11
    This is my string as a test template                                    8                11
    
    (5 row(s) affected)
    

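It handles the start-of-string match a little differently: it looks at the number of characters that match the start of the string rather than the number of words.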
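If whole leading words are preferred over leading characters, one possible variation is to test the prefix only at word boundaries of the search string. This is a sketch rather than part of the answer: the inline number list, the sample rows and the name LeadingWordsMatched are assumptions.

```sql
DECLARE @SearchValue varchar(100);
SET @SearchValue = 'This is my value string as a test template to rank on.';

-- Hypothetical rows to compare against the search string.
DECLARE @BaseTable TABLE (RowID int PRIMARY KEY, RowValue varchar(100));
INSERT @BaseTable VALUES (1, 'This is my string as a test template');  -- 3 leading words match
INSERT @BaseTable VALUES (2, 'This is not my string');                 -- 2 leading words match
INSERT @BaseTable VALUES (3, 'Thistle is my string');                  -- 0: 'This' is only a partial word

;WITH N AS   -- small inline number list, one row per character position
(
    SELECT 1 AS Number
    UNION ALL
    SELECT Number + 1 FROM N WHERE Number < LEN(@SearchValue) + 1
)
SELECT b.RowValue, COUNT(num.Number) AS LeadingWordsMatched
FROM @BaseTable b
    LEFT OUTER JOIN N num
        ON SUBSTRING(@SearchValue + ' ', num.Number, 1) = ' '   -- position ends a search word
       AND LEFT(b.RowValue + ' ', num.Number) = LEFT(@SearchValue + ' ', num.Number)
GROUP BY b.RowValue
ORDER BY LeadingWordsMatched DESC;
```

Each matching word boundary contributes one to the count, so a row that merely shares leading characters with the search string, such as the third one, scores zero here while the character-based approach would give it 4.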

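Once you have this working, you can try to optimize it by creating a static, indexed table for SplitBaseTable, possibly maintained by a trigger on the real table behind @BaseTable.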
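A rough sketch of what such a static word table might look like. It assumes the Numbers table and dbo.FN_ListToTable from above already exist; dbo.BaseTable and dbo.BaseTableWords are hypothetical names standing in for the real description table and its pre-split copy.

```sql
-- Hypothetical permanent tables; requires dbo.FN_ListToTable and Numbers from the answer.
CREATE TABLE dbo.BaseTable      (RowID int PRIMARY KEY, RowValue varchar(100));
CREATE TABLE dbo.BaseTableWords (RowID int NOT NULL, Word varchar(100) NOT NULL);
CREATE CLUSTERED INDEX IX_BaseTableWords ON dbo.BaseTableWords (Word, RowID);
GO

-- Rebuild the word table after dbo.BaseTable is loaded or changed
-- (the same statements could be moved into a trigger).
DELETE dbo.BaseTableWords;

INSERT dbo.BaseTableWords (RowID, Word)
SELECT b.RowID, s.ListValue
FROM dbo.BaseTable b
    CROSS APPLY dbo.FN_ListToTable(' ', b.RowValue) AS s;
```

The WordMatchCount step could then join the split search words directly to dbo.BaseTableWords on Word instead of splitting every row at query time, which suits data that is described as fairly static.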

Regarding c# - SQL Server: best way to match a word phrase and order by relevance, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/6493589/
