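So I have SQL that adds a code to a field when it detects a duplicate. There is another field called DS.
DS can be "Yes" or "No".
If a duplicate is found, how can I make it so that the "Yes" rows are not coded and the "No" rows are?
Essentially, "Yes" takes priority.
My SQL: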
WITH cte
AS (SELECT *,
Row_Number() OVER(partition BY fips_county_code, last, suffix, first, birthdate Order by (select null)) AS Rn
FROM [PULLED REC])
UPDATE cte
SET BAD_CODES = Isnull(BAD_CODES, '') + 'D'
WHERE RN > 1;
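Best answer
To update only the rows where ds='No', you can add that condition to the where clause.
To make sure that rn > 1 does not skip one of the duplicates that needs to be updated, you can use exists(), or count(*) over() as an alternative: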
with cte as (
  select
      *
    , rn = row_number() over (
          partition by fips_county_code, last, suffix, first, birthdate
          order by (case when DS = 'Yes' then 0 else 1 end) asc
        )
  from [pulled rec]
)
/* -- check with select first -- */
select * from cte
/*
update cte set
  bad_codes = isnull(bad_codes, '') + 'D'
--*/
/* -- Update all records that have a duplicate
   -- except the First row, ordered by ds='Yes' first */
/*
where cte.ds = 'No'
  and cte.rn > 1
--*/
-- Update all records that have a duplicate and ds='No' --
--/*
where cte.ds = 'No'
  and exists (
    select 1
    from cte as i
    where i.rn > 1
      and i.fips_county_code = cte.fips_county_code
      and i.last = cte.last
      and i.suffix = cte.suffix
      and i.first = cte.first
      and i.birthdate = cte.birthdate
  );
--*/
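To see why the plain rn > 1 filter can skip a row, here is a minimal sketch on made-up data. The table variable @pulled, its id column, and the sample rows are assumptions for illustration only and are not part of the original question; the suffix is stored as an empty string rather than NULL to keep the equality comparisons simple.

```sql
-- Hypothetical sample data; @pulled stands in for [pulled rec].
declare @pulled table (
    id               int identity(1,1) primary key,
    fips_county_code varchar(5),
    [last]           varchar(50),
    suffix           varchar(10),
    [first]          varchar(50),
    birthdate        date,
    DS               varchar(3)
);

insert into @pulled (fips_county_code, [last], suffix, [first], birthdate, DS)
values
    ('001', 'Smith', '', 'Ann', '1980-01-01', 'No'),   -- duplicate group, both 'No'
    ('001', 'Smith', '', 'Ann', '1980-01-01', 'No'),
    ('002', 'Jones', '', 'Bob', '1975-05-05', 'Yes'),  -- duplicate group, 'Yes' + 'No'
    ('002', 'Jones', '', 'Bob', '1975-05-05', 'No'),
    ('003', 'Brown', '', 'Cy',  '1990-09-09', 'No');   -- no duplicate

with cte as (
    select
        *
      , rn = row_number() over (
            partition by fips_county_code, [last], suffix, [first], birthdate
            order by (case when DS = 'Yes' then 0 else 1 end) asc
        )
    from @pulled
)
select
    cte.id
  , cte.DS
  , cte.rn
    -- filter 1: ds='No' and rn > 1
  , flagged_by_rn = case when cte.DS = 'No' and cte.rn > 1 then 1 else 0 end
    -- filter 2: ds='No' and the group contains any row with rn > 1
  , flagged_by_exists = case when cte.DS = 'No' and exists (
        select 1
        from cte as i
        where i.rn > 1
          and i.fips_county_code = cte.fips_county_code
          and i.[last] = cte.[last]
          and i.suffix = cte.suffix
          and i.[first] = cte.[first]
          and i.birthdate = cte.birthdate
    ) then 1 else 0 end
from cte
order by cte.id;
```

In the Smith group, where both rows are 'No', only one of the two rows has rn > 1, so flagged_by_rn marks a single row while flagged_by_exists marks both. In the Jones group the 'Yes' row sorts first and is never flagged, and the unique Brown row is left alone by both filters.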
Alternate version using count(*) over():
with cte as (
  select
      *
    , CountOver = count(*) over (
          partition by fips_county_code, last, suffix, first, birthdate
        )
  from [pulled rec]
)
/* -- check with select first -- */
select * from cte
/*
update cte set
  bad_codes = isnull(bad_codes, '') + 'D'
--*/
where cte.ds = 'No'
  and cte.CountOver > 1
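One difference between the two versions that may matter if any of the key columns (for example suffix) can be NULL: partition by puts NULL values into the same group, whereas the = comparisons in the exists() version never match NULL to NULL. The sketch below illustrates this on a hypothetical table variable @t (not from the original question) and shows one NULL-safe comparison, exists (select a intersect select b), as a possible workaround.

```sql
-- Hypothetical data: two rows with a NULL suffix, two rows with 'Jr'.
declare @t table (id int, suffix varchar(10) null);

insert into @t (id, suffix)
values (1, null), (2, null), (3, 'Jr'), (4, 'Jr');

select
    a.id
    -- partition by groups the two NULL rows together
  , CountOver = count(*) over (partition by a.suffix)
    -- plain equality: NULL = NULL is not true, so NULL rows match nothing
  , matches_with_equals = (
        select count(*) from @t as i
        where i.suffix = a.suffix
    )
    -- intersect treats NULLs as equal, so NULL rows match each other
  , matches_null_safe = (
        select count(*) from @t as i
        where exists (select i.suffix intersect select a.suffix)
    )
from @t as a
order by a.id;
```

For ids 1 and 2 this returns CountOver = 2, matches_with_equals = 0, and matches_null_safe = 2; for ids 3 and 4 all three columns are 2. If the key columns are never NULL, the two versions of the answer should flag the same rows.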
Regarding sql - How to remove duplicates from a file while certain fields take priority?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41810402/