I have a table that is accessible over an Oracle database link, and, for reasons, I am trying to pull it into a local database table.
MERGE INTO MEMBERSHIPS LOCAL
USING (
SELECT DISTINCT
REMOTE.GROUP_NAME "GROUP_NAME",
REMOTE.USER_ACCOUNT "USERNAME",
REMOTE.SOME_OTHER_COLUMN "COL3"
FROM MEMBERSHIPS@link REMOTE
) REMOTE
ON (
REMOTE.GROUP_NAME = LOCAL.GROUP_NAME AND
REMOTE.USERNAME = LOCAL.USERNAME
)
WHEN MATCHED THEN
UPDATE SET
LOCAL.COL3 = REMOTE.COL3,
LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
INSERT (ID, GROUP_NAME, USERNAME, COl3, CREATED_AT, UPDATED_AT)
VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COl3, sysdate, sysdate)
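Most unfortunately, the owner of the source database does not lose sleep over data integrity, so out of 3 million rows there are 71 duplicates, and they blow up my unique index on (GROUP_NAME, USERNAME). If I drop the unique constraint, the merge goes through, but those rows then blow up on subsequent runs of the query with:
ORA-30926: unable to get a stable set of rows in the source tables
This is something that will run every day, so I need to find a way to ignore the duplicates.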
Edit:
I had thought DISTINCT would take care of this for me, but it does not. I still get duplicates:
SELECT DISTINCT
REMOTE.GROUP_NAME,
REMOTE.USER_ACCOUNT,
COUNT(*)
FROM MEMBERSHIPS@link REMOTE
GROUP BY
REMOTE.GROUP_NAME,
REMOTE.USER_ACCOUNT
HAVING COUNT(*) > 1;
This shows 71 GROUP_NAME/USER_ACCOUNT combinations that are still duplicated.
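One likely explanation, offered as a guess rather than a diagnosis: DISTINCT compares the entire select list, so two rows that share GROUP_NAME and USER_ACCOUNT but differ in SOME_OTHER_COLUMN are both kept. A small sketch for listing the offending rows in full, assuming only the columns already named above (the alias DUP_COUNT is made up for illustration):

```sql
SELECT GROUP_NAME, USER_ACCOUNT, SOME_OTHER_COLUMN, DUP_COUNT
FROM (
  SELECT
    R.GROUP_NAME,
    R.USER_ACCOUNT,
    R.SOME_OTHER_COLUMN,
    -- number of remote rows sharing this GROUP_NAME/USER_ACCOUNT pair
    COUNT(*) OVER (PARTITION BY R.GROUP_NAME, R.USER_ACCOUNT) AS DUP_COUNT
  FROM MEMBERSHIPS@link R
)
WHERE DUP_COUNT > 1
ORDER BY GROUP_NAME, USER_ACCOUNT;
```

Seeing the duplicated pairs side by side should make it clear which column differs between them, and therefore which row is worth keeping.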
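Best answer
In situations like this, you can always try ranking the rows to avoid duplicates, instead of using DISTINCT.
In this case, it would look something like this: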
MERGE INTO MEMBERSHIPS LOCAL
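USING (
SELECT GROUP_NAME, USERNAME, COL3
FROM (
SELECT rank() over(partition by REMOTE.GROUP_NAME
,REMOTE.USER_ACCOUNT
order by NVL(REMOTE.UPDATED_AT,REMOTE.CREATED_AT) DESC NULLS LAST) r,
REMOTE.GROUP_NAME "GROUP_NAME",
REMOTE.USER_ACCOUNT "USERNAME",
REMOTE.SOME_OTHER_COLUMN "COL3"
FROM MEMBERSHIPS@link REMOTE
)
-- keep only the top-ranked row per GROUP_NAME/USER_ACCOUNT; filtering here
-- rather than in ON keeps lower-ranked rows out of the WHEN NOT MATCHED branch
WHERE r = 1
) REMOTE
ON (
REMOTE.GROUP_NAME = LOCAL.GROUP_NAME
AND REMOTE.USERNAME = LOCAL.USERNAME
)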
WHEN MATCHED THEN
UPDATE SET
LOCAL.COL3 = REMOTE.COL3,
LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
INSERT (ID, GROUP_NAME, USERNAME, COl3, CREATED_AT, UPDATED_AT)
VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COl3, sysdate, sysdate);
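One caveat: rank() gives tied rows the same rank, so two duplicates whose timestamps are identical (or both NULL) would both rank first and could still trigger ORA-30926. Below is a minimal sketch of a variant, assuming the same tables and columns as the statement above. It swaps in row_number(), which numbers the rows of each partition uniquely, and applies the row-number filter inside the USING subquery so that the discarded duplicates never reach the WHEN NOT MATCHED branch. The alias RN is made up, and using SOME_OTHER_COLUMN as the final tie-breaker is purely an assumption for illustration; any column that makes the ordering deterministic would do.

```sql
MERGE INTO MEMBERSHIPS LOCAL
USING (
  SELECT GROUP_NAME, USERNAME, COL3
  FROM (
    SELECT
      -- row_number() never repeats within a partition, unlike rank()
      ROW_NUMBER() OVER (
        PARTITION BY R.GROUP_NAME, R.USER_ACCOUNT
        ORDER BY NVL(R.UPDATED_AT, R.CREATED_AT) DESC NULLS LAST,
                 R.SOME_OTHER_COLUMN  -- assumed tie-breaker, for illustration only
      ) AS RN,
      R.GROUP_NAME        AS GROUP_NAME,
      R.USER_ACCOUNT      AS USERNAME,
      R.SOME_OTHER_COLUMN AS COL3
    FROM MEMBERSHIPS@link R
  )
  WHERE RN = 1  -- exactly one source row per GROUP_NAME/USER_ACCOUNT pair
) REMOTE
ON (
  REMOTE.GROUP_NAME = LOCAL.GROUP_NAME
  AND REMOTE.USERNAME = LOCAL.USERNAME
)
WHEN MATCHED THEN
  UPDATE SET
    LOCAL.COL3 = REMOTE.COL3,
    LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
  INSERT (ID, GROUP_NAME, USERNAME, COL3, CREATED_AT, UPDATED_AT)
  VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COL3, sysdate, sysdate);
```

Because the source now yields at most one row per key, the unique index on (GROUP_NAME, USERNAME) can stay in place for the daily run.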
Regarding Oracle MERGE ignoring duplicates, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30086405/