mysql - How do I merge multiple files using an auto-increment integer primary key?

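How can I set up a useful auto-incrementing integer primary key on a table when I need to join that table with separate files? I receive data like this every day: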

Interaction data:

Date | PersonID | DateTime | CustomerID | Other values...

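The primary key there is PersonID + DateTime + CustomerID. If I had an integer key instead, how could I relate it back to another table? I want to know which rows represent a specific person interacting with a specific customer, so that I can tie these pieces of data together into one master file.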

Survey return data:

Date | PersonID | DateTime | CustomerID | Other values...

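I usually process all of the raw data in pandas before loading it into the database. Some of the other files have no datetime stamp, only a date. It is rare for one person to interact with the same customer twice on the same day, so I normally drop every row that has a duplicate (all instances), which leaves the sample I join on completely unique.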

Other data:

Date | PersonID | CustomerID | Other values...

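I can't picture how to set this up so that I know row 56,547 in the "Interaction data" table matches row 10,982 in the "Survey return data" table. Or should I just keep doing what I do now and use the three-column composite key?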

Best answer

(I'm assuming PostgreSQL, since you've tag-spammed this post; translating this for other database systems is up to you.)

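It sounds like you're loading data that has a complex natural key such as (PersonID, DateTime, CustomerID), and you don't want to use that natural key in related tables, perhaps for storage-space reasons.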

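If so, for your secondary tables you might want to CREATE UNLOGGED TABLE a table that matches the raw input data and COPY the data into it. Then run an INSERT INTO ... SELECT ... into the final target table, joining on the natural key to map each row to its integer key.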

For example, in your case, say you have the table interaction:

CREATE TABLE interaction (
    interaction_id serial primary key,
    "PersonID" integer,
    "DateTime" timestamp,
    "CustomerID" integer,
    UNIQUE("PersonID", "DateTime", "CustomerID"),
    ...
);

The table survey_return then only holds a reference to interaction_id:

CREATE TABLE survey_return (
    survey_return_id serial primary key,
    -- inline column constraint: REFERENCES, without the FOREIGN KEY keywords
    interaction_id integer not null references interaction(interaction_id),
    col1 integer, -- data cols
    ...
);

Now create:

CREATE UNLOGGED TABLE survey_return_load (
    "PersonID" integer,
    "DateTime" timestamp,
    "CustomerID" integer,
    PRIMARY KEY ("PersonID", "DateTime", "CustomerID"),
    col1 integer, -- data cols
    ...
);

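COPY your data into it, then run an INSERT INTO ... SELECT ... that joins the loaded data against the interaction table and inserts the result using the derived interaction_id instead of the original natural key: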

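-- List the target columns so survey_return_id is filled by its serial default
INSERT INTO survey_return (interaction_id, col1, ...)
SELECT i.interaction_id, l.col1, ...
FROM survey_return_load l
   LEFT JOIN interaction i ON ( (i."PersonID", i."DateTime", i."CustomerID") = (l."PersonID", l."DateTime", l."CustomerID") );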

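If the input survey returns contain natural-key tuples that do not appear in the interaction table, this will fail with a null violation: the LEFT JOIN produces a NULL interaction_id for those rows, and survey_return.interaction_id is declared not null.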
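As a rough, self-contained illustration of the staging-table flow, here is a sketch in Python using the standard-library sqlite3 module as a stand-in for PostgreSQL. That swap is an assumption made only so the example runs anywhere: SQLite has no UNLOGGED tables or COPY, so plain tables and INSERT statements take their place. The helper names unmatched_keys and load_survey_returns, the single data column col1, and the sample rows are all made up for the example.

```python
import sqlite3

# Schema mirroring the tables above (SQLite types; an assumption, not PostgreSQL DDL).
SCHEMA = """
CREATE TABLE interaction (
    interaction_id INTEGER PRIMARY KEY AUTOINCREMENT,
    "PersonID" INTEGER,
    "DateTime" TEXT,
    "CustomerID" INTEGER,
    UNIQUE ("PersonID", "DateTime", "CustomerID")
);
CREATE TABLE survey_return (
    survey_return_id INTEGER PRIMARY KEY AUTOINCREMENT,
    interaction_id INTEGER NOT NULL REFERENCES interaction(interaction_id),
    col1 INTEGER
);
CREATE TABLE survey_return_load (
    "PersonID" INTEGER,
    "DateTime" TEXT,
    "CustomerID" INTEGER,
    col1 INTEGER,
    PRIMARY KEY ("PersonID", "DateTime", "CustomerID")
);
"""

# Join condition on the three-column natural key, shared by both helpers.
KEY_JOIN = """
    LEFT JOIN interaction i
      ON i."PersonID" = l."PersonID"
     AND i."DateTime" = l."DateTime"
     AND i."CustomerID" = l."CustomerID"
"""


def unmatched_keys(conn):
    """Return staged natural keys that have no matching interaction row."""
    sql = (
        'SELECT l."PersonID", l."DateTime", l."CustomerID" '
        "FROM survey_return_load l" + KEY_JOIN +
        "WHERE i.interaction_id IS NULL"
    )
    return conn.execute(sql).fetchall()


def load_survey_returns(conn):
    """Insert staged rows into survey_return, replacing the natural key
    with the interaction_id found through the join."""
    sql = (
        "INSERT INTO survey_return (interaction_id, col1) "
        "SELECT i.interaction_id, l.col1 "
        "FROM survey_return_load l" + KEY_JOIN
    )
    conn.execute(sql)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Two known interactions; they receive interaction_id values in insert order.
conn.executemany(
    'INSERT INTO interaction ("PersonID", "DateTime", "CustomerID") VALUES (?, ?, ?)',
    [(1, "2015-06-01 09:00:00", 100), (2, "2015-06-01 10:30:00", 200)],
)

# Stage one survey return that matches the second interaction, then load it.
conn.execute(
    "INSERT INTO survey_return_load VALUES (?, ?, ?, ?)",
    (2, "2015-06-01 10:30:00", 200, 5),
)
missing = unmatched_keys(conn)
load_survey_returns(conn)
rows = conn.execute("SELECT interaction_id, col1 FROM survey_return").fetchall()

# Stage a survey return whose key has no interaction: the load is rejected.
conn.execute("DELETE FROM survey_return_load")
conn.execute(
    "INSERT INTO survey_return_load VALUES (?, ?, ?, ?)",
    (3, "2015-06-02 08:00:00", 300, 4),
)
still_missing = unmatched_keys(conn)
try:
    load_survey_returns(conn)
    failed = False
except sqlite3.IntegrityError:
    failed = True  # NULL interaction_id hit the NOT NULL constraint
```

Running the anti-join check before the insert tells you which staged keys would trigger the null violation, so you can decide whether to drop those rows or load the missing interactions first.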

The original question is on Stack Overflow: https://stackoverflow.com/questions/31065388/
