regex - Design: running pg_dump when tables are continuously created and dropped

Tags: regex postgresql pg-dump

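We run PostgreSQL (v9.5) as a serving database in a variant of the Kappa architecture:

  • Each instance of a computation job creates and populates its own result table, e.g. "t_jobResult_instanceId".
  • After a job completes, its output table becomes available for access. Multiple result tables for the same job type may be in use at the same time.
  • When an output table is no longer needed, it is dropped.

Computation results are not the only kind of table in this database instance, and we need to take periodic hot backups. Herein lies our problem: when tables come and go, pg_dump dies. Here is a simple test that reproduces our failure mode (it involves 2 sessions, S1 and S2):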

S1 : psql -U postgres -d myuser

create table t1 ( a int );
begin transaction;
drop table t1;

S2 : pg_dump -Fc -v -U postgres -d myuser -f /tmp/rs.dump

S1 : commit;

Session S2 now shows the following error:

pg_dump -Fc -U postgres -d myuser -f /tmp/rs.dump
pg_dump: [archiver (db)] query failed: ERROR: relation "public.t1" does not exist
pg_dump: [archiver (db)] query was: LOCK TABLE public.t1 IN ACCESS SHARE MODE

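We have thought of a couple of solutions, but we do not like either of them:

  1. Put all result tables into a separate schema and exclude that schema from the backup. We like the simplicity, but this approach breaks modularity: our database objects are grouped into schemas by vertical slice.
  2. Write application code that pauses table drops for the duration of the backup. We wonder whether there is a simpler solution.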

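We like the following idea, but have not been able to implement it:

  3. Our result tables follow a naming convention. We can write a regular expression that determines whether a table name refers to a result table. Ideally, we would be able to run pg_dump with arguments instructing it to skip tables matching this pattern (note that selecting the tables to exclude at the start of the backup is not good enough, because new result tables may be created and dropped while pg_dump is running). Either this is impossible, or we are not smart enough to figure out how to do it.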

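Sorry for the lengthy background, but now I have finally come to the questions:

  • Is there a way to implement 3. that we have missed?
  • Are there any better ideas?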

Best answer

This should be possible using the -T option of pg_dump:

-T table
--exclude-table=table
   Do not dump any tables matching the table pattern.
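
As a rough sketch of how this could look for the naming convention in the question, the wrapper below excludes every table whose name matches t_*_*. The function name backup_without_results, the PG_DUMP override used for dry runs, and the pattern itself are illustrative assumptions, not part of the original answer. The pattern is kept in a quoted variable so that the shell does not expand the wildcards before pg_dump sees them.

```shell
# Assumed exclusion pattern for result tables named t_<job>_<instance>.
# Adjust it to the real naming convention.
EXCLUDE_PATTERN='t_*_*'

# Hypothetical wrapper: $1 = database name, $2 = output file.
# Setting PG_DUMP=echo prints the arguments instead of running pg_dump.
backup_without_results() {
  ${PG_DUMP:-pg_dump} -Fc -U postgres -d "$1" -T "$EXCLUDE_PATTERN" -f "$2"
}
```

Calling backup_without_results myuser /tmp/rs.dump would then run the dump from the question with the extra -T argument. Whether this fully avoids the failure is worth confirming by repeating the two-session test with a table name that matches the pattern.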

The psql documentation contains details about these patterns:

Within a pattern, * matches any sequence of characters (including no characters) and ? matches any single character. (This notation is comparable to Unix shell file name patterns.) For example, \dt int* displays tables whose names begin with int. But within double quotes, * and ? lose these special meanings and are just matched literally.

A pattern that contains a dot (.) is interpreted as a schema name pattern followed by an object name pattern. For example, \dt foo*.*bar* displays all tables whose table name includes bar that are in schemas whose schema name starts with foo. When no dot appears, then the pattern matches only objects that are visible in the current schema search path. Again, a dot within double quotes loses its special meaning and is matched literally.

Advanced users can use regular-expression notations such as character classes, for example [0-9] to match any digit. All regular expression special characters work as specified in Section 9.7.3, except for . which is taken as a separator as mentioned above, * which is translated to the regular-expression notation .*, ? which is translated to ., and $ which is matched literally. You can emulate these pattern characters at need by writing ? for ., (R+|) for R*, or (R|) for R?. $ is not needed as a regular-expression character since the pattern must match the whole name, unlike the usual interpretation of regular expressions (in other words, $ is automatically appended to your pattern). Write * at the beginning and/or end if you don't wish the pattern to be anchored. Note that within double quotes, all regular expression special characters lose their special meanings and are matched literally.
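
To try a candidate pattern before using it in a backup, the rules quoted above can be roughly emulated in the shell. This is only a sketch of the documented translation (* becomes .*, ? becomes ., $ is literal, and the whole pattern is anchored); it is not pg_dump's actual code. It ignores case folding, double quotes, and the schema dot, and it uses grep -E, whose regex dialect matches PostgreSQL's only for simple patterns. The function names pattern_to_regex and pattern_matches are made up for this example.

```shell
# Translate a psql-style pattern into an anchored regular expression.
pattern_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/\$/\\$/g' \
    -e 's/\*/.*/g' \
    -e 's/?/./g' \
    -e 's/^/^(/' \
    -e 's/$/)$/'
}

# Succeed if table name $2 matches pattern $1 under the emulated rules.
pattern_matches() {
  printf '%s\n' "$2" | grep -Eq "$(pattern_to_regex "$1")"
}
```

For example, pattern_matches 't_*_*' t_jobresult_42 succeeds, while pattern_matches 'jobresult' t_jobresult_42 fails because the pattern has to match the whole name.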

Regarding "regex - Design: running pg_dump when tables are continuously created and dropped", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54319721/
