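We run PostgreSQL (v9.5) as the serving database in a variant of the Kappa architecture:
- Each instance of a computation job creates and populates its own result table, e.g. "t_jobResult_instanceId".
- Once a job finishes, its output table becomes available for access. Multiple result tables of the same job type may be in use at the same time.
- When an output table is no longer needed, it is dropped.
Computation results are not the only kind of table in this database instance, and we need to take regular hot backups. That is where our problem lies: when tables come and go, pg_dump dies. Here is a simple test that reproduces our failure mode (it involves two sessions, S1 and S2):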
S1 : psql -U postgres -d myuser
create table t1 ( a int );
begin transaction;
drop table t1;
S2 : pg_dump -Fc -v -U postgres -d myuser -f /tmp/rs.dump
S1 : commit;
Session S2 now shows the following error:
pg_dump -Fc -U postgres -d myuser -f /tmp/rs.dump
pg_dump: [archiver (db)] query failed: ERROR: relation "public.t1" does not exist
pg_dump: [archiver (db)] query was: LOCK TABLE public.t1 IN ACCESS SHARE MODE
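We have thought of a couple of solutions, but we do not like either of them:
- Put all result tables into a separate schema and exclude that schema from the backup. We like the simplicity, but this approach breaks modularity: our database objects are grouped into schemas by vertical slice.
- Write application code that pauses table drops for the duration of a backup. We wonder whether there is a simpler solution.
We like the following idea, but could not implement it:
- Our result tables follow a naming convention. We can write a regular expression that determines whether a table name refers to a result table. Ideally, we would be able to run pg_dump with an argument instructing it to skip tables matching this pattern (note that selecting the tables to exclude at the start of the backup is not good enough, because new result tables may be created and dropped while pg_dump is running). Either this is impossible, or we are not smart enough to figure out how to do it.
Sorry for the lengthy background, but now I finally get to the questions:
- Is there a way to implement 3. (the pattern-based exclusion) that we missed?
- Any better ideas?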
Best answer
This should be possible using the -T option of pg_dump:
-T <strong><em>table</em></strong>
--exclude-table=<strong><em>table</em></strong>
Do not dump any tables matching the <strong><em>table</em></strong> pattern.
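Applied to the question, the invocation could look roughly like the sketch below. It assumes the result tables were created with unquoted names (pg_dump folds unquoted pattern letters to lower case, just as PostgreSQL folds unquoted identifiers), and that they may live in any schema, hence the `*.` prefix. The function name `build_dump_cmd` and the variable `EXCLUDE_PATTERN` are made up for illustration; the database name and output path are the ones from the question. The function only prints the command so it can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical pattern: any schema, any table whose name starts with t_jobResult_.
# Unquoted letters are folded to lower case, so this matches t_jobresult_... names.
EXCLUDE_PATTERN='*.t_jobResult_*'

# Print the pg_dump invocation instead of running it.
# The pattern is wrapped in single quotes so the shell does not expand the wildcards.
build_dump_cmd() {
    printf "pg_dump -Fc -U postgres -d myuser -T '%s' -f /tmp/rs.dump\n" "$EXCLUDE_PATTERN"
}

build_dump_cmd
```

Once the printed command looks right, it can be executed with `eval "$(build_dump_cmd)"`. If the tables were instead created with quoted mixed-case names, the fixed part of the pattern would need double quotes, with the `*` left outside them.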
The psql documentation contains details about these patterns:
Within a pattern, * matches any sequence of characters (including no characters) and ? matches any single character. (This notation is comparable to Unix shell file name patterns.) For example, \dt int* displays tables whose names begin with int. But within double quotes, * and ? lose these special meanings and are just matched literally.
A pattern that contains a dot (.) is interpreted as a schema name pattern followed by an object name pattern. For example, \dt foo*.*bar* displays all tables whose table name includes bar that are in schemas whose schema name starts with foo. When no dot appears, then the pattern matches only objects that are visible in the current schema search path. Again, a dot within double quotes loses its special meaning and is matched literally.
Advanced users can use regular-expression notations such as character classes, for example [0-9] to match any digit. All regular expression special characters work as specified in Section 9.7.3, except for . which is taken as a separator as mentioned above, * which is translated to the regular-expression notation .*, ? which is translated to ., and $ which is matched literally. You can emulate these pattern characters at need by writing ? for ., (R+|) for R*, or (R|) for R?. $ is not needed as a regular-expression character since the pattern must match the whole name, unlike the usual interpretation of regular expressions (in other words, $ is automatically appended to your pattern). Write * at the beginning and/or end if you don't wish the pattern to be anchored. Note that within double quotes, all regular expression special characters lose their special meanings and are matched literally.
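To get a feel for which names a given pattern would exclude, the translation rules quoted above can be approximated in a few lines of shell. This is only a rough emulation, not psql's or pg_dump's actual implementation: it ignores the dot separator and double quotes, and it relies on grep's extended regular expressions rather than PostgreSQL's own regex engine. The helper names `pattern_to_regex` and `matches` are invented for this sketch.

```shell
#!/bin/sh
# Translate a simple psql-style pattern (no dots, no double quotes) to an anchored regex:
# fold to lower case, make $ literal, turn * into .* and ? into ., then wrap in ^( )$.
pattern_to_regex() {
    printf '%s' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/\$/\\$/g' -e 's/\*/.*/g' -e 's/?/./g' -e 's/^/^(/' -e 's/$/)$/'
}

# Succeeds if the name in $2 matches the pattern in $1 as a whole.
matches() {
    printf '%s\n' "$2" | grep -Eq "$(pattern_to_regex "$1")"
}

# Try the hypothetical result-table pattern against a few sample names.
for name in t_jobresult_42 t_jobresult_ orders; do
    if matches 't_jobResult_*' "$name"; then
        echo "$name: excluded"
    else
        echo "$name: dumped"
    fi
done
```

The sample names are written in lower case on purpose, since that is how PostgreSQL stores identifiers that were created without double quotes.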
Regarding "regex - Design: running pg_dump when tables are continuously created and dropped", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54319721/