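I have a large number of Linux servers to maintain. I regularly need to run a script (script.sh) on all of them to collect health status; the script typically takes about 30-40 seconds to produce its output. To make this easier, I am writing a shell script that loops over all remote hosts with SSH, runs script.sh, collects the output, and writes it to a log file on the local host. For the purposes of this question, I will call this script MyScript.sh.
The script works, but it has to wait for the SSH output from one host before moving on to the next. Because I have so many servers and the commands run sequentially, a full run takes several minutes. I would like to loop over all servers in parallel, without waiting for each host to respond.
Is there a way to use MyScript.sh to run script.sh remotely on all hosts at the same time? Perhaps by running the ssh commands in the background and collecting the output somehow?
The output of script.sh is a single pipe-delimited line, for example: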
host1|49 days|10%|3.77%|27677/63997 MB|43% - /usr|38% - /usr|Optimal|No|40%|No
The output of MyScript.sh is the concatenation of the output from all hosts, with the pipes removed:
Date Hostname Uptime CPU I/O Free MEM File System INODES STATUS WWW YYY ZZZ XXX
===================================================================================================================================================================================================
01/31/20 host1 44 days 5% 10.33% 38083/64000 MB 57% - / 37% - /usr OPTIMAL No 40% No
01/31/20 host2 45 days 11% 1.79% 27915/63997 MB 43% - /usr 38% - /usr OPTIMAL UP 7% OK
01/31/20 host3 45 days 2% 1.89% 32145/63997 MB 43% - /usr 38% - /usr OPTIMAL UP NO OK
01/31/20 host4 45 days 11% 3.72% 52477/128637 MB 49% - /var 38% - /usr OPTIMAL UP 8% OK
01/31/20 host5 45 days 6% 3.21% 65264/128637 MB 46% - /var 38% - /usr OPTIMAL UP NO OK
01/31/20 host6 45 days 7% 5.79% 56369/63997 MB 43% - /usr 38% - /usr OPTIMAL UP NO No
01/31/20 host7 45 days 6% 1.66% 56391/63997 MB 43% - /var 38% - /usr OPTIMAL UP NO No
The core of MyScript.sh is as follows:
(
for ip in $IP_LIST;
do
    echo "Checking $ip"
    ssh -q -t $user@$ip 'sudo /tmp/script.sh' > /tmp/$$
    current_date=$(date +%D)
    printf "%-10s " "$current_date" >> $logfile
    while read line;
    do
        echo $line | awk -F '|' '{printf("%-10s %-10s %-7s %-8s %-18s %-25s %-25s %-15s %-15s %-25s %-10s\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11); }' >> $logfile
    done< /tmp/$$
done
)
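In short, I would like to optimize this script so that the code above runs on multiple servers simultaneously. Thanks!
Best answer
One solution might be to deploy monitoring software with custom checks.
For the parallel ssh problem itself, you can use this script that I wrote a while ago; it does not require installing any binaries.
Put it in a file named mssh, run chmod u+x mssh, and then: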
./mssh -s SERVER1 -s SERVER2 -C script.sh
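For more than a handful of servers, the -S option listed in the script's help text reads the hosts from a file instead. The sketch below builds a hypothetical hosts file (the temporary file and the host names are placeholders, not from the original answer) and applies the same grep filter that the script uses for -S, to show which lines would be kept.

```shell
#!/usr/bin/env bash
# Hypothetical hosts file; host1..host3 are placeholder names.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
# web tier
host1
host2

  # indented comment
host3
EOF

# Same filter that mssh applies for -S: drop comment lines and blank lines.
grep -vE '^ *(#|$)' "$hosts_file"
```

This prints host1, host2 and host3, one per line. The file could then be passed as ./mssh -S "$hosts_file" -C script.sh.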
The mssh file:
#!/usr/bin/env bash
readonly prog_name="$(basename "$0")"
readonly date="$(date +%Y%m%d_%H%M%S)"
# print help
usage() {
    cat <<- EOF
usage: $prog_name options
parallel ssh executions.
OPTIONS:
    -c --cmd CMD                execute command CMD
    -s --host SRV               execute cmd on server SRV
    -C --cmd-file CMD_FILE      execute command contained in CMD_FILE
    -S --hosts-file SRV_FILE    execute cmd on all servers contained in SRV_FILE
    -h --help                   show this help
Examples:
    Run CMD on SERVER1 and SERVER2:
    ./$prog_name -s SERVER1 -s SERVER2 -c "CMD"
EOF
}
# test if an element is in an array
is_element(){
    local search=$1; shift;
    for e in "$@"; do [[ "$e" == "$search" ]] && return 0; done
    return 1
}
# parse arguments: map long options to their short equivalents for getopts
for arg in "$@"; do
    case "$arg" in
        --help) args+=( -h );;
        --host) args+=( -s );;
        --hosts-file) args+=( -S );;
        --cmd) args+=( -c );;
        --cmd-file) args+=( -C );;
        *) args+=("$arg");;
    esac
done
set -- "${args[@]}"
while getopts "hs:S:c:C:" OPTION; do
    case $OPTION in
        h) usage; exit 0;;
        s) servers_array+=("$OPTARG");;
        S) while read -r L; do servers_array+=("$L"); done < <( grep -vE "^ *(#|$)" "$OPTARG");;
        c) cmd="$OPTARG";;
        C) cmd="$(< "$OPTARG")"; file=$OPTARG;;
        *) :;;
    esac
done
if [[ -z ${servers_array[0]} ]] || [[ -z $cmd ]]; then
    usage; exit 1
fi
# clean up created files at exit
trap "rm -f /tmp/pssh*$date" EXIT
[[ -n $file ]] && echo "executing command file : $file" || echo "executing command : $cmd"
# run cmd on each server
for i in "${!servers_array[@]}"; do
    # run ssh in the background, capturing its output in a per-host temp file
    ssh -n "${servers_array[$i]}" "$cmd" > "/tmp/pssh_${i}_${servers_array[$i]}_${date}" 2>&1 &
    pid=$!
    pids_array+=("$pid")
    echo "${servers_array[$i]} - $pid"
done
# for each pid, set state to running
ps_state_array=( $(for i in "${!servers_array[@]}"; do echo "running"; done) )
echo "waiting for results..."
echo
# begin finished verifications
continue=true; attempt=0
while $continue; do
    # foreach ps
    for i in "${!pids_array[@]}"; do
        # if already finished skip
        [[ ${ps_state_array[$i]} == "finished" ]] && continue
        # else check if finished
        ps -o pid "${pids_array[$i]}" > /dev/null 2>&1 && ps_finished=false || ps_finished=true
        if $ps_finished; then
            ps_state_array[$i]="finished"
            echo -e "[ ${servers_array[$i]} @ $(date +%H:%M:%S) ]" | grep '.*' --color=always
            cat "/tmp/pssh_${i}_${servers_array[$i]}_${date}"
            rm -f "/tmp/pssh_${i}_${servers_array[$i]}_${date}"
            echo
        fi
    done
    is_element "running" "${ps_state_array[@]}" || continue=false
    if $continue; then
        # back off: sleep 1s, 2s, ... up to 5s between polling rounds
        (( attempt < 5 )) && attempt=$(( attempt + 1 ))
        sleep $attempt
    fi
done
exit 0
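Applied directly to the loop in the question, the same background-and-collect idea might look like the sketch below. It is not part of mssh: run_check, format_line and check_all are hypothetical helper names, and the sketch assumes key-based login and a sudo setup that needs neither a password nor a terminal (the -t flag from the question is dropped because the ssh jobs run in the background with -n).

```shell
#!/usr/bin/env bash
# Sketch: start one ssh per host in the background, wait for all of them,
# then format the collected lines in the original host order.

# Hypothetical wrapper around the remote call; expects $user to be set.
run_check() {
    ssh -q -n "$user@$1" 'sudo /tmp/script.sh'
}

# Turn pipe-delimited lines on stdin into fixed-width columns,
# prefixed with today's date (same widths as the awk call in the question).
format_line() {
    awk -F '|' -v d="$(date +%D)" '{
        printf("%-10s %-10s %-10s %-7s %-8s %-18s %-25s %-25s %-15s %-15s %-25s %-10s\n",
               d, $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
    }'
}

# Usage: check_all HOST...
check_all() {
    local tmpdir ip i=0 n=0
    tmpdir=$(mktemp -d) || return 1
    for ip in "$@"; do
        # One output file per host, named by position so order is preserved.
        run_check "$ip" > "$tmpdir/$i" 2>/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background job has exited
    while [ "$n" -lt "$i" ]; do
        format_line < "$tmpdir/$n"
        n=$((n + 1))
    done
    rm -rf "$tmpdir"
}
```

With user, IP_LIST and logfile set as in MyScript.sh, calling check_all $IP_LIST >> "$logfile" should append one formatted row per host, and the whole run should take roughly as long as the slowest host rather than the sum of all hosts. A host that fails to respond simply contributes no row in this sketch.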
Regarding linux - how to run a remote script on multiple hosts simultaneously, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60065192/