python - 有时找不到 Linux 可执行文件

标签 python linux gdal

问题: 如何解决批处理脚本期间有时会消失的可执行文件?

背景: 我有一系列脚本,旨在使用 mbgrid 和 gmt surface 等工具解析 xyz 高程数据、划分为网格、合并和插值这些数据。这些可以手动调用,但通常使用指定网格大小、单元格分辨率和目录结构等配置文件作为计划任务运行。

这些脚本是用 python 编写的,使用多处理并通过子进程进行系统调用。

问题: 有时,我会遇到找不到命令的情况 (例如 cat 180E48S_mbgrid_output.log /bin/sh: 1: mbgrid: 未找到)

由于这些脚本是通过带有配置文件的 cron 调用的,因此我希望得到一致的结果。大多数时候(而且似乎总是在我观看时,即使我只是运行脚本 cron 启动),代码成功运行直至完成。但有时不是。

下面是一个日志文件片段,显示未找到 mbgrid(粗体),然后成功运行。然后在另一次成功运行之前未找到 mbgrid 的另一个实例。

grep '/bin/sh: 1: mbgrid: not found' 22E59N_mbgrid_output.log -B 10 -A 10

/bin/sh: 1: mbgrid: not found

Zgrid starting iterations Zgrid iteration 10 convergence test: -0.001000 last:0.000000 Zgrid iteration 20 convergence test: -0.001000 last:-0.001000 Zgrid iteration 30 convergence test: -0.001000 last:-0.001000 Zgrid iteration 40 convergence test: -0.001000 last:-0.001000 Zgrid iteration 50 convergence test: -0.001000 last:-0.001000 Zgrid iteration 60 convergence test: -0.001000 last:-0.001000 Zgrid iteration 70 convergence test: -0.001000 last:-0.001000 Zgrid iteration 80 convergence test: -0.001000 last:-0.001000 Zgrid iteration 90 convergence test: -0.001000 last:-0.001000

... 截断的日志...

mbm_grdplot -I22E59Nweightedgrid.grd -G1 -C -V -L"File 22E59Nweightedgrid.grd - Topography Grid:Topography (m)"

executing mbm_grdplot... mbm_grdplot -I22E59Nweightedgrid_num.grd -G1 -W1/2 -V -L"File 22E59Nweightedgrid_num.grd - Topography Grid:Number of Topography Data Points"

executing mbm_grdplot... mbm_grdplot -I22E59Nweightedgrid_sd.grd -G1 -W1/2 -V -L"File 22E59Nweightedgrid_sd.grd - Topography Grid:Topography Standard Deviation (m)"

Done.

/bin/sh: 1: mbgrid: not found

Zgrid starting iterations Zgrid iteration 10 convergence test: -0.001000 last:0.000000 Zgrid iteration 20 convergence test: -0.001000 last:-0.001000 Zgrid iteration 30 convergence test: -0.001000 last:-0.001000 Zgrid iteration 40 convergence test: -0.001000 last:-0.001000 Zgrid iteration 50 convergence test: -0.001000 last:-0.001000 Zgrid iteration 60 convergence test: -0.001000 last:-0.001000 Zgrid iteration 70 convergence test: -0.001000 last:-0.001000 Zgrid iteration 80 convergence test: -0.001000 last:-0.001000 Zgrid iteration 90 convergence test: -0.001000 last:-0.001000

...截断日志...

采取的步骤: 当我第一次遇到这个时,我将它与一些特别大且密集的网格联系起来。我很快看到我有 10 个进程,每个进程都需要 20% 的内存。减少进程数似乎可以解决这个问题。

但是,我已经看到这种情况发生,即使它看起来不像是内存压力问题,但我可能在系统监控中遗漏了一些东西。

我设置了 sar 监控,每隔几分钟获取一次系统统计数据。我还设置了一个 cron,每 5 分钟获取一些最重要的统计数据。我真的没有看到这些会导致 linux 资源匮乏以至于无法找到可执行文件的任何内容。我会把它们放在底部,因为这已经很长了。

我已经尝试添加一些日志记录诊断来发现问题。

这是检查 mbgrid 是否存在的代码片段,如果找到,则记录正在运行的确切命令:

 if (os.path.exists(r'/usr/local/bin/mbgrid') and os.path.isfile(r'/usr/local/bin/mbgrid')):
        try:
            logging.info("mbgrid -I"+gridcellname+"mbdatalist.mbf -O"+ \
                            weightedname+" -R"+str(wbound)+"/"+str(ebound)+"/"+str(sbound)+"/"+str(nbound)+ \
                            " -A2 -P1 -V -F1 -C1 -N -M -E"+str(cellsize)+"/0.0/degrees! >> " + gridcellname + \
                            "_mbgrid_output.log 2>&1")
            subprocess.call("mbgrid -I"+gridcellname+"mbdatalist.mbf -O"+ \
                            weightedname+" -R"+str(wbound)+"/"+str(ebound)+"/"+str(sbound)+"/"+str(nbound)+ \
                            " -A2 -P1 -V -F1 -C1 -N -M -E"+str(cellsize)+"/0.0/degrees! >> " + gridcellname + \
                            "_mbgrid_output.log 2>&1",cwd=griddir,shell=True)
            logging.info("mbgrid complete")
        except:
            logging.error("Error creating weighted grid for "+gridcellname)
            sys.exit("Error creating weighted grid for "+gridcellname)
    else:
        logging.error("Unable to find the mbgrid executable at /usr/local/bin/mbgrid for " + gridcellname)

来自给定配置文件的主日志: otto@Tendua:~/automated/NZ_TopoBathy_30m$ grep '180E48S_mbgrid' NZ_TopoBathy_30m_grids.log -A 3 -B 3

INFO:09/08/2017 21:35:06:clipping and reformatting grid for 162E50S

INFO:09/08/2017 21:35:06:simplifying 165E34S_ngdc.xyz by converting to grid and back to xyz

INFO:09/08/2017 21:35:06:run mbgrid using the datalist to create weighted grid

INFO:09/08/2017 21:35:06:mbgrid -I180E48Smbdatalist.mbf -O180E48Sweightedgrid -R179.7/181.3/-48.3/-46.7 -A2 -P1 -V -F1 -C1 -N -M -E0.000277777777778/0.0/degrees! >> 180E48S_mbgrid_output.log 2>&1

INFO:09/08/2017 21:35:06:grid converted to netcdf3 format

INFO:09/08/2017 21:35:06:run mbgrid using the datalist to create weighted grid

INFO:09/08/2017 21:35:06:mbgrid -I175E30Smbdatalist.mbf -O175E30Sweightedgrid -R174.7/176.3/-30.3/-28.7 -A2 -P1 -V -F1 -C1 -N -M -E0.000277777777778/0.0/degrees! >> 175E30S_mbgrid_output.log 2>&1

在那个确切网格的日志文件中:

cat 180E48S_mbgrid_output.log

/bin/sh: 1: mbgrid: not found

当我在正确的目录中发出确切的命令时,它会按预期运行。

我还应该寻找什么来识别和解决这个问题?

非常感谢您的宝贵时间。

============================================= ======

前十个进程,每 5 分钟运行一次(它还跟踪数据驱动器上的磁盘空间): otto@Tendua:~$ cat top_ten_process_monitor.sh

#!/bin/sh
date
free -m | awk 'NR==2{printf "Memory Usage: %s/%sMB (%.2f%%)\n", $3,$2,$3*100/$2 }'
df -h | awk '$0 ~ "tendua" {printf "Disk Usage: %1.1f/%1.1fTB (%s)\n", $3,$2,$5}'
top -bn1 | grep load | awk '{printf "CPU Load: %.2f\n", $(NF-2)}'
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head

(日志部分来自上述事件的时间)

Fri Sep  8 21:26:01 MDT 2017
Memory Usage: 63865/64239MB (99.42%)
Disk Usage: 2.7/3.6TB (74%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 1.57
  PID  PPID CMD                         %MEM %CPU
 3973  3886 dbus-daemon --fork --sessio  0.6  0.0
 4085  3886 /usr/lib/unity/unity-panel-  0.6  0.0
 4253  4079 compiz                       0.3  0.4
11350  3886 /opt/google/chrome/chrome    0.3  2.1
18562 11366 /opt/google/chrome/chrome -  0.3  8.8
17402  1109 smbd -F                      0.1  0.0
 3213     1 /opt/google/chrome/chrome -  0.1  0.1
 1587  1450 /usr/bin/X -core :0 -seat s  0.1  0.3
 4019  3886 nautilus --new-window        0.1  0.0
Fri Sep  8 21:31:01 MDT 2017
Memory Usage: 63852/64239MB (99.40%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 1.57
  PID  PPID CMD                         %MEM %CPU
 3973  3886 dbus-daemon --fork --sessio  0.6  0.0
 4085  3886 /usr/lib/unity/unity-panel-  0.6  0.0
 4253  4079 compiz                       0.3  0.4
11350  3886 /opt/google/chrome/chrome    0.3  2.1
18562 11366 /opt/google/chrome/chrome -  0.3  8.8
17402  1109 smbd -F                      0.1  0.0
 3213     1 /opt/google/chrome/chrome -  0.1  0.1
 1587  1450 /usr/bin/X -core :0 -seat s  0.1  0.3
 4019  3886 nautilus --new-window        0.1  0.0
Fri Sep  8 21:36:01 MDT 2017
Memory Usage: 63796/64239MB (99.31%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 4.59
  PID  PPID CMD                         %MEM %CPU
 3973  3886 dbus-daemon --fork --sessio  0.6  0.0
 4085  3886 /usr/lib/unity/unity-panel-  0.6  0.0
 4253  4079 compiz                       0.3  0.4
11350  3886 /opt/google/chrome/chrome    0.3  2.1
18562 11366 /opt/google/chrome/chrome -  0.3  8.8
17402  1109 smbd -F                      0.1  0.0
 3213     1 /opt/google/chrome/chrome -  0.1  0.1
 1587  1450 /usr/bin/X -core :0 -seat s  0.1  0.3
 1873  1871 gdal_translate -ot Float32   0.1  0.1
Fri Sep  8 21:41:01 MDT 2017
Memory Usage: 63878/64239MB (99.44%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 4.32
  PID  PPID CMD                         %MEM %CPU
 3973  3886 dbus-daemon --fork --sessio  0.6  0.0
 4085  3886 /usr/lib/unity/unity-panel-  0.6  0.0
 4253  4079 compiz                       0.3  0.4
11350  3886 /opt/google/chrome/chrome    0.3  2.1
18562 11366 /opt/google/chrome/chrome -  0.3  8.8
 5636  5634 /usr/bin/python /usr/bin/gd  0.1 25.6
17402  1109 smbd -F                      0.1  0.0
 3213     1 /opt/google/chrome/chrome -  0.1  0.1
 1587  1450 /usr/bin/X -core :0 -seat s  0.1  0.3

系统统计如下:

sar -f/var/log/sysstat/sa08|head -4 && sar -f/var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10

Linux 3.19.0-80-generic (Tendua)        09/08/2017      _x86_64_        (12 CPU)

12:00:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:05:03 AM     all     51.97      0.00      1.32      0.89      0.00     45.82
07:55:01 PM     all      9.44      0.00      0.59      3.51      0.00     86.45
08:05:01 PM     all      1.50      0.00      1.12      9.39      0.00     87.99
08:15:01 PM     all      1.48      0.00      1.12      9.16      0.00     88.24
08:25:01 PM     all      1.46      0.00      1.13      9.29      0.00     88.12
08:35:01 PM     all      1.50      0.00      1.24      8.65      0.00     88.61
08:45:01 PM     all      1.52      0.00      1.57      7.87      0.00     89.04
08:55:01 PM     all      1.55      0.00      1.64      7.96      0.00     88.85
09:05:01 PM     all      1.53      0.00      1.70      8.08      0.00     88.69
09:15:01 PM     all      1.54      0.00      1.74      8.04      0.00     88.68
09:25:01 PM     all      1.50      0.00      1.69      8.16      0.00     88.65
09:35:01 PM     all      1.55      0.00      1.69      7.98      0.00     88.78
09:45:01 PM     all      2.17      0.00      1.96     11.67      0.00     84.20
09:55:01 PM     all      1.71      0.00      1.87     18.06      0.00     78.36
10:05:01 PM     all      1.70      0.00      2.13     17.57      0.00     78.60
10:15:01 PM     all      1.73      0.00      1.93     18.64      0.00     77.70
10:25:01 PM     all      1.75      0.00      1.96     18.36      0.00     77.93
10:35:01 PM     all      3.91      0.00      1.80     15.40      0.00     78.89
10:45:01 PM     all      1.66      0.00      1.90     19.23      0.00     77.20
10:55:01 PM     all      1.68      0.00      1.82     19.05      0.00     77.45
11:05:01 PM     all      1.69      0.00      1.77     19.31      0.00     77.23
11:15:01 PM     all      7.69      0.00      1.03      8.68      0.00     82.60

sar -r -f/var/log/sysstat/sa08|head -4 && sar -r -f/var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10

Linux 3.19.0-80-generic (Tendua)        09/08/2017      _x86_64_        (12 CPU)

12:00:02 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
12:05:03 AM   3859704  61921864     94.13     50184  52583736  17699508     25.29  29962700  29837068    611932
07:55:01 PM    392740  65388828     99.40    407920  60884772  10851876     15.51   2703424  60625128   2853088
08:05:01 PM    402104  65379464     99.39    410904  61032736  10845076     15.50   3520252  59959668   1336572
08:15:01 PM    371832  65409736     99.43    413876  61069912  10859576     15.52   3403804  60108248   1220840
08:25:01 PM    397468  65384100     99.40    416592  61053400  10820532     15.46   1277516  62219572   3004488
08:35:01 PM    426124  65355444     99.35    419376  61037072  10808244     15.45   1069828  62416140    306880
08:45:01 PM    453688  65327880     99.31    423508  60966356  10864072     15.53   1070752  62351248    929620
08:55:01 PM    367620  65413948     99.44    427112  61029896  10804544     15.44   1078392  62411352    197616
09:05:01 PM    377924  65403644     99.43    430608  60967232  10794432     15.43   1082116  62345104    234916
09:15:01 PM    415260  65366308     99.37    433856  60842704  10822332     15.47   1099380  62217480   1687536
09:25:01 PM    452492  65329076     99.31    436928  60832668  10842884     15.50   1097624  62204328   1594256
09:35:01 PM    445220  65336348     99.32    460300  60819540  10855512     15.51   1137008  62187660       528
09:45:01 PM    470044  65311524     99.29    463480  60443220  10960588     15.66  24385076  38668700   1765864
09:55:01 PM    472608  65308960     99.28    464708  60422328  11067312     15.82  28572380  34443408   1358136
10:05:01 PM    433176  65348392     99.34    465852  60438212  10957232     15.66  29127204  33924120   4382320
10:15:01 PM    450192  65331376     99.32    467016  60405816  11024924     15.76  30376748  32652588   2063424
10:25:01 PM    376448  65405120     99.43    468388  60579636  10831104     15.48  31660420  31440712       704
10:35:01 PM    560628  65220940     99.15    469856  60329228  10979960     15.69  38914816  23997528   4796836
10:45:01 PM    521660  65259908     99.21    471004  60329408  11008184     15.73  41755472  21194232   3680356
10:55:01 PM    556596  65224972     99.15    472160  60329584  10921484     15.61  44714800  18191336   2351468
11:05:01 PM    543664  65237904     99.17    473288  60329768  11025832     15.76  46713556  16209272   3482536
11:15:01 PM    408040  65373528     99.38    475420  60500764  10935492     15.63  41843776  21182704    168384

sar -b -f/var/log/sysstat/sa08|head -4 && sar -b -f/var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10

Linux 3.19.0-80-generic (Tendua)        09/08/2017      _x86_64_        (12 CPU)

12:00:02 AM       tps      rtps      wtps   bread/s   bwrtn/s
12:05:03 AM    135.06    125.31      9.75  32088.68  15436.97
07:55:01 PM    214.60    187.68     26.92  47229.75  31671.99
08:05:01 PM    407.57    334.90     72.67  85024.47  91730.47
08:15:01 PM    411.46    341.30     70.16  86763.79  87368.85
08:25:01 PM    420.48    353.86     66.62  90036.21  84149.34
08:35:01 PM    480.24    394.76     85.48 100597.74 109374.29
08:45:01 PM    634.61    530.99    103.62 134987.46 132954.73
08:55:01 PM    683.04    567.68    115.36 144695.57 147487.06
09:05:01 PM    712.35    594.86    117.50 151571.92 151528.64
09:15:01 PM    732.52    613.41    119.10 156291.37 151223.34
09:25:01 PM    716.90    598.66    118.24 152659.05 152950.26
09:35:01 PM    719.66    594.02    125.65 147751.96 153455.47
09:45:01 PM    449.31    379.28     70.03  96426.44 106934.35
09:55:01 PM    249.01     48.08    200.93  12311.36 304792.29
10:05:01 PM    225.01     32.41    192.60   8286.13 310338.90
10:15:01 PM    220.01     15.86    204.15   4051.83 312400.63
10:25:01 PM    210.24      8.80    201.44   1863.78 308831.78
10:35:01 PM    199.95     23.89    176.05   5375.87 249054.33
10:45:01 PM    204.32      0.00    204.32      0.00 322379.98
10:55:01 PM    201.33      0.00    201.33      0.00 314086.77
11:05:01 PM    206.65      0.00    206.64      0.04 317881.94
11:15:01 PM    168.05     32.33    135.72   2041.40 151844.90

sar -w -f/var/log/sysstat/sa08|head -4 && sar -w -f/var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10

Linux 3.19.0-80-generic (Tendua)        09/08/2017      _x86_64_        (12 CPU)

12:00:02 AM    proc/s   cswch/s
12:05:03 AM      1.45    800.71
07:55:01 PM      0.82   1116.96
08:05:01 PM      0.86   1584.54
08:15:01 PM      0.82   1530.11
08:25:01 PM      0.70   1543.24
08:35:01 PM      0.71   1630.97
08:45:01 PM      0.72   1942.89
08:55:01 PM      0.78   1988.69
09:05:01 PM      0.77   2030.87
09:15:01 PM      1.03   2061.84
09:25:01 PM      0.86   2027.47
09:35:01 PM      0.82   2062.62
09:45:01 PM     19.65   1789.45
09:55:01 PM      0.73    945.56
10:05:01 PM      0.53    901.46
10:15:01 PM      0.68    873.46
10:25:01 PM      0.76    793.85
10:35:01 PM      7.85    950.96
10:45:01 PM      0.74    869.87
10:55:01 PM      0.79    913.40
11:05:01 PM      0.78    894.85
11:15:01 PM      0.80   1225.52

最佳答案

Since these scripts are called via cron with a configuration file, I would expect consistent results. Most of the time (and it seems always when I'm watching, even if I just run the script cron initiates), the code runs successfully to completion. But sometimes not.

看起来这是环境问题。尝试通过手动运行和 cron 作业记录 os.environ。

要查找的变量是 PATH

PATH is an environmental variable in Linux and other Unix-like operating systems that tells the shell which directories to search for executable files (i.e., ready-to-run programs) in response to commands issued by a user.

如果我的猜想是正确的——你可以通过手动设置来解决这个问题 os.environ['PATH'] = "path to binaries dir" 然后将 os.environ 添加到子进程调用的 cwd 参数中

关于python - 有时找不到 Linux 可执行文件,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/46150785/

相关文章:

python - pygdal 版本配置错误

python - pypyodbc 无法使用 SQL Server Localdb 引用列名

python - 用于计数查询的 Google Cloud Datastore 索引

PHP - 限制套接字连接时间

linux - 找到 XZ 流的结尾

python-3.x - 为 Python 3.7.3 安装 GDAL 的问题

python - 无法使用 python 从我的计算机访问 Azure SQL DB

python - Celery RabbitMQ CloudAMQP 任务队列未在 Heroku 上使用

linux - 海合会 : error trying to exec 'cc1' when cross-compiling

python - 如何通过 Flask 应用程序在 Python 中使用 GDAL 打开远程文件