我有一个具有8GB RAM的8核CPU,并且我正在创建一个批处理文件以自动执行7-zip CLI,用尽大多数参数和变量来压缩同一组文件,最终目的是找到最强大的参数和参数组合。导致文件大小最小的变量。
从本质上讲,这非常耗时,尤其是当要处理的文件集为千兆字节时。我不仅需要一种自动化方法,而且需要加快整个过程的速度。
7-zip使用不同的压缩算法,有些仅是单线程的,有些是多线程的,有些不需要大量的内存,有些则需要大量的内存,甚至可以超过8GB的障碍。我已经成功创建了一个自动批处理,该批处理按顺序工作,不包括需要8GB以上内存的组合。
我将不同的压缩算法分为几批进行了简化,以简化整个过程。例如,作为7z存档的PPMd中的压缩使用1线程和最大1024MB。这是我当前的批次:
@echo off
echo mem=1m 2m 3m 4m 6m 8m 12m 16m 24m 32m 48m 64m 96m 128m 192m 256m 384m 512m 768m 1024m
echo o=2 3 4 5 6 7 8 10 12 14 16 20 24 28 32
echo s=off 1m 2m 4m 8m 16m 32m 64m 128m 256m 512m 1g 2g 4g 8g 16g 32g 64g on
echo x=1 3 5 7 9
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
exit
x
,s
,o
和mem
是参数,它们后面分别是7z.exe将使用的变量。在这种情况下,x
和s
无关紧要,它们表示存档的压缩强度和实体块大小。该批处理将正常工作,但仅限于一次仅运行1个7z.exe实例,现在我正在寻找一种方法来使其并行运行更多7z.exe实例,但不超过8GB RAM或8个线程立即进行,以先到者为准,然后再继续执行序列中的下一个。
我该如何改善呢?我有一些想法,但我不知道如何使它们分批工作。我在考虑其他两个变量,它们不会与7z进程交互,但会控制下一个7z实例何时启动。一个变量将跟踪当前正在使用多少线程,而另一个变量将跟踪正在使用多少内存。那行得通吗?
编辑:
抱歉,我需要添加详细信息,我是这种发布风格的新手。按照这个答案-https://stackoverflow.com/a/19481253/2896127-我提到创建了8个批次,而7z.PPMd批次就是其中之一。也许列出所有批次以及7z如何处理参数将对整个问题有更好的了解。我将从简单的开始:
我用部分利用的线程的意思是,虽然我分配了每个7.exe实例要使用的8个线程,但是该算法可以以一种不受控制的方式随机地进行可变的CPU使用,这是不可预知的,但是在这里设置了限制-不超过8个线程。对于8个完全利用的线程,这意味着在我的8个核心CPU上,每个实例都在使用100%的CPU。
最复杂的部分-7z.LZMA,7z.LZMA2,zip.LZMA-将需要详细说明,但我现在时间紧迫。每当我有更多空闲时间时,我都会回去编辑LZMA部分。
再次感谢。
编辑:在LZMA部分中添加。
我正在尝试了解其中包含nul的命令的行为。我不太了解那部分发生了什么,这些符号^> ^&1“”是要说什么。
2>nul del %lock%!nextProc!
%= Redirect the lock handle to the lock file. The CMD process will =%
%= maintain an exclusive lock on the lock file until the process ends. =%
start /b "" cmd /c %lockHandle%^>"%lock%!nextProc!" 2^>^&1 !cpu%%N! !cmd!
)
set "launch="
然后,在:wait代码处:
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
编辑2(29-10-2013):添加当前情况。
经过反复试验研究,并通过逐步记录正在发生的事情,我能够理解上述行为。我用start命令简化了这一行:
start /b /low cmd /c !cmd!>"%lock%!nextProc!"
虽然有效,但我仍然不了解
1^>"filename" 2^>^&1 'command'
的含义。我知道这与在文件名中写入文本有关,否则会显示给我。在这种情况下,它将显示所有7z.exe文本,但写入文件中。在7z.exe实例完成工作之前,不会在文件中写入任何内容,但是该文件已经存在,但同时不存在。当7z.exe实际完成时,文件已完成,这一次它存在于脚本的下一部分。现在,我可以理解建议脚本的处理行为,并用自己的东西对其进行补充-我正在尝试将所有批次实现为“一个批次完成所有任务”脚本。在简化版本中,是这样的:
echo 8 threads - maxproc=1
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=BZip2:d=%%d:mt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.zip .\teste.original\* -mx=%%x -mm=BZip2:d=%%d -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%w IN (257 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate64.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate64:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%w IN (258 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (256m 128m 64m 32m 16m 8m 4m 2m 1m) DO for %%w IN (16 15 14 13 12 11 10 9 8 7 6 5 4 3 2) DO 7z.exe a teste.resultado\%%xx.ppmd.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=PPMd:mem=%%d:o=%%w -mmt=%%t
echo 4 threads - maxproc=2
for %%x IN (9) DO for %%t IN (4) DO for %%d IN (256m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
echo 2 threads - maxproc=4
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
echo 1 threads - maxproc=8
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
简而言之,我想以最有效的方式处理所有这些问题。通过确定一次可以运行多少个进程来做到这一点,但这是每个进程所需的内存,因此这些进程所需的所有内存之和不会超过8192 MB。我这部分工作了。
@echo off
setlocal enableDelayedExpansion
set "maxMem=8192"
set "maxThreads=8"
:cycle1
set "cycleCount=4"
set "cycleThreads=1"
set "maxProc="
set /a "maxProc=maxThreads/cycleThreads"
set "cycleFor1=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
set "cycleFor2=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
set "cycleFor3=for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO ("
set "cycleFor4=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO ("
set "cycleCmd1=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
set "cycleCmd2=7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t"
set "cycleCmd3=7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"
set "cycleCmd4=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t"
set "tempMem1=5407"
set "tempMem2=5407"
set "tempMem3=1055"
set "tempMem4=5378"
rem set "tempMem1=5407"
rem set "tempMem2=5407"
rem set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
rem set "tempMem4=5378"
set "memSum=0"
if not defined memRem set "memRem=!maxMem!"
for /l %%N in (1 1 %cycleCount%) DO (set "tempProc%%N=")
for /l %%N in (1 1 %cycleCount%) DO (
set memRem
set /a "tempProc%%N=%memRem%/tempMem%%N"
set /a "memSum+=tempMem%%N"
set /a "memRem-=tempMem%%N"
set /a "maxProc=!tempProc%%N!"
call :executeCycle
set /a "memRem+=tempMem%%N"
set /a "memSum-=tempMem%%N"
set /a "maxProc-=!tempProc%%!
)
goto :fim
:executeCycle
set "lock=lock_%random%_"
set /a "startCount=0, endCount=0"
for /l %%N in (1 1 %maxProc%) DO set "endProc%%N="
set launch=1
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO (
set "cmd=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
if !startCount! lss %maxProc% (
set /a "startCount+=1, nextProc=startCount"
) else (
call :wait
)
set cmd!nextProc!=!cmd!
echo !time! - proc!nextProc!: starting !cmd!
2>nul del %lock%!nextProc!
start /b /low cmd /c !cmd!>"%lock%!nextProc!"
)
set "launch="
:wait
for /l %%N in (1 1 %startCount%) do (
if not defined endProc%%N if exist "%lock%%%N" (
echo !time! - proc%%N: finished !cmd%%N!
if defined launch (
set nextProc=%%N
exit /b
)
set /a "endCount+=1, endProc%%N=1"
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
echo ===
echo Thats all folks!
exit /b
:fim
pause
我在
cycleFor1
部分中的cycleCmd1
和:cycle1
遇到了麻烦-他们应该替换for
行和cmd
中的第一个:executeCycle
变量,以使其按预期工作。我怎么做?我遇到的另一个问题是关于
tempMem3
。我已经记录了命令cycleCmd3
运行时所需的所有内存。它是字典相关的。 tempMem3和cycleCmd3的相关关系如下:for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO
set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
因此1024m将使用1055,768m将使用799,依此类推,直到使用32的1m。我不知道如何将其转换为脚本。
任何帮助表示赞赏。
最佳答案
我已经在Parallel execution of shell processes上发布了一个强大的批处理解决方案,该解决方案限制了并行进程的数量。该脚本使用脚本中嵌入的命令列表。点击链接查看其工作原理。
我修改了该脚本以根据您的问题使用FOR循环生成命令。我还将限制设置为同时进行8个进程。
您的最大内存为1g,而且您的进程最多不能超过8个,因此我看不到您怎么可能超过8g。如果增加每个进程的最大内存,则必须担心总内存。您将必须添加其他逻辑来跟踪正在使用的内存量以及可用的CPU ID。请注意,批号限制为〜2g,因此我建议计算以兆字节为单位的内存。
默认情况下,脚本隐藏命令的输出。如果要查看输出,请使用/O
选项运行它。
@echo off
setlocal enableDelayedExpansion
:: Display the output of each process if the /O option is used
:: else ignore the output of each process
if /i "%~1" equ "/O" (
set "lockHandle=1"
set "showOutput=1"
) else (
set "lockHandle=1^>nul 9"
set "showOutput="
)
:: Define the maximum number of parallel processes to run.
:: Each process number can optionally be assigned to a particular server
:: and/or cpu via psexec specs (untested).
set "maxProc=8"
:: Optional - Define CPU targets in terms of PSEXEC specs
:: (everything but the command)
::
:: If a cpu is not defined for a proc, then it will be run on the local machine.
:: I haven't tested this feature, but it seems like it should work.
::
:: set cpu1=psexec \\server1 ...
:: set cpu2=psexec \\server1 ...
:: set cpu3=psexec \\server2 ...
:: etc.
:: For this demo force all cpu specs to undefined (local machine)
for /l %%N in (1 1 %maxProc%) do set "cpu%%N="
:: Get a unique base lock name for this particular instantiation.
:: Incorporate a timestamp from WMIC if possible, but don't fail if
:: WMIC not available. Also incorporate a random number.
set "lock="
for /f "skip=1 delims=-+ " %%T in ('2^>nul wmic os get localdatetime') do (
set "lock=%%T"
goto :break
)
:break
set "lock=%temp%\lock%lock%_%random%_"
:: Initialize the counters
set /a "startCount=0, endCount=0"
:: Clear any existing end flags
for /l %%N in (1 1 %maxProc%) do set "endProc%%N="
:: Launch the commands in a loop
set launch=1
echo mem=1m 2m 3m 4m 6m 8m 12m 16m 24m 32m 48m 64m 96m 128m 192m 256m 384m 512m 768m 1024m
echo o=2 3 4 5 6 7 8 10 12 14 16 20 24 28 32
echo s=off 1m 2m 4m 8m 16m 32m 64m 128m 256m 512m 1g 2g 4g 8g 16g 32g 64g on
echo x=1 3 5 7 9
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO (
set "cmd=7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"
if !startCount! lss %maxProc% (
set /a "startCount+=1, nextProc=startCount"
) else (
call :wait
)
set cmd!nextProc!=!cmd!
if defined showOutput echo -------------------------------------------------------------------------------
echo !time! - proc!nextProc!: starting !cmd!
2>nul del %lock%!nextProc!
%= Redirect the lock handle to the lock file. The CMD process will =%
%= maintain an exclusive lock on the lock file until the process ends. =%
start /b "" cmd /c %lockHandle%^>"%lock%!nextProc!" 2^>^&1 !cpu%%N! !cmd!
)
set "launch="
:wait
:: Wait for procs to finish in a loop
:: If still launching then return as soon as a proc ends
:: else wait for all procs to finish
:: redirect stderr to null to suppress any error message if redirection
:: within the loop fails.
for /l %%N in (1 1 %startCount%) do (
%= Redirect an unused file handle to the lock file. If the process is =%
%= still running then redirection will fail and the IF body will not run =%
if not defined endProc%%N if exist "%lock%%%N" (
%= Made it inside the IF body so the process must have finished =%
if defined showOutput echo ===============================================================================
echo !time! - proc%%N: finished !cmd%%N!
if defined showOutput type "%lock%%%N"
if defined launch (
set nextProc=%%N
exit /b
)
set /a "endCount+=1, endProc%%N=1"
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
if defined showOutput echo ===============================================================================
echo Thats all folks!
关于multithreading - 批处理和启动命令以进行并行和顺序工作,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/19458790/