linux - Bulk download from a URL

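I want to download thousands of files from a URL. Each line of "FileName.txt" contains the name of a file to download. I am using a Perl script to read the file names from "FileName.txt" and download each one after a random delay. I run the script as "./program.pl FileName.txt".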

FileName.txt

A
B
C
B

program.pl

#!/usr/bin/perl
$file1=$ARGV[0];
open(FP1, $file1);
while($s1=<FP1>)
{   chomp ($s1);
    $range = 5;
    $minimum = 3;

    $random_number = int(rand($range)) + $minimum;
    `wget --wait="$random_number" "http://URL=$s1"`;
}

I get output for the first few files but not for the rest. For the remaining files, $ emacs fileD.txt gives

[13] 29699

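Could you tell me why I am getting "[13] 29699", and what is the best way to download files at random time intervals? Sorry, the file handle in the while loop is not displaying properly. Thanks.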

Best answer

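You do not show where $id comes from, but some of the URLs probably contain &, which makes the shell put the process in the background. The "[13] 29699" output has the form a shell prints (job number, then process ID) when it backgrounds a command. You should either use single quotes for wget's argument or use the list form of system.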
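As a minimal sketch of the list-form suggestion, the snippet below runs a command without going through a shell, so an & in an argument is passed through literally. The URL is hypothetical, capture_no_shell is an illustrative helper name, and echo stands in for wget so the example runs without network access.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Run a command without a shell and return whatever it wrote to stdout.
# The list form of a piped open, like the list form of system, hands each
# argument to the program as-is, so shell metacharacters have no effect.
sub capture_no_shell {
    my @cmd = @_;
    open(my $pipe, '-|', @cmd)
        or die "Cannot run '$cmd[0]': $!";
    my $output = do { local $/; <$pipe> };   # slurp all output
    close $pipe;
    return defined $output ? $output : '';
}

# Hypothetical URL containing '&' (an assumption for illustration only).
my $url = 'http://example.com/get?name=A&type=txt';

# 'echo' stands in for wget here; the URL arrives as a single argument.
print capture_no_shell('echo', $url);
```

In the string form (backticks, or system with a single string), the same URL would be handed to the shell, which treats the & as a command separator and runs the first part in the background.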

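In addition, wget's wait option is only relevant when wget itself is traversing the links from a given URL. In your case, the Perl script needs to sleep between the wget invocations for each URL: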

#!/usr/bin/env perl

use strict;
use warnings;

use constant WAIT_MINIMUM => 3;
use constant WAIT_RANGE => 5;

my ($url_list_file) = @ARGV;
defined($url_list_file)
    or die "Need URL list\n";

open my $fh, '<', $url_list_file
    or die "Cannot open '$url_list_file': $!";

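while (my $url = <$fh>) {
    $url =~ s/\R\z//;    # strip the trailing line ending
    # Double quotes are required here so that $url is interpolated
    my @cmd = (wget => "http://$url");

    print "@cmd\n";
    # List form of system: no shell is involved, so & is passed literally
    my $error = system @cmd;

    if ($error) {
        warn "'@cmd' failed: $?";
    }
    # Pause between downloads for a random interval
    sleep WAIT_MINIMUM + rand(WAIT_RANGE);
}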
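The delay calculation used in the loop can be looked at in isolation. The sketch below uses a hypothetical helper named random_delay that returns a value in the interval [minimum, minimum + range). Perl's built-in sleep works in whole seconds, so a fractional delay loses its fractional part; the core Time::HiRes module provides a sleep that accepts fractional seconds, and it is shown here only as an optional alternative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes ();   # core module; provides a sleep that accepts fractions

# Hypothetical helper: a random delay in seconds, at least $minimum and
# strictly less than $minimum + $range ($range must be greater than 0).
sub random_delay {
    my ($minimum, $range) = @_;
    return $minimum + rand($range);
}

my $delay = random_delay(3, 5);
printf "Would wait %.2f seconds before the next download\n", $delay;

# To actually pause for the fractional amount, one option is:
# Time::HiRes::sleep($delay);
```

With the constants from the answer (minimum 3, range 5), the built-in sleep pauses for a whole number of seconds between 3 and 7.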

Regarding linux - bulk download from a URL, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23568968/
