linux - How to extract text from multiple URLs using lynx/w3m in Linux

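I have a list of 50-odd URLs in a text file (one URL per line). For each URL, I want to extract the text of the website and save it. This sounds like a job for a shell script in Linux.

At the moment I am putting the pieces together:

With sed -n 1p listofurls.txt I can read the first line of my URL file, listofurls.txt.
With lynx -dump www.firsturl... I can pipe the output through various commands to tidy and clean it up. Done, that works.

Before automating, I am struggling to pipe the URL into lynx. For example,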

sed -n 1p listofurls.txt | lynx -dump -stdin

does not work.

How can I do this for one URL and, more importantly, for every URL in listofurls.txt?

Best answer

You can write a script like this:

vi script.sh

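#content of script.sh#
# Read the URL list (the first argument) one line at a time
while IFS= read -r line
do
    name=$line
    # Fetch the page; note that wget saves the raw HTML, not the rendered text
    wget "$name"
    echo "Downloaded content from - $name"
done < "$1"
#end#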

chmod +x script.sh

./script.sh listofurls.txt
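
Since the question asks for the page text rather than the raw HTML that wget saves, the same loop can call a text browser instead. The following is only a sketch under some assumptions: the function name dump_urls, the DUMPER variable, and the page_N.txt file naming are all made up for illustration. It relies on both lynx and w3m accepting -dump followed by a URL, and it passes the URL as an argument because, as far as I know, lynx -stdin expects the document itself on standard input rather than a URL.

```shell
#!/bin/sh
# Sketch: save the rendered text of every URL listed in a file.
# dump_urls, DUMPER and the page_N.txt naming are hypothetical choices.

# Usage: dump_urls LIST_FILE OUTPUT_DIR
# The body runs in a subshell ( ... ) so its variables stay local.
dump_urls() (
    list=$1
    outdir=$2
    dumper=${DUMPER:-lynx}   # set DUMPER=w3m to use w3m instead
    n=0

    mkdir -p "$outdir" || exit 1

    # "|| [ -n "$url" ]" also handles a last line without a trailing newline
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines in the list
        [ -z "$url" ] && continue
        n=$((n + 1))
        out="$outdir/page_$n.txt"
        # </dev/null keeps the browser from reading the rest of the list
        if "$dumper" -dump "$url" < /dev/null > "$out"; then
            echo "Saved $url -> $out"
        else
            echo "Failed: $url" >&2
        fi
    done < "$list"
)
```

After sourcing the file, dump_urls listofurls.txt pages would write pages/page_1.txt, pages/page_2.txt, and so on. For a single URL, command substitution avoids the pipe altogether: lynx -dump "$(sed -n 1p listofurls.txt)".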

Regarding linux - How to extract text from multiple URLs using lynx/w3m in Linux, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24466404/
