java - Error when trying to crawl with Nutch: java.net.UnknownHostException on my own local hostname

Tags: java hadoop solr nutch

I am trying to crawl with Nutch 1.9 on CentOS 6.6.

I am following this guide to start my first crawl:

http://wiki.apache.org/nutch/NutchTutorial

However, I get the following exception at startup:

Injector: Converting injected urls to crawl db entries.
Injector: java.net.UnknownHostException: Sparky.LITK: Sparky.LITK: Name or service not known
    at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:960)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1353)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:324)
    at org.apache.nutch.crawl.Injector.run(Injector.java:380)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Injector.main(Injector.java:370)
Caused by: java.net.UnknownHostException: Sparky.LITK: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
    ... 12 more



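It looks like it is trying to crawl the machine's own hostname (Sparky.LITK), which is not what I want. I set up the seed.txt list as the tutorial describes, but it gets stuck here.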

Best answer

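The fix is simple: add the machine's hostname to the /etc/hosts file so that it points to the loopback address (127.0.0.1). Nutch is not crawling the hostname; as the stack trace shows, Hadoop's JobClient calls InetAddress.getLocalHost() when submitting the job, and that lookup fails because Sparky.LITK cannot be resolved.

I changed the hosts entries as follows: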

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 Sparky.LITK
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 Sparky.LITK

And it worked!
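
To confirm the fix independently of Nutch, a small Java program can make the same InetAddress.getLocalHost() call that fails in the stack trace. This is only a minimal sketch: the class name HostnameCheck and the helper resolves are made up for illustration and are not part of Nutch or Hadoop.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic class, not part of Nutch or Hadoop.
class HostnameCheck {

    // Returns true if the given host name (or IP literal) can be resolved.
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            // Same JDK call that appears at the top of the stack trace.
            InetAddress local = InetAddress.getLocalHost();
            System.out.println("Local hostname resolves: "
                    + local.getHostName() + " -> " + local.getHostAddress());
        } catch (UnknownHostException e) {
            // This branch is what Nutch ran into before /etc/hosts was edited.
            System.out.println("Local hostname does not resolve: " + e.getMessage());
        }
    }
}
```

If the program prints the "does not resolve" message, the hostname is still missing from /etc/hosts (or from DNS), and the Nutch injector would be expected to fail in the same way.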

Source: Stack Overflow, https://stackoverflow.com/questions/28228946/
