java - Exception in thread "main" java.lang.NullPointerException when running Mahout

Tags: java hadoop mahout

Hello, I am new to Apache Mahout. I get an error when running the "classify-20newsgroups.sh" example, which automatically downloads its dataset from the internet.

Error trace:

hduser@raj-Lenovo-G550:/usr/local/mahout/examples$ bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-hduser
Enter your choice : 3
ok. You chose 3 and we'll use sgd
creating work directory at /tmp/mahout-work-hduser
Downloading 20news-bydate
bin/classify-20newsgroups.sh: line 68: curl: command not found
Extracting...
tar (child): ../20news-bydate.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
Training on /tmp/mahout-work-hduser/20news-bydate/20news-bydate-train/
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/local/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=/usr/local/hadoop-1.2.1/conf
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.9-job.jar
14/08/06 14:07:53 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainNewsGroups.props found on classpath, will use command-line arguments only
Exception in thread "main" java.lang.NullPointerException
    at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:106)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

Can anyone please help?

Edit: I tried sudo apt-get install curl, but then got:

hduser@raj-Lenovo-G550:/usr/local/mahout/examples$ bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-hduser
Enter your choice : 3
ok. You chose 3 and we'll use sgd
creating work directory at /tmp/mahout-work-hduser
Training on /tmp/mahout-work-hduser/20news-bydate/20news-bydate-train/
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/local/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=/usr/local/hadoop-1.2.1/conf/
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.9-job.jar
14/08/06 17:06:41 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainNewsGroups.props found on classpath, will use command-line arguments only
Exception in thread "main" java.lang.NullPointerException
    at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:106)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

Best answer

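The problem here is that the 20newsgroups corpus could not be downloaded, because the curl command was not found on your system. See this line of the output: bin/classify-20newsgroups.sh: line 68: curl: command not found. With no archive downloaded, the tar extraction fails as well, so the training directory /tmp/mahout-work-hduser/20news-bydate/20news-bydate-train/ is never populated when TrainNewsGroups starts.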
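As for why a missing dataset surfaces as a NullPointerException rather than a clearer message, here is a minimal sketch. It assumes that TrainNewsGroups lists the contents of the training directory at the failing line; I have not checked line 106 of the 0.9 source, so treat that as a guess. What is certain is that java.io.File.listFiles() returns null when the path is not an existing directory, and that iterating over a null array throws a NullPointerException. The class name MissingDirDemo and the method countEntries are made up for illustration and are not part of Mahout.

```java
import java.io.File;

public class MissingDirDemo {

    // Hypothetical helper (not part of Mahout): returns the number of entries
    // in dir, or -1 if dir cannot be listed.
    public static int countEntries(File dir) {
        // listFiles() returns null when the path is not a directory (which
        // includes the case where it does not exist) or an I/O error occurs.
        File[] children = dir.listFiles();
        if (children == null) {
            // Without this check, "for (File f : children)" would throw
            // java.lang.NullPointerException.
            return -1;
        }
        return children.length;
    }

    public static void main(String[] args) {
        // Default path copied from the log in the question; pass another
        // path as the first argument to check a different directory.
        File base = new File(args.length > 0
                ? args[0]
                : "/tmp/mahout-work-hduser/20news-bydate/20news-bydate-train/");
        int n = countEntries(base);
        if (n < 0) {
            System.out.println("Training directory missing or unreadable: " + base);
        } else {
            System.out.println(n + " entries under " + base);
        }
    }
}
```

Note also that the second log in the question shows neither "Downloading" nor "Extracting", so the script apparently skipped both steps, perhaps because the work area left over from the failed run was still in place. Choosing option 4 (clean) from the menu and then running option 3 again may help now that curl is installed. That is an inference from the logs, not something I have verified.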

Source: Stack Overflow, https://stackoverflow.com/questions/25155795/
