macos - Setting up Hadoop in pseudo-distributed mode on a Mac

Tags: macos hadoop installation

I am currently trying to set up Hadoop in pseudo-distributed mode on a Mac running Mountain Lion. I downloaded Hadoop 1.0.4 and followed the steps below, as described in Chuck Lam's "Hadoop in Action":

1) Generate an SSH key pair: I ran ssh-keygen -t rsa to generate a pair, without setting a passphrase. The public key was placed in /Users/me/.ssh/id_rsa.pub. I then copied this file to ~/.ssh/authorized_keys. This lets me SSH from my own machine to itself without entering a password.
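The step above can be sketched as follows (a sketch, not from the book: -P "" supplies the empty passphrase non-interactively, and appending with >> rather than copying preserves any keys already in authorized_keys):

```shell
# Generate an RSA key pair with an empty passphrase; the public key lands
# in ~/.ssh/id_rsa.pub by default.
ssh-keygen -t rsa -P ""

# Authorize the key for logins to this machine.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Verify that passwordless SSH to localhost now works:
ssh localhost 'echo ok'
```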

2) Set JAVA_HOME: I edited conf/hadoop-env.sh to include export JAVA_HOME=/Library/Java/Home, which I believe is my Java installation directory. (For reference, this directory contains bin, bundle, lib, and man.)
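A more robust alternative (my assumption, not something the book prescribes) is to let OS X resolve the JDK path itself; the /usr/libexec/java_home utility has shipped with OS X since 10.5:

```shell
# conf/hadoop-env.sh
# /usr/libexec/java_home prints the path of the current JDK, so this line
# keeps working even if the Java installation is moved or upgraded.
export JAVA_HOME=$(/usr/libexec/java_home)
```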

3) Set up the site configuration files: I copied and pasted the configurations suggested in the book. They are:
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
 <name>fs.default.name</name>
 <value>hdfs://localhost:9000</value>
 <description>The name of the default file system. A URI whose
 scheme and authority determine the FileSystem implementation. 
</description>
</property>

</configuration>

mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
 <name>mapred.job.tracker</name>
 <value>localhost:9001</value>
 <description>The host and port that the MapReduce job tracker runs
 at.</description>
</property>

</configuration>

hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 <!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
 <name>dfs.replication</name>
 <value>1</value>
 <description>The actual number of replications can be specified when the
 file is created.</description>
</property>

</configuration>

4) Set up masters and slaves: my conf/masters and conf/slaves files each contain only localhost.
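For a pseudo-distributed setup this step amounts to two one-line files, which can be written directly (paths assume the current directory is the Hadoop install root):

```shell
# Both the master daemons and the single slave run on this machine.
echo localhost > conf/masters
echo localhost > conf/slaves
```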

5) Format HDFS: I ran bin/hadoop namenode -format and got the following output:
12/11/16 13:20:12 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = dhcp-18-111-53-8.dyn.mit.edu/18.111.53.8
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /tmp/hadoop-me/dfs/name ? (Y or N) Y
12/11/16 13:20:17 INFO util.GSet: VM type       = 64-bit
12/11/16 13:20:17 INFO util.GSet: 2% max memory = 39.83375 MB
12/11/16 13:20:17 INFO util.GSet: capacity      = 2^22 = 4194304 entries
12/11/16 13:20:17 INFO util.GSet: recommended=4194304, actual=4194304
12/11/16 13:20:17 INFO namenode.FSNamesystem: fsOwner=me
12/11/16 13:20:18 INFO namenode.FSNamesystem: supergroup=supergroup
12/11/16 13:20:18 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/11/16 13:20:18 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/11/16 13:20:18 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/11/16 13:20:18 INFO namenode.NameNode: Caching file names occuring more than 10 times 
12/11/16 13:20:18 INFO common.Storage: Image file of size 119 saved in 0 seconds.
12/11/16 13:20:18 INFO common.Storage: Storage directory /tmp/hadoop-me/dfs/name has been successfully formatted.
12/11/16 13:20:18 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at dhcp-18-111-53-8.dyn.mit.edu/18.111.53.8
************************************************************/

6) Launch: I ran bin/start-all.sh and got the following output:
starting namenode, logging to /Users/me/hadoop-1.0.4/libexec/../logs/hadoop-me-namenode-dhcp-18-111-53-8.dyn.mit.edu.out
localhost: starting datanode, logging to /Users/me/hadoop-1.0.4/libexec/../logs/hadoop-me-datanode-dhcp-18-111-53-8.dyn.mit.edu.out
localhost: starting secondarynamenode, logging to /Users/me/hadoop-1.0.4/libexec/../logs/hadoop-me-secondarynamenode-dhcp-18-111-53-8.dyn.mit.edu.out
starting jobtracker, logging to /Users/me/hadoop-1.0.4/libexec/../logs/hadoop-me-jobtracker-dhcp-18-111-53-8.dyn.mit.edu.out
localhost: starting tasktracker, logging to /Users/me/hadoop-1.0.4/libexec/../logs/hadoop-me-tasktracker-dhcp-18-111-53-8.dyn.mit.edu.out

The book now claims I should be able to run jps and get output similar to:
26893 Jps
26832 TaskTracker
26620 SecondaryNameNode
26333 NameNode
26484 DataNode
26703 JobTracker

However, all I get is:
71311 Jps

So I assume something went wrong, but I can't tell where. Any suggestions? Thanks.

Best Answer

The Hadoop logs will tell you why.

Check the logs under /Users/me/hadoop-1.0.4/libexec/../logs/

I think the problem may be incorrect permissions on the .ssh directory.

Try chmod 700 ~/.ssh; chmod 600 ~/.ssh/id_rsa
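Concretely (a sketch: the exact log file names vary with your hostname, and tightening authorized_keys is my addition, since sshd also rejects keys whose files are too permissive):

```shell
# sshd silently ignores keys when ~/.ssh or the key files are group/world
# accessible, which makes start-all.sh unable to reach localhost.
chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_rsa ~/.ssh/authorized_keys

# Each daemon's .log file records the exception that killed it, e.g.:
tail -n 50 ~/hadoop-1.0.4/logs/hadoop-*-namenode-*.log
```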

Regarding "macos - Setting up Hadoop in pseudo-distributed mode on a Mac", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/13422730/
