hadoop - Hadoop streaming permission problem

Tags: hadoop permissions mapreduce permission-denied hadoop-streaming

I need help debugging a permission problem in Hadoop streaming. I am trying to run an awk streaming job:

// create the directory on all nodes

[pocal@oscbda01 ~]$  for i in  01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 ;  do ssh -f oscbda$i mkdir -p /home/pocal/KS/comverse/awk/; done;

// copy the streaming files to all nodes
[pocal@oscbda01 ~]$  for i in  01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 ;  do scp * oscbda$i:/home/pocal/KS/comverse/awk/; done;

// give all files 777 permissions
[pocal@oscbda01 ~]$  for i in  01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 ;  do ssh -f oscbda$i chmod 777 /home/pocal/KS/comverse/awk/*; done;

// start the streaming job
[pocal@oscbda01 ~]$ hadoop fs -rm -r /user/pocal/ks/comverse/one/out;\
hadoop jar  /usr/lib/hadoop-mapreduce/hadoop-streaming-2.0.0-cdh4.3.0.jar \
-Dmapreduce.job.reduces=0 \
-Dmapred.reduce.tasks=0 \
-mapper "awk -f /home/pocal/KS/comverse/awk/data_change.awk -f /home/pocal/KS/comverse/awk/selfcare.awk -f /home/pocal/KS/comverse/awk/selfcare_secondary_mapping.awk -f /home/pocal/KS/comverse/awk/out_sort.awk" \
-input "/user/pocal/ks/comverse/one/" \
-output "/user/pocal/ks/comverse/one/out"

and I get this error...
………..
attempt_201311041208_1379_m_000010_2: awk: fatal: can't open source file `/home/pocal/KS/comverse/awk/data_change.awk' for reading (Permission denied)
13/12/12 09:01:32 INFO mapred.JobClient: Task Id : attempt_201311041208_1379_m_000004_2, Status : FAILED
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)

attempt_201311041208_1379_m_000004_2: awk: fatal: can't open source file `/home/pocal/KS/comverse/awk/data_change.awk' for reading (Permission denied)
13/12/12 09:01:33 INFO mapred.JobClient: Task Id : attempt_201311041208_1379_m_000003_2, Status : FAILED
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)

attempt_201311041208_1379_m_000003_2: awk: fatal: can't open source file `/home/pocal/KS/comverse/awk/data_change.awk' for reading (Permission denied)
13/12/12 09:01:37 INFO mapred.JobClient: Job complete: job_201311041208_1379
13/12/12 09:01:37 INFO mapred.JobClient: Counters: 8
13/12/12 09:01:37 INFO mapred.JobClient:   Job Counters
13/12/12 09:01:37 INFO mapred.JobClient:     Failed map tasks=1
13/12/12 09:01:37 INFO mapred.JobClient:     Launched map tasks=52
13/12/12 09:01:37 INFO mapred.JobClient:     Data-local map tasks=12
13/12/12 09:01:37 INFO mapred.JobClient:     Rack-local map tasks=40
13/12/12 09:01:37 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=348738
13/12/12 09:01:37 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=2952
13/12/12 09:01:37 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/12 09:01:37 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/12 09:01:37 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!

Checking one machine:
[pocal@oscbda01 ~]$ ssh oscbda10 ls -l /home/pocal/KS/comverse/awk/data_change.awk
-rwxrwxrwx 1 pocal pocal 1548 Dec 10 10:05 /home/pocal/KS/comverse/awk/data_change.awk

The permissions look fine…

Does anyone have any ideas?

Best answer

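The problem is in a parent directory. /home/pocal is drwx------ on the nodes, so no user other than pocal can traverse it, whatever the mode of the files below it (the failure suggests the map tasks do not run as pocal):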

[pocal@oscbda01 .ssh]$ ssh oscbda05 ls -l /home/|grep pocal
total 24
drwx------ 5 pocal  pocal    4096 Dec 10 09:52 pocal
[pocal@oscbda01 .ssh]$ ssh oscbda05 ls -l /home/pocal
total 4
drwxrwxrwx 3 pocal pocal 4096 Dec 10 09:52 KS
[pocal@oscbda01 .ssh]$ ssh oscbda05 ls -l /home/pocal/KS
total 4
drwxrwxrwx 3 pocal pocal 4096 Dec 10 09:52 comverse
[pocal@oscbda01 .ssh]$ ssh oscbda05 ls -l /home/pocal/KS/comverse/
total 4
drwxrwxrwx 2 pocal pocal 4096 Dec 10 10:05 awk
[pocal@oscbda01 .ssh]$ ssh oscbda05 ls -l /home/pocal/KS/comverse/awk/
total 216
-rwxrwxrwx 1 pocal pocal  4398 Dec 10 10:05 calltype_checker.awk
-rwxrwxrwx 1 pocal pocal 16173 Dec 10 10:05 ch_rebuild.c
-rwxrwxrwx 1 pocal pocal 14643 Dec 10 10:05 ch_rebuild.dat
-rwxrwxrwx 1 pocal pocal  1548 Dec 10 10:05 data_change.awk
-rwxrwxrwx 1 pocal pocal  4080 Dec 10 10:05 decompress_incomming_data.sh
-rwxrwxrwx 1 pocal pocal   720 Dec 10 10:05 fms.awk
-rwxrwxrwx 1 pocal pocal  2502 Dec 10 10:05 load_func
-rwxrwxrwx 1 pocal pocal  1308 Dec 10 10:05 load_vars
-rwxrwxrwx 1 pocal pocal   199 Dec 10 10:05 load_vars_dynamic
-rwxrwxrwx 1 pocal pocal  1358 Dec 10 10:05 out.awk
-rwxrwxrwx 1 pocal pocal  1296 Dec 10 10:05 out_nosort.awk
-rwxrwxrwx 1 pocal pocal  1358 Dec 10 10:05 out_sort.awk
-rwxrwxrwx 1 pocal pocal 70041 Dec 10 10:05 selfcare.awk
-rwxrwxrwx 1 pocal pocal 54204 Dec 10 10:05 selfcare_secondary_mapping.awk
-rwxrwxrwx 1 pocal pocal  1847 Dec 10 10:05 stat.awk
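
The manual walk above can be automated. The following is a minimal sketch, not part of the original answer: a hypothetical helper, `find_blocked_dirs`, that climbs from a file to `/` and prints every ancestor directory lacking the execute (traverse) bit for other users. It assumes only the "other" bits matter and ignores group membership and ACLs.

```shell
#!/bin/sh
# find_blocked_dirs (hypothetical name): list every ancestor directory
# of a file that "other" users cannot traverse.
# Simplification (assumption): only the "other" execute bit is checked;
# group membership and ACLs are ignored.
find_blocked_dirs() {
    dir=$(dirname "$1")
    while :; do
        # Character 10 of the ls mode string is the execute bit for "other".
        other_x=$(ls -ld "$dir" | cut -c10)
        case $other_x in
            x|t) ;;                 # traversable by other users
            *)   echo "$dir" ;;     # blocked, e.g. drwx------
        esac
        # Stop at the filesystem root (or "." for a relative path).
        case $dir in /|.) break ;; esac
        dir=$(dirname "$dir")
    done
}
```

Run on a node as `find_blocked_dirs /home/pocal/KS/comverse/awk/data_change.awk`; with the modes listed above it should report /home/pocal. Where util-linux is installed, `namei -l <path>` prints a similar per-component listing.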

Add execute (traverse) permission on the home directory on all nodes:
[pocal@oscbda01 .ssh]$ for i in  01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 ;  do ssh -f oscbda$i chmod +x /home/pocal; done;
Check:
[pocal@oscbda01 .ssh]$ ssh oscbda05 ls -l /home/|grep poc
drwx--x--x 5 pocal  pocal    4096 Dec 10 09:52 pocal

And it works!
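
The fix shows that the tasks need read permission on the scripts and execute permission on every directory leading to them, which is less than the 777 applied in the question. Below is a minimal sketch of granting only that. The function name `open_for_others` and its second argument (the topmost directory to modify) are hypothetical, and it again assumes that only the "other" bits matter.

```shell
#!/bin/sh
# open_for_others (hypothetical name): make one file readable by other
# users with the smallest change:
#   o+r on the file,
#   o+x (traverse only, no listing) on each directory from the file's
#   directory up to and including the directory given as $2.
open_for_others() {
    file=$1
    stop=$2    # topmost directory we are willing to modify (assumption)
    chmod o+r "$file" || return 1
    dir=$(dirname "$file")
    while :; do
        chmod o+x "$dir" || return 1
        # Stop at the requested directory, or at / or "." as a safety net.
        case $dir in "$stop"|/|.) break ;; esac
        dir=$(dirname "$dir")
    done
}
```

For the layout in this question it would be called on each node as `open_for_others /home/pocal/KS/comverse/awk/data_change.awk /home/pocal`. Unlike the `chmod +x` above, it leaves the group bits untouched.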

Regarding hadoop - Hadoop streaming permission problem, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20539481/
