amazon-web-services - Amazon S3 error code: 400 while running mr-job on EMR

Tags: amazon-web-services hadoop mapreduce elastic-map-reduce

I get this error when running a custom jar on EMR.

Exception in thread "main" com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: B042BB0B40A75966), S3 Extended Request ID: vr/DUr8HD3xjomauyzqvVdGuW3fHBP8PDUmTIAoVLUxrmsxh9H+OS9+cgo4OmHxaz/b8CSPGmuc=
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1389)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:902)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:607)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3826)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1071)
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1029)
    at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.ensureBucketExists(Jets3tNativeFileSystemStore.java:138)
    at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:116)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy23.initialize(Unknown Source)
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.initialize(S3NativeFileSystem.java:461)
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.initialize(EmrFileSystem.java:110)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2737)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2719)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:375)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:485)
    at SentimentsDriver.run(SentimentsDriver.java:21)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at SentimentsDriver.main(SentimentsDriver.java:33)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Arguments: s3://sentimentproj/input/tweet.txt s3://sentimentproj/output

SentimentsDriver:

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    /* Driver that configures and runs the sentiment-analysis MapReduce job. */
    public class SentimentsDriver extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            if (args.length != 2) {
                System.err.println("Usage: SentimentsDriver <input path> <output path>");
                return -1;
            }
            // Job.getInstance replaces the deprecated Job() constructor and
            // reuses the Configuration that ToolRunner set on this Tool.
            Job job = Job.getInstance(getConf(), "SentimentAnalysis");
            job.setJarByClass(SentimentsDriver.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            job.setMapperClass(SentimentsMapper.class);
            job.setReducerClass(SentimentsReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // Wait for the job once and return its status; main() turns it
            // into the process exit code.
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            int exitCode = ToolRunner.run(new SentimentsDriver(), args);
            System.exit(exitCode);
        }
    }

Best answer

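The problem may be with the S3 bucket. The stack trace fails in headBucket, called from doesBucketExist while the file system is being initialized for the input path, so the job never gets as far as reading any data. When you create the bucket, make sure it is in the same region as the EMR cluster. If the two regions differ, S3 will usually respond with a 400 Bad Request error.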
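As a rough way to make that region comparison explicit, the sketch below takes a bucket location and a cluster region and reports whether they match. It is only an illustration: BucketRegionCheck and its methods are hypothetical names, not part of Hadoop or the AWS SDK, and the two region strings in main are made-up sample values. In practice you would look them up yourself, for example with "aws s3api get-bucket-location --bucket sentimentproj" for the bucket and the EMR console for the cluster. The sketch assumes that an empty location, or the legacy aliases "US" and "EU", stand for us-east-1 and eu-west-1.

```java
import java.net.URI;

/** Hypothetical helper for comparing a bucket's region with the cluster's region. */
class BucketRegionCheck {

    /** Extracts the bucket name from an s3://, s3n://, or s3a:// URI. */
    static String bucketOf(String s3Uri) {
        URI uri = URI.create(s3Uri);
        String scheme = uri.getScheme();
        if (scheme == null || !scheme.matches("s3[na]?")) {
            throw new IllegalArgumentException("Not an S3 URI: " + s3Uri);
        }
        // The authority component of the URI is the bucket name.
        return uri.getAuthority();
    }

    /**
     * Maps a reported bucket location to a region name.
     * Assumption: null, empty, or "US" means us-east-1, and "EU" means eu-west-1.
     */
    static String normalizeBucketLocation(String location) {
        if (location == null || location.isEmpty() || location.equals("US")) {
            return "us-east-1";
        }
        if (location.equals("EU")) {
            return "eu-west-1";
        }
        return location;
    }

    /** Returns true when the bucket and the cluster are in the same region. */
    static boolean sameRegion(String bucketLocation, String clusterRegion) {
        return normalizeBucketLocation(bucketLocation).equalsIgnoreCase(clusterRegion);
    }

    public static void main(String[] args) {
        String input = "s3://sentimentproj/input/tweet.txt";
        String bucketLocation = "us-west-2"; // sample value, look up the real one
        String clusterRegion = "us-east-1";  // sample value, look up the real one

        if (sameRegion(bucketLocation, clusterRegion)) {
            System.out.println("Bucket " + bucketOf(input) + " is in the cluster's region.");
        } else {
            System.out.println("Bucket " + bucketOf(input) + " is in "
                    + normalizeBucketLocation(bucketLocation)
                    + " but the cluster is in " + clusterRegion + ".");
        }
    }
}
```

If the check reports a mismatch, the options that follow from the answer are to recreate the bucket in the cluster's region or to launch the cluster in the bucket's region.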

A similar question about "amazon-web-services - Amazon S3 error code: 400 while running mr-job on EMR" can be found on Stack Overflow: https://stackoverflow.com/questions/38480971/
