java - RetryHandler exception when using MapOnlyMapper in Google App Engine

Tags: java google-app-engine hadoop

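I have a very large dataset and want to update entities of certain kinds. I am exploring the MapReduce library on Google App Engine and followed the example listed here: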

https://github.com/GoogleCloudPlatform/appengine-mapreduce/tree/master/java/example/src/com/google/appengine/demos/mapreduce/entitycount

This is essentially what I am doing in my MapSpecification:

MapSpecification<Entity, Entity, Void> spec = new MapSpecification.Builder<>(
                new DatastoreKeyInput(query,2),
                new UrlFlattenMapper(),
                new DatastoreOutput())
                .setJobName("Flatten URLs entities")
                .build();

My mapper then performs an operation on each Entity and emits it, so that the DatastoreOutput writer writes it back to the datastore.

My problem is this: the entities are being updated, and endSlice is called in my mapper task as well, but the job never completes. I keep getting these errors:
[INFO] INFO: RetryHelper(28.07 ms, 1 attempts, java.util.concurrent.Executors$RunnableAdapter@7f0264e0): Attempt #1 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=1, shardCount=2, lastWorkItem=Topics("jzdh"), workerCallCount=297, workerTimeMillis=42513], inputExhausted=true, isFirstSlice=false]], sleeping for 1028 ms
[INFO] Apr 26, 2016 4:39:37 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(1.085 s, 2 attempts, java.util.concurrent.Executors$RunnableAdapter@7f0264e0): Attempt #2 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=1, shardCount=2, lastWorkItem=Topics("jzdh"), workerCallCount=297, workerTimeMillis=42513], inputExhausted=true, isFirstSlice=false]], sleeping for 2435 ms
[INFO] Apr 26, 2016 4:39:37 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(3.562 s, 3 attempts, java.util.concurrent.Executors$RunnableAdapter@6d7fcd47): Attempt #3 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=0, shardCount=2, lastWorkItem=Topics("jz63"), workerCallCount=289, workerTimeMillis=41536], inputExhausted=true, isFirstSlice=false]], sleeping for 3421 ms
[INFO] Apr 26, 2016 4:39:39 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(3.567 s, 3 attempts, java.util.concurrent.Executors$RunnableAdapter@7f0264e0): Attempt #3 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=1, shardCount=2, lastWorkItem=Topics("jzdh"), workerCallCount=297, workerTimeMillis=42513], inputExhausted=true, isFirstSlice=false]], sleeping for 3340 ms
[INFO] Apr 26, 2016 4:39:41 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(7.015 s, 4 attempts, java.util.concurrent.Executors$RunnableAdapter@6d7fcd47): Attempt #4 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=0, shardCount=2, lastWorkItem=Topics("jz63"), workerCallCount=289, workerTimeMillis=41536], inputExhausted=true, isFirstSlice=false]], sleeping for 6941 ms
[INFO] Apr 26, 2016 4:39:42 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry

I have not been able to resolve this. Any help or pointers on what I could do would be much appreciated.
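The "Can't serialize object" message in the log suggests that the shard task, and whatever it holds, could not be written with Java serialization between slices. One way to narrow this down is to serialize the suspect object in isolation and read the class name carried by the resulting NotSerializableException. The sketch below is plain Java and does not use the App Engine MapReduce library; `Helper`, `SuspectMapper`, and `firstNonSerializableClass` are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationCheck {

    // Hypothetical stand-in for a helper object that is NOT Serializable.
    static class Helper {
    }

    // Hypothetical stand-in for a mapper that keeps such a helper in a field.
    static class SuspectMapper implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Helper helper = new Helper();
    }

    // Returns the name of the class that blocked serialization,
    // or null if the object serialized without problems.
    static String firstNonSerializableClass(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return null;
        } catch (NotSerializableException e) {
            // The exception message is the name of the offending class.
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNonSerializableClass(new SuspectMapper()));
        System.out.println(firstNonSerializableClass("plain string"));
    }
}
```

Running the same kind of check against the real mapper instance in a unit test may point at the field that needs attention, assuming the job fails for the same reason.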

Best answer

In my case, the culprit was a small datastore field that I was using in my map job. I put transient in front of it and the problem was solved.
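A minimal sketch of why marking the field transient helps, written in plain Java rather than against the App Engine MapReduce API: Java serialization skips transient fields, so a non-serializable handle no longer blocks the write, but the field comes back as null and has to be re-created. `StoreClient`, `FlattenWorker`, and `roundTrip` are hypothetical names standing in for the datastore field, the mapper, and the library's serialize/deserialize step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientFieldSketch {

    // Hypothetical stand-in for a non-serializable service handle.
    static class StoreClient {
        String describe() {
            return "store client";
        }
    }

    // Hypothetical stand-in for the mapper.
    static class FlattenWorker implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: skipped by Java serialization, so it is null after deserialization.
        private transient StoreClient client;

        // Re-creates the handle lazily whenever it is missing.
        StoreClient client() {
            if (client == null) {
                client = new StoreClient();
            }
            return client;
        }
    }

    // Serializes the worker to bytes and reads it back.
    static FlattenWorker roundTrip(FlattenWorker worker) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(worker);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FlattenWorker) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FlattenWorker worker = new FlattenWorker();
        worker.client(); // populate the field before serializing
        FlattenWorker copy = roundTrip(worker);
        System.out.println(copy.client().describe());
    }
}
```

In a real mapper, a lifecycle hook such as beginSlice may be a natural place to re-create the handle instead of a lazy getter, though that is an assumption about code the question does not show.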

Source: the original question on Stack Overflow, "RetryHandler exception when using MapOnlyMapper in Google App Engine": https://stackoverflow.com/questions/36863550/
