amazon-web-services - Kinesis Client Library record processor failure

According to the AWS docs:

The worker invokes record processor methods using Java ExecutorService tasks. If a task fails, the worker retains control of the shard that the record processor was processing. The worker starts a new record processor task to process that shard. For more information, see Read Throttling.

According to another page in the AWS documentation:

The Kinesis Client Library (KCL) relies on your processRecords code to handle any exceptions that arise from processing the data records. Any exception thrown from processRecords is absorbed by the KCL. To avoid infinite retries on a recurring failure, the KCL does not resend the batch of records processed at the time of the exception. The KCL then calls processRecords for the next batch of data records without restarting the record processor. This effectively results in consumer applications observing skipped records. To prevent skipped records, handle all exceptions within processRecords appropriately.

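Aren't these two statements contradictory? One says that the record processor is restarted, and the other says that the shard is skipped.
What exactly does the KCL do when a record processor fails? How does a KCL worker know whether a record processor has failed?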

Best answer

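Based on my experience writing, debugging, and supporting KCL-based applications, the second statement is the clearer, more accurate, and more useful description of how you should think about error handling.

First, some background: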

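KCL record processing is designed to run on multiple hosts. Say you have 3 hosts and 12 shards to process: each host runs a single worker and owns the processing of 4 shards.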

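If an exception is thrown while one of those shards is being processed, the KCL absorbs the exception and treats the batch as if every record had been processed, effectively "skipping" any records that were not processed.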

Remember, it is your code that throws the exception, so you can handle it before it escapes to the KCL.
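As a rough illustration of handling the exception before it escapes, here is a minimal sketch in plain Java. `SafeRecordHandler`, `processBatch`, and `deadLetters` are hypothetical names made up for this example and are not part of the KCL API; records are modeled as plain strings. The idea is that your processRecords implementation would delegate to something like this, so a failing record is retried a few times and then set aside instead of causing the rest of the batch to be skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, NOT part of the KCL API: wraps per-record work so that
// no exception escapes from your own processRecords implementation.
final class SafeRecordHandler {
    private final int maxAttempts;
    private final Consumer<String> handler;
    // Records that still failed after all attempts; a stand-in for a dead-letter store.
    private final List<String> deadLetters = new ArrayList<>();

    SafeRecordHandler(int maxAttempts, Consumer<String> handler) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.handler = handler;
    }

    // Intended to be called from inside processRecords with the batch payloads.
    void processBatch(List<String> records) {
        for (String record : records) {
            processWithRetry(record);
        }
    }

    private void processWithRetry(String record) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(record);
                return; // processed successfully, move on to the next record
            } catch (RuntimeException e) {
                // Caught here on purpose: letting it propagate would make the
                // caller drop the remaining records of this batch.
            }
        }
        // All attempts failed: keep the record for later inspection.
        deadLetters.add(record);
    }

    List<String> deadLetters() {
        return deadLetters;
    }
}
```

A real application would probably also log the exception, wait between attempts, and write failed records somewhere durable; those parts are left out to keep the sketch short.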

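When a KCL worker itself fails or stops, its shards are transferred to another worker. For example, if you scale down to two hosts, the 4 shards that the third worker was handling are transferred to the other two.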
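The arithmetic in this example (12 shards over 3 workers, then over 2) can be sketched with a toy round-robin assignment. This is only an illustration of the counts, not how the KCL actually balances shards between workers; `ShardAssignmentSketch`, `assign`, and the host and shard names are all made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: spreads shards evenly over the workers that are alive.
final class ShardAssignmentSketch {

    // Round-robin: shard i goes to worker (i mod number of workers).
    static Map<String, List<String>> assign(List<String> workers, int shardCount) {
        if (workers.isEmpty()) {
            throw new IllegalArgumentException("at least one worker is required");
        }
        Map<String, List<String>> owned = new LinkedHashMap<>();
        for (String worker : workers) {
            owned.put(worker, new ArrayList<>());
        }
        for (int i = 0; i < shardCount; i++) {
            String worker = workers.get(i % workers.size());
            owned.get(worker).add("shard-" + i);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Three hosts: each worker ends up owning 4 of the 12 shards.
        System.out.println(assign(List.of("host-1", "host-2", "host-3"), 12));
        // After scaling down to two hosts: each remaining worker owns 6.
        System.out.println(assign(List.of("host-1", "host-2"), 12));
    }
}
```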

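The first statement is trying to say (not very clearly) that when a KCL task fails, that worker instance keeps control of the shards it was processing (rather than transferring them to another worker).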

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47056385/
