java - How to fetch large amounts of data from DynamoDB?

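I need to go through all the items in a particular DynamoDB table.

My table contains 10 million items. I tried to fetch everything, but I cannot put it all into a list because it is too large. My goal is to check every item and see whether it can be deleted.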

Best answer

Here is sample code for scanning a table. I'm not sure whether you already have this code.

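The Scan API does not return all records at once. You have to call Scan repeatedly, passing the previous LastEvaluatedKey as the next ExclusiveStartKey, until LastEvaluatedKey comes back null; only then have you read every item in the table. Think of it as paginated output. That way you never have to handle all the items (i.e. 10 million) in a single scan. Note that every scan request still consumes read capacity units; pagination spreads that consumption over many requests instead of incurring it all at once.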

If the total number of scanned items exceeds the maximum data set size limit of 1 MB, the scan stops and results are returned to the user as a LastEvaluatedKey value to continue the scan in a subsequent operation. The results also include the number of items exceeding the limit. A scan can result in no table data meeting the filter criteria.

Scan API

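import java.util.List;
import java.util.Map;

import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

public class ScanTable {

    public static void main(String[] args) {

        // Client pointed at a local endpoint (e.g. DynamoDB Local); adjust for a real region.
        AmazonDynamoDB amazonDynamoDB = AmazonDynamoDBClientBuilder.standard()
                .withEndpointConfiguration(new EndpointConfiguration("http://localhost:8000", "us-east-1")).build();

        ScanRequest scanRequest = new ScanRequest().withTableName("Movies");

        Map<String, AttributeValue> lastKey = null;

        do {
            // Each call returns at most one page (up to 1 MB) of items.
            ScanResult scanResult = amazonDynamoDB.scan(scanRequest);

            List<Map<String, AttributeValue>> results = scanResult.getItems();

            // Process the current page here instead of collecting every page in a list.
            results.forEach(System.out::println);

            // A null LastEvaluatedKey means the whole table has been scanned.
            lastKey = scanResult.getLastEvaluatedKey();
            scanRequest.setExclusiveStartKey(lastKey);
        } while (lastKey != null);

    }
}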

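Unclear:

I understand that you want to retrieve all the items and do some processing on them. However, I'm not sure why you want to insert them into a list.

If you process each scan result separately (i.e. about 1 MB of data at a time), you may not need to insert anything into a list and use up heap memory. Holding every item in a list will obviously require far more memory, whichever way you build it.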
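
As a rough sketch of what processing each page on its own could look like, the example below swaps DynamoDB for a hypothetical in-memory `FakeTable` so that it runs without the AWS SDK. `Page`, `FakeTable`, `countDeletable`, the page size of 3, and the `stale-` prefix rule are all illustrative assumptions, not part of any AWS API. The loop has the same shape as the scan loop above: only one page is held in memory at a time, and only a running count survives between pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for one scan page: its items plus the key to resume from.
final class Page {
    final List<String> items;
    final Integer lastEvaluatedKey; // null means there are no more pages

    Page(List<String> items, Integer lastEvaluatedKey) {
        this.items = items;
        this.lastEvaluatedKey = lastEvaluatedKey;
    }
}

// Hypothetical in-memory table that returns at most pageSize items per call,
// imitating the way a real scan caps each response.
final class FakeTable {
    private final List<String> rows;
    private final int pageSize;

    FakeTable(List<String> rows, int pageSize) {
        this.rows = rows;
        this.pageSize = pageSize;
    }

    Page scan(Integer exclusiveStartKey) {
        int from = exclusiveStartKey == null ? 0 : exclusiveStartKey;
        int to = Math.min(from + pageSize, rows.size());
        List<String> items = new ArrayList<>(rows.subList(from, to));
        Integer next = to < rows.size() ? to : null;
        return new Page(items, next);
    }
}

final class PagedCleanup {

    // Walks every page, checks each item, and keeps only a counter between pages.
    static int countDeletable(FakeTable table, Predicate<String> canDelete) {
        int deletable = 0;
        Integer startKey = null;
        do {
            Page page = table.scan(startKey);
            for (String item : page.items) {
                if (canDelete.test(item)) {
                    deletable++; // a real program could issue the delete at this point
                }
            }
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
        return deletable;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(i % 2 == 0 ? "stale-" + i : "live-" + i);
        }
        FakeTable table = new FakeTable(rows, 3);
        System.out.println(countDeletable(table, item -> item.startsWith("stale-")));
    }
}
```

With the real client, the body of the `for` loop is where the per-item check (and possibly a delete request) would go; the page can then be garbage-collected before the next one is fetched.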

Regarding "java - How to fetch large amounts of data from DynamoDB?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44150464/
