hbase - AWS DynamoDB vs. HBase

Tags: hbase amazon-dynamodb

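I have been working with HBase for the past six months, and I recently learned about DynamoDB from Amazon. Maintenance-wise, DynamoDB looks easier to handle because Amazon takes care of it. But whether to switch from HBase to DynamoDB is still an open question for me.

Apart from not having to maintain the cluster, I can't find a convincing reason to switch from HBase to DynamoDB.

Could someone share their thoughts on this?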

Best answer

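You have to start from your requirements. DynamoDB offers excellent scalability and performance with minimal maintenance effort and at an attractive cost. However, Apache HBase is more flexible in terms of what you can store (size and data types).

Another very important evaluation point is which data model (wide-column or key-value) better fits your use case.

Apache HBase gives you very flexible options for row key data types, whereas DynamoDB allows only scalar types for primary key attributes. On the other hand, DynamoDB makes it very easy to create and maintain secondary indexes, which you have to build and manage by hand in Apache HBase.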
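To make the row key point concrete, here is a minimal sketch of a composite HBase-style row key built as raw bytes. The layout (user id, a zero-byte separator, then a reversed timestamp) and the names `make_row_key` and `parse_row_key` are illustrative assumptions, not something prescribed by HBase. In DynamoDB, by contrast, each primary key attribute has to be a single string, number, or binary value.

```python
import struct
from typing import Tuple

MAX_LONG = 2**63 - 1  # largest signed 64-bit value


def make_row_key(user_id: str, timestamp_ms: int) -> bytes:
    """Build a composite row key: user id, NUL separator, reversed timestamp.

    Reversing the timestamp makes newer events sort first within one user,
    because HBase orders rows by the raw bytes of the key.
    Assumes user_id does not contain a NUL character.
    """
    reversed_ts = MAX_LONG - timestamp_ms
    return user_id.encode("utf-8") + b"\x00" + struct.pack(">q", reversed_ts)


def parse_row_key(key: bytes) -> Tuple[str, int]:
    """Split a key produced by make_row_key back into its two parts."""
    user_part, _, ts_part = key.partition(b"\x00")
    reversed_ts = struct.unpack(">q", ts_part)[0]
    return user_part.decode("utf-8"), MAX_LONG - reversed_ts
```

Sorting a list of such keys with Python's `sorted` mimics the byte-wise ordering of a table scan, which is a cheap way to check a key design before loading data.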

More information is available at the following link:
http://d0.awsstatic.com/whitepapers/AWS_Comparing_the_Use_of_DynamoDB_and_HBase_for_NoSQL.pdf

Here is a summary of the key points:

In summary, both Amazon DynamoDB and Apache HBase define data models that allow efficient storage of data to optimize query performance. Amazon DynamoDB imposes a restriction on its item size to allow efficient processing and reduce costs.

Apache HBase uses the concept of column families to provide data locality for more efficient read operations.

Amazon DynamoDB supports both scalar and multi-valued sets to accommodate a wide range of unstructured datasets. Similarly, Apache HBase stores its key/value pairs as arbitrary arrays of bytes, giving it the flexibility to store any data type.

Amazon DynamoDB supports built-in secondary indexes and automatically updates and synchronizes all indexes with their parent tables. With Apache HBase, you can implement and manage custom secondary indexes yourself.
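As a rough illustration of the built-in secondary index support, the sketch below builds the request parameters for a table with one global secondary index. The table layout, the attribute names (`user_id`, `created_at`, `status`), and the index name are made up for the example. The dictionary follows the shape accepted by the DynamoDB `CreateTable` operation, so with boto3 installed and credentials configured it could be passed as `boto3.client("dynamodb").create_table(**params)`.

```python
def build_create_table_params(table_name, read_units=5, write_units=5):
    """Return CreateTable parameters for a table with one global secondary index.

    All attribute and index names here are hypothetical examples.
    """
    throughput = {
        "ReadCapacityUnits": read_units,
        "WriteCapacityUnits": write_units,
    }
    return {
        "TableName": table_name,
        # Only attributes used in a key schema are declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "user_id", "AttributeType": "S"},
            {"AttributeName": "created_at", "AttributeType": "N"},
            {"AttributeName": "status", "AttributeType": "S"},
        ],
        # Primary key of the table: partition key plus sort key.
        "KeySchema": [
            {"AttributeName": "user_id", "KeyType": "HASH"},
            {"AttributeName": "created_at", "KeyType": "RANGE"},
        ],
        # DynamoDB keeps this index in sync with the table on every write.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "status-index",
                "KeySchema": [
                    {"AttributeName": "status", "KeyType": "HASH"},
                    {"AttributeName": "created_at", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
                "ProvisionedThroughput": dict(throughput),
            }
        ],
        "ProvisionedThroughput": dict(throughput),
    }
```

With HBase, the equivalent is usually a second table keyed by the indexed value that the application (or a coprocessor) has to keep up to date itself.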

From a data model perspective, you can choose Amazon DynamoDB if your item size is relatively small. Although Amazon DynamoDB provides a number of options to overcome row size restrictions, Apache HBase is better equipped to handle large complex payloads with minimal restrictions.

Throughput Model

Although read and write requirements are specified at table creation time, Amazon DynamoDB lets you increase or decrease the provisioned throughput to accommodate load with no downtime.

In Apache HBase, the number of nodes in a cluster can be driven by the required throughput for reads and/or writes.
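For the DynamoDB side of the throughput model, a small worked calculation helps when sizing a table. The sketch below assumes the documented provisioned-capacity rules: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one write capacity unit covers one write per second of an item up to 1 KB. The function names are my own.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read capacity units needed for a steady read rate."""
    # Item size is rounded up to the next 4 KB boundary.
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads consume half the capacity.
        total = total / 2
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the write capacity units needed for a steady write rate."""
    # Item size is rounded up to the next 1 KB boundary.
    return math.ceil(item_size_bytes / 1024) * writes_per_second
```

For example, 100 strongly consistent reads per second of 6 KB items come out at 200 read units, or 100 if eventual consistency is acceptable. The resulting numbers are what you would set as `ReadCapacityUnits` and `WriteCapacityUnits`, and they can be changed later on a live table through the `UpdateTable` operation.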

Consistency Model

Amazon DynamoDB lets you specify the desired consistency characteristics for each read request within an application. You can specify whether a read is eventually consistent or strongly consistent.

The eventual consistency option is the default in Amazon DynamoDB and maximizes the read throughput. However, an eventually consistent read might not always reflect the results of a recently completed write. Consistency across all copies of data is usually reached within a second.
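In practice the per-request choice is a single flag. The sketch below builds `GetItem` parameters in the low-level DynamoDB format; `ConsistentRead` is the real request parameter, while the helper name, table name, and key attributes are assumptions for illustration. The result could be passed as `boto3.client("dynamodb").get_item(**params)`.

```python
def build_get_item_params(table_name, key, strongly_consistent=False):
    """Return GetItem parameters with the desired read consistency.

    ConsistentRead defaults to False (eventually consistent), matching
    the default behaviour of DynamoDB itself.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical key in the low-level attribute-value format.
example_key = {"user_id": {"S": "alice"}, "created_at": {"N": "1000"}}
cheap_read = build_get_item_params("events", example_key)
fresh_read = build_get_item_params("events", example_key, strongly_consistent=True)
```

A common pattern is to use the strongly consistent variant only right after a write that the same caller needs to see, and the cheaper default everywhere else.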

Apache HBase reads and writes are strongly consistent. This means that all reads and writes to a single row in Apache HBase are atomic. Each concurrent reader and writer can make safe assumptions about the state of a row. Multi-versioning and time stamping in Apache HBase contribute to its strongly consistent model.

Transaction Model

Neither Amazon DynamoDB nor Apache HBase support multi-item/cross-row or cross-table transactions due to performance considerations. However, both databases provide batch operations for reading and writing multiple items/rows across multiple tables with no transaction guarantees.
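To show what a non-transactional batch looks like on the DynamoDB side, here is a sketch that splits a list of items into `BatchWriteItem` requests. It relies on the documented limit of 25 put or delete requests per call; the helper name and the sample items are hypothetical.

```python
def batch_write_requests(table_name, items, batch_size=25):
    """Yield BatchWriteItem parameter dicts, each holding up to 25 puts."""
    if not 1 <= batch_size <= 25:
        raise ValueError("BatchWriteItem accepts between 1 and 25 requests per call")
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        yield {
            "RequestItems": {
                table_name: [{"PutRequest": {"Item": item}} for item in chunk]
            }
        }


# Hypothetical items in the low-level attribute-value format.
sample_items = [{"id": {"N": str(i)}} for i in range(60)]
batches = list(batch_write_requests("events", sample_items))
```

Each batch can succeed partially: the response lists anything that was not written under `UnprocessedItems`, and the caller is expected to retry those, which is exactly the lack of transactional guarantees described above.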

Table Operations

One key difference between the two databases is the flexible provisioned throughput model of Amazon DynamoDB. The ability to dial up capacity when you need it and dial it back down when you are done is useful for processing variable workloads with unpredictable peaks.

For workloads that need high update rates to perform data aggregations or maintain counters, Apache HBase is a good choice. This is because Apache HBase supports a multi-version concurrency control mechanism, which contributes to its strongly consistent reads and writes. Amazon DynamoDB gives you the flexibility to specify whether you want your read request to be eventually consistent or strongly consistent depending on your specific workload.



Source:
http://d0.awsstatic.com/whitepapers/AWS_Comparing_the_Use_of_DynamoDB_and_HBase_for_NoSQL.pdf

The original question, hbase - AWS DynamoDB vs. HBase, is on Stack Overflow: https://stackoverflow.com/questions/10908531/
