cassandra - What types of tombstones does Cassandra support?

Tags: cassandra cassandra-2.0 tombstone

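What types of tombstones does Cassandra (version 2) support? According to this article, it supports (in CQL terms):

  • A specific column of a row.
  • A static column.
  • All rows of a partition key.

Am I missing any other types of tombstones? What about deleting a specific (CQL) row? Are there any special tombstones to support deleting a range of clustering keys or something similar? Knowing this is very useful when planning a schema, to avoid ending up with too many tombstones.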

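Best answer

A tombstone is a marker placed in a row to indicate a deletion. Tombstones can exist at different levels: on a single column, on a range of columns, or on an entire row. The example below shows the ordinary kinds of tombstone (range tombstones are not covered here).

When planning a schema, you model your tables around the queries you will run rather than using a single table, so you may find the same data duplicated across several tables. These tables are optimized to serve incoming reads and writes. The link below should give you some good background on Cassandra data modeling: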

http://www.datastax.com/resources/data-modeling

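My example: I created a table, inserted some data, and then used nodetool flush to generate some sstables. With the sstable2json tool you can see the deletions. A whole-row deletion looks slightly different from a single-column deletion, but in essence it is still just a marker.

Here is the sstable with all the data: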

$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-1-Data.db 
[
{"key": "3136","columns": [["","",1417814256390000], ["col2","26",1417814256390000], ["col3","36",1417814256390000], ["id","id16",1417814256390000]]},
{"key": "3133","columns": [["","",1417814218246000], ["col2","23",1417814218246000], ["col3","33",1417814218246000], ["id","id13",1417814218246000]]},
{"key": "3135","columns": [["","",1417814244766000], ["col2","25",1417814244766000], ["col3","35",1417814244766000], ["id","id15",1417814244766000]]},
{"key": "3134","columns": [["","",1417814230711000], ["col2","24",1417814230711000], ["col3","34",1417814230711000], ["id","id14",1417814230711000]]},
{"key": "3132","columns": [["","",1417814207910000], ["col2","22",1417814207910000], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000], ["id","id11",1417814197094000]]},
{"key": "31","columns": [["","",1417814185270000], ["col2","2",1417814185270000], ["col3","3",1417814185270000], ["id","id1",1417814185270000]]}
]
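
The key values in this output are the hex-encoded bytes of the partition key, which is how key 3131 lines up with col1 = '11' in the CQL statements. A minimal Python sketch of that decoding, assuming a single text partition key encoded as UTF-8 (the helper name decode_key is made up for illustration):

```python
def decode_key(hex_key: str) -> str:
    """Decode a hex-encoded sstable2json key into the text value of col1."""
    # Assumption: the partition key is a single text column, as in ts1.
    return bytes.fromhex(hex_key).decode("utf-8")


if __name__ == "__main__":
    # Keys copied from the sstable2json output above.
    for key in ["3136", "3131", "31"]:
        print(key, "->", decode_key(key))
```

A composite partition key would be serialized differently, so this helper only fits simple tables like the one used here.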

Here are the first deletes in cqlsh (one whole row, then a single column):

cqlsh:results> delete from ts1 WHERE col1 = '1';
cqlsh:results> delete id from ts1 WHERE col1 = '11';

Here is the sstable produced after a flush:

[datastax@DSE3 ~]$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-2-Data.db 
[
{"key": "3131","columns": [["id","54822130",1417814320400000,"d"]]},
{"key": "31","metadata": {"deletionInfo": {"markedForDeleteAt":1417814302304000,"localDeletionTime":1417814302}},"columns": []}
]

Here is the next delete in cqlsh:

cqlsh:results> delete col2 from ts1 WHERE col1 = '12';

Here is the sstable produced after a flush:

[datastax@DSE3 ~]$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-3-Data.db 
[
{"key": "3132","columns": [["col2","5482220b",1417814539434000,"d"]]}
]
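
In the two outputs above, the value stored with each d-flagged column appears to be the local deletion time in seconds, written as hex: it matches the microsecond write timestamp next to it. A short Python sketch to check that reading; the function name is illustrative, and the interpretation is inferred from this output rather than taken from documentation:

```python
from datetime import datetime, timezone


def deletion_time_seconds(hex_value: str) -> int:
    """Interpret the value of a 'd'-flagged column as epoch seconds (assumed)."""
    return int(hex_value, 16)


if __name__ == "__main__":
    # (hex value, write timestamp in microseconds) copied from the output above.
    samples = [("54822130", 1417814320400000), ("5482220b", 1417814539434000)]
    for hex_value, ts_micros in samples:
        secs = deletion_time_seconds(hex_value)
        when = datetime.fromtimestamp(secs, tz=timezone.utc).isoformat()
        # The decoded seconds should equal the write timestamp truncated to seconds.
        print(hex_value, secs, when, secs == ts_micros // 1_000_000)
```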

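When compaction occurs, all of these sstables are merged into a single sstable. The deleted data is still present but marked as deleted. After running a compaction we can see this again (look for the d flag after the timestamp):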

[datastax@DSE3 ~]$ ./dse-4.5.1/bin/nodetool compact
[datastax@DSE3 ~]$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-4-Data.db 
[
{"key": "3136","columns": [["","",1417814256390000], ["col2","26",1417814256390000], ["col3","36",1417814256390000], ["id","id16",1417814256390000]]},
{"key": "3133","columns": [["","",1417814218246000], ["col2","23",1417814218246000], ["col3","33",1417814218246000], ["id","id13",1417814218246000]]},
{"key": "3135","columns": [["","",1417814244766000], ["col2","25",1417814244766000], ["col3","35",1417814244766000], ["id","id15",1417814244766000]]},
{"key": "3134","columns": [["","",1417814230711000], ["col2","24",1417814230711000], ["col3","34",1417814230711000], ["id","id14",1417814230711000]]},
{"key": "3132","columns": [["","",1417814207910000], ["col2","5482220b",1417814539434000,"d"], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000], ["id","54822130",1417814320400000,"d"]]},
{"key": "31","metadata": {"deletionInfo": {"markedForDeleteAt":1417814302304000,"localDeletionTime":1417814302}},"columns": []}
]

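The table will stay like this until gc_grace_seconds has elapsed; on the next compaction after that, the deleted data actually disappears. Observe what happens when we lower gc_grace_seconds and then run a compaction: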

cqlsh> ALTER TABLE results.ts1 WITH gc_grace_seconds=500;
cqlsh> exit
[datastax@DSE3 ~]$ ./dse-4.5.1/bin/nodetool compact results;

[datastax@DSE3 ~]$ ./dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-5-Data.db 
[
{"key": "3136","columns": [["","",1417814256390000], ["col2","26",1417814256390000], ["col3","36",1417814256390000], ["id","id16",1417814256390000]]},
{"key": "3133","columns": [["","",1417814218246000], ["col2","23",1417814218246000], ["col3","33",1417814218246000], ["id","id13",1417814218246000]]},
{"key": "3135","columns": [["","",1417814244766000], ["col2","25",1417814244766000], ["col3","35",1417814244766000], ["id","id15",1417814244766000]]},
{"key": "3134","columns": [["","",1417814230711000], ["col2","24",1417814230711000], ["col3","34",1417814230711000], ["id","id14",1417814230711000]]},
{"key": "3132","columns": [["","",1417814207910000], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000]]}
]

Note that the row with key 31 is gone, as are col2 in the row with key 3132 and id in the row with key 3131.
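
The gc_grace_seconds timing in this walkthrough can be sketched in a few lines of Python. This is a simplified model: it assumes a tombstone becomes eligible for purging once its local deletion time plus gc_grace_seconds has passed, and it ignores any other conditions compaction may apply. The function names are illustrative; 864000 is the value in the table schema and 500 is the lowered value used above.

```python
def purgeable_after(local_deletion_time: int, gc_grace_seconds: int) -> int:
    """Earliest epoch second at which a compaction may drop the tombstone (simplified)."""
    return local_deletion_time + gc_grace_seconds


def is_purgeable(local_deletion_time: int, gc_grace_seconds: int, now: int) -> bool:
    """True once the grace period has fully elapsed at time `now`."""
    return now > purgeable_after(local_deletion_time, gc_grace_seconds)


if __name__ == "__main__":
    # localDeletionTime of the row tombstone for key 31 in the output above.
    deleted_at = 1417814302
    print(purgeable_after(deleted_at, 864000))  # with the schema's gc_grace_seconds
    print(purgeable_after(deleted_at, 500))     # after the ALTER TABLE above
```

With the schema's setting the tombstone would be kept for ten days; lowering the value to 500 seconds is what lets the next compaction drop it in this example.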

For clarity, here is my table schema:

cqlsh:results> DESCRIBE TABLE ts1 ;

CREATE TABLE ts1 (
  col1 text,
  col2 text,
  col3 text,
  id text,
  PRIMARY KEY ((col1))
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

As a footnote, the tombstone markers in sstable2json output are as follows:

e - TTL expired

d - deleted value (tombstone)

t - deleted range of values (range tombstone)
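
As a rough illustration, the markers listed here can be picked out of sstable2json output with a small Python script. This sketch assumes the JSON layout seen in this answer (a flag in the fourth position of a column entry, and a metadata.deletionInfo object for a whole-row deletion); the function name and the "row" label are made up for the example.

```python
import json

# Flag meanings as listed in the footnote above.
FLAG_MEANINGS = {
    "e": "TTL expired",
    "d": "deleted value (tombstone)",
    "t": "deleted range of values (range tombstone)",
}


def find_tombstones(rows):
    """Return (key, column_name, kind) tuples for every marker found."""
    found = []
    for row in rows:
        key = row["key"]
        # A whole-row deletion shows up as metadata rather than as a column.
        if row.get("metadata", {}).get("deletionInfo") is not None:
            found.append((key, None, "row"))
        for column in row.get("columns", []):
            # Live columns are [name, value, timestamp]; flagged ones carry a 4th field.
            if len(column) > 3 and column[3] in FLAG_MEANINGS:
                found.append((key, column[0], column[3]))
    return found


# Three rows copied from the compacted sstable shown earlier.
SAMPLE = """[
{"key": "3132","columns": [["","",1417814207910000], ["col2","5482220b",1417814539434000,"d"], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000], ["id","54822130",1417814320400000,"d"]]},
{"key": "31","metadata": {"deletionInfo": {"markedForDeleteAt":1417814302304000,"localDeletionTime":1417814302}},"columns": []}
]"""

if __name__ == "__main__":
    for key, column, kind in find_tombstones(json.loads(SAMPLE)):
        print(key, column, FLAG_MEANINGS.get(kind, "whole-row deletion"))
```

Counting these markers per sstable is one way to get a feel for how many tombstones a given delete pattern leaves behind.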

Regarding "cassandra - What types of tombstones does Cassandra support?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27776337/
