java - Cassandra delete sometimes does not work; data can still be selected after the delete

Tags: java cassandra datastax cassandra-2.0 datastax-java-driver

I have two tables:

CREATE TABLE IF NOT EXISTS QueueBucket (
    queueName   text,
    bucketId    int,
    scheduledMinute timestamp,
    scheduledTime timestamp,
    messageId   uuid,
    PRIMARY KEY ((queueName, bucketId, scheduledMinute), scheduledTime, messageId)
)  WITH compaction = { 'class' :  'LeveledCompactionStrategy'  } AND speculative_retry='NONE' ;

CREATE TABLE IF NOT EXISTS InDelivery (
    queueName       text,
    nodeId        uuid,
    dequeuedMinute    timestamp,
    messageId       uuid,
    bucketid        int,
    dequeuedTime    timestamp,
    PRIMARY KEY ((queueName, nodeId,bucketId, dequeuedMinute),dequeuedTime, messageId)
);

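In my code I insert into QueueBucket and delete from InDelivery in a (logged) batch. During load testing, the insert into QueueBucket works, but the delete from InDelivery sometimes does not. To confirm this, I read from InDelivery immediately after the delete, looking for the deleted messageId, and print a WARN log if the messageId is still there.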

    queueDao.insertMsgInfo(queueName, bucketId, QueueUtils.getMinute(scheduledTime), scheduledTime, messageId);
    queueDao.deleteInDelivery(queueName, nodeId, bucketId, bucketMinute, dequeuedTime, messageId);
    if(queueServiceMetaDao.hasIndeliveryMessage(inDeliveryPK)) {
        log.warn("messageId  {} of queue {} bucket {} with node {} dequuedTime {} dequeud minute {} could not get deleted from indelivery.",
                messageId,queueName,bucketId, nodeId,QueueUtils.dateToString(dequeuedTime),QueueUtils.dateToString(bucketMinute));
    }

In the insertMsgInfo and deleteInDelivery methods I am reusing prepared statements:

"INSERT INTO queuebucket (queuename, bucketid , scheduledminute, scheduledtime, messageid ) VALUES ( ? , ? , ? , ? , ? );"
"DELETE FROM indelivery WHERE queuename = ? AND nodeId = ? AND bucketId=? AND dequeuedMinute=? AND dequeuedTime =? AND messageId=? ;"

In hasIndeliveryMessage I pass the same values that I passed when deleting the data in the moveBackToQueueBucket method, wrapped in inDeliveryPrimaryKey:

"SELECT messageId FROM indelivery WHERE queuename = ? AND nodeId = ? AND bucketId=? AND dequeuedMinute=? AND dequeuedTime=? AND messageId=? ;"

I don't know why I am seeing multiple "could not get deleted from indelivery" warning messages. Please help.

I am using Cassandra 2.2.7 on a 6-node cluster with replication factor 5. Both reads and writes use QUORUM consistency.

I have also gone through the question "Cassandra - deleted data still there" and https://issues.apache.org/jira/browse/CASSANDRA-7810, but that issue was fixed long ago, in 2.0.11.

Further update: following "Cassandra - Delete not working", I also ran nodetool repair, but the problem persists. Should I run a compaction as well?

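Further update: I am no longer using a batch. I simply insert into QueueBucket, delete from InDelivery, and then read the data, but the problem persists.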

Adding some logs:

2016-07-19 20:39:42,440[http-nio-8014-exec-12]INFO  QueueDaoImpl -deleting from indelivery queueName pac01_deferred nodeid 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f bucketId 382 dequeuedMinute 20160719203900000 dequeuedTime 20160719203942310 messageId cc4fb158-f61e-345b-8dcf-3f842fe52d50:
2016-07-19 20:39:42,442[http-nio-8014-exec-12]INFO  QueueDaoImpl -Reading from indelivery : queue pac01_deferred nodeId 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f dequeueMinute 20160719203900000 dequeueTime 20160719203942310 messageid cc4fb158-f61e-345b-8dcf-3f842fe52d50 bucketId 382 indeliveryRow Row[cc4fb158-f61e-345b-8dcf-3f842fe52d50]
2016-07-19 20:39:42,442[http-nio-8014-exec-12]WARN  QueueImpl -messageId  cc4fb158-f61e-345b-8dcf-3f842fe52d50 of queue pac01_deferred bucket 382 with node 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f dequuedTime 20160719203942310 dequeud minute 20160719203900000 could not get deleted from indelivery .

Should I try consistency level ALL?
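
As a side note on the consistency question, the quorum arithmetic for the cluster described above (replication factor 5, QUORUM for both reads and writes) can be checked with a small sketch. The class and method names are hypothetical and only illustrate the standard rule that a read overlaps the latest write when read replicas + write replicas > replication factor.

```java
public class QuorumCheck {
    // QUORUM in Cassandra is floor(RF / 2) + 1 replicas.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // A read is guaranteed to touch at least one replica that acknowledged
    // the write when the two replica sets must overlap.
    static boolean readOverlapsWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 5; // replication factor stated in the question
        int q = quorum(rf);
        System.out.println("QUORUM replicas: " + q);
        System.out.println("QUORUM read overlaps QUORUM write: " + readOverlapsWrite(q, q, rf));
    }
}
```

With RF 5, QUORUM is 3 replicas, and 3 + 3 > 5, so a QUORUM read already overlaps a QUORUM write. That suggests, without proving it, that switching to ALL is unlikely to change the behavior by itself.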

Best answer

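First of all, using Cassandra to back a queue or a queue-like structure is a known anti-pattern. If your queue handles high throughput, you will end up fighting tombstones and degraded query performance.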

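As for your actual problem, I have seen this happen before with models that use timestamps as keys. How are you creating the timestamp values for dequeuedMinute and dequeuedTime?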

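If you are putting the timestamps together yourself, deleting by them should be straightforward. But if you are creating them with dateOf(now()) or java.util.Date, your timestamps will be stored with millisecond values, even though cqlsh hides this from you: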

INSERT INTO InDelivery (queuename, nodeid, bucketid , dequeuedMinute, dequeuedTime, messageid )
VALUES ('test1',uuid(),2112,dateof(now()),dateof(now()),uuid());

INSERT INTO InDelivery (queuename, nodeid, bucketid , dequeuedMinute, dequeuedTime, messageid )
VALUES ('test1',a24e056a-94fa-4aee-b3a7-a8df6060091a,2112,'2016-07-19 09:57:16-0500','2016-07-19 09:57:16-0500',uuid());

SELECT queuename,nodeid,dequeuedMinute,blobasbigint(timestampasblob(dequeuedMinute)),             
dequeuedTime,blobasbigint(timestampasblob(dequeuedTime)),messageid
FROM InDelivery;

 queuename | nodeid                               | dequeuedMinute                | blobasbigint(timestampasblob(dequeuedMinute)) | dequeuedTime             | blobasbigint(timestampasblob(dequeuedTime)) | messageid
-----------|--------------------------------------+-------------------------------+-----------------------------------------------+--------------------------+--------------------------------------+---------------------------------------------
     test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a | 2112 2016-07-19 09:57:16-0500 |                                 1468940236000 | 2016-07-19 09:57:16-0500 |                               1468940236000 | 7ca1f676-9034-45ba-bb3f-377ba74cc5c0
     test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a | 2112 2016-07-19 09:57:16-0500 |                                 1468940236641 | 2016-07-19 09:57:16-0500 |                               1468940236641 | 9721d96e-d6f5-43a7-9ba4-18ef4d54ab8a
(2 rows)

Those timestamps look the same, right? But applying the nested blobasbigint(timestampasblob()) functions reveals the difference (000 vs. 641 milliseconds).
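
The same effect can be reproduced on the Java side. The following is a minimal sketch, not the asker's code: the class name and the formatSeconds helper are hypothetical, and the two epoch values are the ones shown in the cqlsh output above. It formats two java.util.Date values at second precision, the way cqlsh displays them here, and then compares them as values.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class HiddenMillis {
    // Hypothetical helper: formats at second precision, so any
    // millisecond component is dropped from the displayed text.
    static String formatSeconds(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ssZ");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT-05:00"));
        return fmt.format(d);
    }

    public static void main(String[] args) {
        Date withoutMillis = new Date(1468940236000L);
        Date withMillis = new Date(1468940236641L);

        // Both lines print the same formatted text but different epoch values.
        System.out.println(formatSeconds(withoutMillis) + " -> " + withoutMillis.getTime());
        System.out.println(formatSeconds(withMillis) + " -> " + withMillis.getTime());

        // As key values they are different, so they address different rows.
        System.out.println("equal as keys? " + withoutMillis.equals(withMillis)); // false
    }
}
```

Both values format as 2016-07-19 09:57:16-0500, yet they are not equal, which is why a key built from one cannot match a row written with the other.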

Note that if I change the SELECT to filter on the 641 milliseconds (the last three digits in the blobasbigint(timestampasblob()) columns), I get back the row that contains the milliseconds.

SELECT queuename,nodeid,dequeuedMinute,blobasbigint(timestampasblob(dequeuedMinute)),             
dequeuedTime,blobasbigint(timestampasblob(dequeuedTime)),messageid
FROM InDelivery
WHERE queuename='test1' AND bucketid=2112 
AND nodeid=a24e056a-94fa-4aee-b3a7-a8df6060091a
AND dequeuedMinute='2016-07-19 09:57:16.641-0500';

 queuename | nodeid                               | dequeuedMinute                | blobasbigint(timestampasblob(dequeuedMinute)) | dequeuedTime             | blobasbigint(timestampasblob(dequeuedTime)) | messageid
-----------|--------------------------------------+-------------------------------+-----------------------------------------------+--------------------------+--------------------------------------+---------------------------------------------
     test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a | 2112 2016-07-19 09:57:16-0500 |                                 1468940236641 | 2016-07-19 09:57:16-0500 |                               1468940236641 | 9721d96e-d6f5-43a7-9ba4-18ef4d54ab8a
(1 rows)

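The bottom line is that if you are going to store milliseconds in your timestamp keys, you also need to include them when you SELECT/DELETE by those keys. Likewise, if you are not storing milliseconds in your timestamp keys, you must not include them when you SELECT/DELETE by those keys.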
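
One way to keep the two sides consistent is to normalize each timestamp once and bind that same value to the INSERT, the DELETE, and the SELECT. The sketch below assumes Java 8 and uses hypothetical helper names (truncateToSeconds, truncateToMinute). It is not the implementation of the QueueUtils.getMinute method from the question, which is not shown there.

```java
import java.time.temporal.ChronoUnit;
import java.util.Date;

public class TimestampKeys {
    // Hypothetical helper: drops the millisecond component.
    static Date truncateToSeconds(Date d) {
        return Date.from(d.toInstant().truncatedTo(ChronoUnit.SECONDS));
    }

    // Hypothetical helper: drops seconds and milliseconds, e.g. for a
    // minute-bucket partition key column such as dequeuedMinute.
    static Date truncateToMinute(Date d) {
        return Date.from(d.toInstant().truncatedTo(ChronoUnit.MINUTES));
    }

    public static void main(String[] args) {
        Date raw = new Date(1468940236641L); // value taken from the cqlsh output above

        // Compute the key values once, then reuse these exact objects
        // for every statement that addresses the row.
        Date dequeuedTime = truncateToSeconds(raw);
        Date dequeuedMinute = truncateToMinute(raw);

        System.out.println(raw.getTime());            // 1468940236641
        System.out.println(dequeuedTime.getTime());   // 1468940236000
        System.out.println(dequeuedMinute.getTime()); // 1468940220000
    }
}
```

Keeping the full millisecond value everywhere works just as well. What matters is that the value bound when deleting or selecting is identical to the one bound when the row was written.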

Regarding "java - Cassandra delete sometimes does not work; data can still be selected after the delete", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38453016/
