
Cassandra batches with an IF NOT EXISTS condition

Reposted. Author: 搜寻专家. Updated: 2023-10-30 23:03:07

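When I send a batch of inserts to a single table, where every row has a unique key and every statement carries the condition IF NOT EXISTS, the whole batch fails to apply if even one of the rows already exists.

I need the batch to be applied row by row, not as a whole. Suppose I have a table "users" with a single column "user_name" that already contains the row "jhon", and I now try to import new users: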

BEGIN BATCH
INSERT INTO "users" ("user_name") VALUES ('jhon') IF NOT EXISTS;
INSERT INTO "users" ("user_name") VALUES ('mandy') IF NOT EXISTS;
APPLY BATCH;

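It does not insert "mandy" because "jhon" already exists. What can I do to isolate the two inserts from each other?

I have a lot of rows to insert, roughly 100-200K, so I need to use batches.

Thanks!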

Best answer

First: what you describe is documented as the expected behavior:

In Cassandra 2.0.6 and later, you can batch conditional updates introduced as lightweight transactions in Cassandra 2.0. Only updates made to the same partition can be included in the batch because the underlying Paxos implementation works at the granularity of the partition. You can group updates that have conditions with those that do not, but when a single statement in a batch uses a condition, the entire batch is committed using a single Paxos proposal, as if all of the conditions contained in the batch apply.

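This basically confirms it: your updates go to different partitions, so only a single Paxos proposal will be used, which means that either the entire batch succeeds or none of it does.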
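
The all-or-nothing outcome described above can be pictured with a toy in-memory model. This is only an illustration of the semantics, not Cassandra code: the `apply_conditional_batch` function and the `users` set are hypothetical stand-ins for the batch and the table.

```python
def apply_conditional_batch(table, new_keys):
    """Toy model: insert all keys only if none of them already exists."""
    # A single failed IF NOT EXISTS condition rejects the whole batch.
    if any(key in table for key in new_keys):
        return False
    table.update(new_keys)
    return True


# The table already holds "jhon", as in the question.
users = {"jhon"}
applied = apply_conditional_batch(users, ["jhon", "mandy"])
```

In this model `applied` is false and "mandy" is not added, which mirrors what the question reports.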

That said, in Cassandra batches are not meant for speeding up bulk loads; they are intended for creating pseudo-atomic logical operations. From http://docs.datastax.com/en/cql/3.1/cql/cql_using/useBatch.html :

Batches are often mistakenly used in an attempt to optimize performance. Unlogged batches require the coordinator to manage inserts, which can place a heavy load on the coordinator node. If other nodes own partition keys, the coordinator node needs to deal with a network hop, resulting in inefficient delivery. Use unlogged batches when making updates to the same partition key.

The coordinator node might also need to work hard to process a logged batch while maintaining consistency between tables. For example, upon receiving a batch, the coordinator node sends batch logs to two other nodes. In the event of a coordinator failure, the other nodes retry the batch. The entire cluster is affected. Use a logged batch to synchronize tables, as shown in this example:

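In your schema every INSERT goes to a different partition, which puts a lot of extra load on your coordinator.

You can run the 200k inserts through a client that supports asynchronous execution, and they will run quite fast, probably as fast as (or faster than) what you would see with batches.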
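
A minimal sketch of that approach in Python, assuming `session` behaves like a `cassandra.cluster.Session` from the DataStax Python driver (`prepare`, `execute_async`, and `was_applied` on the result). The function name `insert_users_if_absent` and the `window` parameter are hypothetical, chosen here for illustration. Each row is sent as its own conditional INSERT, so an existing "jhon" cannot block "mandy".

```python
from itertools import islice

INSERT_CQL = 'INSERT INTO "users" ("user_name") VALUES (?) IF NOT EXISTS'


def insert_users_if_absent(session, user_names, window=128):
    """Insert each name with its own IF NOT EXISTS statement.

    Assumption: `session` offers prepare(cql) and execute_async(stmt, params),
    and the future's result() exposes a boolean `was_applied`.
    Returns two lists: names that were inserted and names that already existed.
    """
    prepared = session.prepare(INSERT_CQL)
    inserted, skipped = [], []
    names = iter(user_names)
    while True:
        # Take at most `window` names so in-flight requests stay bounded.
        chunk = list(islice(names, window))
        if not chunk:
            break
        # Send one window of independent conditional inserts without waiting.
        futures = [(name, session.execute_async(prepared, (name,)))
                   for name in chunk]
        # Wait for the whole window before sending the next one.
        for name, future in futures:
            if future.result().was_applied:
                inserted.append(name)
            else:
                skipped.append(name)
    return inserted, skipped
```

With a real cluster, `session` would come from something like `Cluster([...]).connect(keyspace)`; the contact points and keyspace are left out because the question does not give them. The window size is a tuning knob, not a recommended value.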

Regarding Cassandra batches with an IF NOT EXISTS condition, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/29909050/
