Just getting started with Apache Kafka (v3.5.1), using only the console tools for the moment.
So, I've created a topic with 3 partitions,
kafka-topics.sh --topic t_3 --create --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
Then created three consumers in the same group:
kafka-console-consumer.sh --topic t_3 --bootstrap-server localhost:9092 --group group_one
Then created a producer:
kafka-console-producer.sh --topic t_3 --bootstrap-server localhost:9092
But the group isn't balanced the way I expected. When I produce messages, they all go to one consumer. When I stop one of the consumers, messages start going to both remaining consumers; I mean every message is received by both of them.
I thought that in this case each consumer should receive messages only from its own partition, and the producer should write messages to partitions round-robin. What am I doing wrong?
P.S. In the Kafka log I see this phrase when adding a new consumer: rebalance failed due to MemberIdRequiredException. But I can't figure out what I'm doing wrong, because all the docs, answers and videos say that the expected behaviour with the default configuration is just what I thought it should be.
Recommended answer
"producer must write messages to partitions by round-robin"
How old are the docs/videos you're looking at? The sticky partitioner was added around v2.4. For records without a key, the producer now fills a batch for a single partition and only then moves on to another one, rather than round-robining each message. This ultimately results in fewer, larger network requests instead of many small ones spread over the whole cluster. From a consumer's perspective this should be transparent, as all brokers should be considered equal.
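To make the difference concrete, here is a minimal Python sketch that compares per-record round-robin with a simplified "sticky" strategy. It is only a toy model, not Kafka's actual partitioner: the function names and the `records_per_batch` parameter are made up for illustration, the real producer sizes batches in bytes and by time, and it picks the next partition differently than the simple rotation used here.

```python
from collections import Counter


def round_robin_partitions(num_records, num_partitions=3):
    """Old mental model: every record goes to the next partition in turn."""
    return [i % num_partitions for i in range(num_records)]


def sticky_partitions(num_records, records_per_batch, num_partitions=3):
    """Toy sticky model: stay on one partition until a batch is full.

    records_per_batch is a hypothetical stand-in for the producer's real
    batching limits, and rotating to the "next" partition is a simplification.
    """
    return [(i // records_per_batch) % num_partitions for i in range(num_records)]


if __name__ == "__main__":
    typed = 6  # a handful of messages typed into the console producer
    print("round robin:", Counter(round_robin_partitions(typed)))
    print("sticky, large batch:", Counter(sticky_partitions(typed, records_per_batch=100)))
    print("sticky, tiny batch:", Counter(sticky_partitions(typed, records_per_batch=2)))
```

In this model, a few hand-typed messages never fill a batch, so they all stay on the same partition; only with a very small batch do they spread out.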
If data is only produced to one partition (you could check that too), then only the consumer assigned to that partition would read it. Records shouldn't end up in multiple partitions unless the producer has sent multiple batches.
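As a rough illustration of that point, the following Python sketch splits three partitions among group members in contiguous ranges and counts how many records each member would see. It is a simplified model under stated assumptions: `assign_ranges`, `records_per_consumer` and the member names `c1`..`c3` are hypothetical, and the actual assignment in a running group depends on the configured assignor.

```python
def assign_ranges(num_partitions, consumers):
    """Split partitions 0..n-1 into contiguous ranges, one range per consumer.

    Simplified model: members are sorted by name and the first few get one
    extra partition when the count does not divide evenly.
    """
    members = sorted(consumers)
    base, extra = divmod(num_partitions, len(members))
    assignment, start = {}, 0
    for idx, member in enumerate(members):
        size = base + (1 if idx < extra else 0)
        assignment[member] = list(range(start, start + size))
        start += size
    return assignment


def records_per_consumer(record_partitions, assignment):
    """Count records per consumer; each partition has exactly one owner."""
    owner = {p: member for member, parts in assignment.items() for p in parts}
    counts = {member: 0 for member in assignment}
    for partition in record_partitions:
        counts[owner[partition]] += 1
    return counts


if __name__ == "__main__":
    group = assign_ranges(3, ["c1", "c2", "c3"])
    print(group)
    # Five records that all landed in partition 0 (one producer batch).
    print(records_per_consumer([0, 0, 0, 0, 0], group))
```

With every record in partition 0, only the member that owns partition 0 receives anything, even though the assignment itself is perfectly even.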
In other words, consumers in a group are rarely truly "balanced": each record is only ever written to one partition (through that partition's leader), and that partition could have more data sent to it than the others.