apache-kafka - Can messages be lost in Kafka even if the producer received an acknowledgement?

Reposted. Author: 行者123. Updated: 2023-12-04 03:58:31

The Kafka doc says:

  • Kafka relies heavily on the filesystem for storing and caching messages.
  • A modern operating system provides read-ahead and write-behind techniques that prefetch data in large block multiples and group smaller logical writes into large physical writes.
  • Modern operating systems have become increasingly aggressive in their use of main memory for disk caching. A modern OS will happily divert all free memory to disk caching with little performance penalty when the memory is reclaimed. All disk reads and writes will go through this unified cache.
  • ...rather than maintain as much as possible in-memory and flush it all out to the filesystem in a panic when we run out of space, we invert that. All data is immediately written to a persistent log on the filesystem without necessarily flushing to disk. In effect this just means that it is transferred into the kernel's pagecache.
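
The difference between "written to the filesystem" and "flushed to disk" in the last point can be illustrated with a minimal Python sketch. It is an analogy only, not Kafka's code: the append_record helper and the segment.log file name are hypothetical. A plain write returns once the bytes are in the kernel's pagecache, while an explicit fsync waits until the OS reports them flushed to the device.

```python
import os
import tempfile


def append_record(fd: int, record: bytes, durable: bool = False) -> int:
    """Append a record to an open file; optionally force it to stable storage.

    Hypothetical helper for illustration, not part of Kafka.
    """
    # os.write returns as soon as the bytes are handed to the kernel,
    # i.e. they may exist only in the pagecache at this point.
    written = os.write(fd, record)
    if durable:
        # os.fsync blocks until the OS reports the file's data as flushed.
        os.fsync(fd)
    return written


# Hypothetical log segment file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "segment.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
try:
    n_cached = append_record(fd, b"message-1\n")                # pagecache only
    n_synced = append_record(fd, b"message-2\n", durable=True)  # write + fsync
finally:
    os.close(fd)

# Both records are readable right away, because reads are served from the same cache.
with open(path, "rb") as f:
    contents = f.read()
```

Both records are visible to readers immediately, but only a flush would guarantee that they survive a power loss on this one machine. That gap is what the question below is about.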


Furthermore, this article says:

(3) a message is ‘committed’ when all in sync replicas have applied it to their log, and (4) any committed message will not be lost, as long as at least one in sync replica is alive.



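So even if I configure the producer with acks=all (which causes the producer to receive an acknowledgement only after all brokers have committed the message) and the producer receives an acknowledgement for a certain message, does that mean the message can still be lost, especially if all brokers go down and the OS never flushes the committed message from the page cache to disk?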

Best answer

With acks=all and a topic replication factor > 1, it is still possible to lose acknowledged messages, but it is very unlikely.

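For example, if you have 3 replicas (all of them in sync) and use acks=all, you would need to lose all 3 brokers at the same time, before any of them has had time to actually write the message to disk. With acks=all, the acknowledgement is sent once all in-sync replicas have received the message; you can keep that number high by setting, for example, min.insync.replicas=2.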
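
How acks=all and min.insync.replicas interact can be sketched as a small Python model. This is a simplified illustration, not broker code: the Partition class and produce function are hypothetical names, and the model only covers the check that a produce request with acks=all is rejected when the in-sync replica set is smaller than min.insync.replicas.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """Hypothetical, simplified view of one topic partition."""
    replication_factor: int
    min_insync_replicas: int
    in_sync: int  # current size of the in-sync replica set, leader included


def produce(partition: Partition, acks: str) -> str:
    """Return the outcome of a produce request in this simplified model."""
    if acks == "all":
        # With acks=all the write is refused when too few replicas are in sync.
        if partition.in_sync < partition.min_insync_replicas:
            return "rejected: not enough in-sync replicas"
        # Otherwise the ack is sent once every in-sync replica has the message.
        return f"acked after {partition.in_sync} replica(s) received the message"
    if acks == "1":
        # Only the leader has to receive the message.
        return "acked after 1 replica(s) received the message"
    # acks=0: the producer does not wait for any acknowledgement.
    return "not acked"


healthy = Partition(replication_factor=3, min_insync_replicas=2, in_sync=3)
degraded = Partition(replication_factor=3, min_insync_replicas=2, in_sync=1)

healthy_result = produce(healthy, acks="all")
degraded_result = produce(degraded, acks="all")
```

In this model an acknowledged message always exists on at least min.insync.replicas brokers, though possibly only in each broker's page cache, which is why losing it requires all of those brokers to fail together.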

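You can reduce the likelihood of this scenario even further by using the rack awareness feature (provided, obviously, that the brokers are physically located in different racks or, even better, different data centers).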
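
As a rough back-of-the-envelope illustration of why more in-sync replicas help, the following Python sketch assumes that broker failures are independent, which is what placing brokers in different racks or data centers tries to approximate. The function name and the failure probability are hypothetical, not measured values.

```python
def p_lose_acked_message(p_broker_failure: float, in_sync_replicas: int) -> float:
    """Probability that every in-sync replica fails before flushing to disk.

    Assumes each broker fails independently with probability p_broker_failure
    during the window between acknowledgement and flush.
    """
    if not 0.0 <= p_broker_failure <= 1.0:
        raise ValueError("p_broker_failure must be between 0 and 1")
    if in_sync_replicas < 1:
        raise ValueError("in_sync_replicas must be at least 1")
    # Independent events: the probabilities multiply.
    return p_broker_failure ** in_sync_replicas


# Hypothetical per-broker failure probability for one flush window.
p = 0.001
risk_by_replicas = {n: p_lose_acked_message(p, n) for n in (1, 2, 3)}
```

Under this assumption each additional in-sync replica multiplies the risk by p. Brokers that share a rack or a power supply fail together more often than the independence assumption allows, so the real risk for co-located brokers is higher than this estimate.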

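To summarize: by combining all of these options, you can reduce the likelihood of losing data to the point where it is unlikely ever to happen.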

Regarding "apache-kafka - Can messages be lost in Kafka even if the producer received an acknowledgement?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57987591/
