
apache-kafka - What is the difference between kappa-architecture and lambda-architecture?

Reposted. Author: 行者123. Updated: 2023-12-01 08:51:03

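If the Kappa architecture analyzes the stream directly instead of splitting the data into two streams, where is the data stored: in a messaging system like Kafka? Or can it be kept in a database for recomputation?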

Is a separate batch layer faster than recomputing with a stream processing engine for batch analytics?

Best answer

"A very simple case to consider is when the algorithms applied to the real-time data and to the historical data are identical. Then it is clearly very beneficial to use the same code base to process historical and real-time data, and therefore to implement the use-case using the Kappa architecture".

"Now, the algorithms used to process historical data and real-time data are not always identical. In some cases, the batch algorithm can be optimized thanks to the fact that it has access to the complete historical dataset, and then outperform the implementation of the real-time algorithm. Here, choosing between Lambda and Kappa becomes a choice between favoring batch execution performance over code base simplicity".

"Finally, there are even more complex use-cases, in which even the outputs of the real-time and batch algorithm are different. For example, a machine learning application where generation of the batch model requires so much time and resources that the best result achievable in real-time is computing and approximated updates of that model. In such cases, the batch and real-time layers cannot be merged, and the Lambda architecture must be used".



Quote
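
To make the first case in the quote concrete (identical algorithms for historical and real-time data), here is a minimal Python sketch of the Kappa idea. It assumes the events are kept in a retained, append-only log, which is the role a Kafka topic with long retention would typically play. `EventLog`, `count_words`, and the word-count task are hypothetical names chosen for illustration, not a Kafka API.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Optional


class EventLog:
    """Hypothetical in-memory stand-in for a retained, append-only log."""

    def __init__(self) -> None:
        self._events: List[str] = []

    def append(self, event: str) -> int:
        """Append an event and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int) -> Iterator[str]:
        """Yield the events stored at `offset` and later."""
        yield from self._events[offset:]


def count_words(events: Iterable[str], state: Optional[Counter] = None) -> Counter:
    """The single algorithm, used both for live events and for full replays."""
    counts = Counter() if state is None else state
    for event in events:
        counts.update(event.split())
    return counts


log = EventLog()
log.append("a b")
log.append("b c")

# Live path: process what has arrived so far and keep the running state.
live_view = count_words(log.read_from(0))
last_offset = log.append("c c")
live_view = count_words(log.read_from(last_offset), live_view)

# Reprocessing path: replay the whole log from offset 0 with the same code.
replayed_view = count_words(log.read_from(0))

print(live_view == replayed_view)
```

In this sketch the "batch" result is simply a replay of the log through the streaming code, so there is only one implementation to maintain. The cost is that a full replay has to read every retained event again.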

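Lambda-Architecture
  • Separate batch and stream layers
  • Higher code complexity
  • Faster performance with separate batch and stream implementations
  • Better suited to cases where batch and stream use different algorithms
  • Cheaper when batch computation reads from a data store instead of a database

Kappa-Architecture
  • Only a stream processing layer
  • Easier to maintain and lower complexity, with a single algorithm for batch and stream
  • Expensive with too much data if batch recomputation runs from a database
  • Slower processing with too much data if batch recomputation runs from a database or from Kafka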
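
For contrast, the Lambda side of the comparison can be sketched as two separate implementations of the same count whose results are merged at query time. This is a rough Python illustration only: `master_dataset`, `recent_events`, `build_batch_view`, `SpeedLayer`, and `query` are hypothetical names, and in-memory lists stand in for the batch data store and the stream.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical data: master_dataset holds everything up to the last batch run,
# recent_events arrived after that run.
master_dataset: List[str] = ["a b", "b c"]
recent_events: List[str] = ["c c"]


def build_batch_view(dataset: List[str]) -> Dict[str, int]:
    """Batch layer: sees the complete dataset and can process it in one pass."""
    return dict(Counter(" ".join(dataset).split()))


class SpeedLayer:
    """Speed layer: a second, incremental implementation of the same count."""

    def __init__(self) -> None:
        self.view: Dict[str, int] = {}

    def on_event(self, event: str) -> None:
        for word in event.split():
            self.view[word] = self.view.get(word, 0) + 1


def query(word: str, batch_view: Dict[str, int], speed_view: Dict[str, int]) -> int:
    """Serving layer: merge the batch view and the real-time view."""
    return batch_view.get(word, 0) + speed_view.get(word, 0)


batch_view = build_batch_view(master_dataset)
speed = SpeedLayer()
for event in recent_events:
    speed.on_event(event)

print(query("c", batch_view, speed.view))
```

The two code paths (`build_batch_view` and `SpeedLayer.on_event`) must be kept in agreement by hand, which is where the higher code complexity comes from. In exchange, the batch path is free to use an algorithm that benefits from seeing the whole dataset.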

Regarding "apache-kafka - What is the difference between kappa-architecture and lambda-architecture", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41967295/
