java - Restarting a Kafka Streams application that shut down without an exception


Reposted by: 行者123 · Updated: 2023-12-02 12:39:05

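I am using Kafka Streams v0.10.2.0 to stream between topics with some simple processing. Recently I ran into a problem: one of the brokers went down, and the Kafka Streams application shut down and stayed down until I restarted it manually. While trying to debug this, I could not work out from the logs what actually caused it. Here is a log excerpt: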

INFO [StreamThread-1] o.a.k.c.c.i.ConsumerCoordinator - Revoking previously assigned partitions [topicname-3, topicname-1, topicname-2] for group streams-group
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] partitions [[topicname-3, topicname-1, topicname-2]] revoked at the beginning of consumer rebalance.
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing a task's topology 0_1
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing a task's topology 0_2
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing a task's topology 0_3
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Flushing state stores of task 0_1
INFO [kafka-coordinator-heartbeat-thread | streams-group] o.a.k.c.c.i.AbstractCoordinator - Marking the coordinator 127.0.0.1:9092 dead for group streams-group
INFO [kafka-coordinator-heartbeat-thread | streams-group] o.a.k.c.c.i.AbstractCoordinator - Discovered coordinator 127.0.0.1:9092 for group streams-group.
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Flushing state stores of task 0_2
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Flushing state stores of task 0_3
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Committing consumer offsets of task 0_1
ERROR [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Failed while executing StreamTask 0_1 due to commit consumer offsets: 
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Updating suspended tasks to contain active tasks [[0_1, 0_2, 0_3]]
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Removing all active tasks [[0_1, 0_2, 0_3]]
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Removing all standby tasks [[]]
ERROR [StreamThread-1] o.a.k.c.c.i.ConsumerCoordinator - User provided listener org.apache.kafka.streams.processor.internals.StreamThread$1 for group streams-group failed on partition revocation
INFO [StreamThread-1] o.a.k.c.c.i.AbstractCoordinator - (Re-)joining group streams-group
INFO [StreamThread-1] o.a.k.c.c.i.AbstractCoordinator - Marking the coordinator dead for group streams-group
INFO [StreamThread-1] o.a.k.c.c.i.AbstractCoordinator - Discovered coordinator for group streams-group.
INFO [StreamThread-1] o.a.k.c.c.i.AbstractCoordinator - (Re-)joining group streams-group
INFO [StreamThread-1] o.a.k.s.p.i.StreamPartitionAssignor - stream-thread [StreamThread-1] Constructed client metadata ...
INFO [StreamThread-1] o.a.k.s.p.i.StreamPartitionAssignor - stream-thread [StreamThread-1] Completed validating internal topics in partition assignor
INFO [StreamThread-1] o.a.k.s.p.i.StreamPartitionAssignor - stream-thread [StreamThread-1] Completed validating internal topics in partition assignor
INFO [StreamThread-1] o.a.k.s.p.i.StreamPartitionAssignor - stream-thread [StreamThread-1] Assigned tasks to clients as {...=[activeTasks: ([0_0, 0_4]) assignedTasks: ([0_0, 0_4]) prevActiveTasks: ([]) prevAssignedTasks: ([]) capacity: 1.0 cost: 0.2], ...=[activeTasks: ([0_1, 0_2, 0_3]) assignedTasks: ([0_1, 0_2, 0_3]) prevActiveTasks: ([]) prevAssignedTasks: ([]) capacity: 1.0 cost: 0.30000000000000004]}.
INFO [StreamThread-1] o.a.k.c.c.i.AbstractCoordinator - Successfully joined group streams-group with generation 17
INFO [StreamThread-1] o.a.k.c.c.i.ConsumerCoordinator - Setting newly assigned partitions [topicname-3, topicname-1, topicname-2] for group streams-group
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] New partitions [[topicname-3, topicname-1, topicname-2]] assigned at the end of consumer rebalance.
INFO [StreamThread-1] o.a.k.s.p.i.StreamTask - task [0_1] Initializing processor nodes of the topology
INFO [StreamThread-1] o.a.k.s.p.i.StreamTask - task [0_2] Initializing processor nodes of the topology
INFO [StreamThread-1] o.a.k.s.p.i.StreamTask - task [0_3] Initializing processor nodes of the topology
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Shutting down
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing a task 0_1
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing a task 0_2
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing a task 0_3
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Flushing state stores of task 0_1
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Flushing state stores of task 0_2
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Flushing state stores of task 0_3
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing the state manager of task 0_1
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing the state manager of task 0_2
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Closing the state manager of task 0_3
INFO [StreamThread-1] o.a.k.c.p.KafkaProducer - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Removing all active tasks [[0_1, 0_2, 0_3]]
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Removing all standby tasks [[]]
INFO [StreamThread-1] o.a.k.s.p.i.StreamThread - stream-thread [StreamThread-1] Stream thread shutdown complete
WARN [StreamThread-1] o.a.k.s.p.i.StreamThread - Unexpected state transition from RUNNING to NOT_RUNNING

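First, it seems unlikely that processing took too long: the processing is very simple, and the application had been running for months without any similar messages in the logs.

Also, judging from the logs, Kafka Streams successfully rejoined the group, but then it suddenly shut down without any exception. I have two streams applications running on different machines, and both shut down at the same time when the broker restarted.

How can I debug this? Shouldn't it at least throw an exception? Another problem is that when the stream thread shuts down, the rest of the application keeps working fine, so it is not restarted automatically. Can I catch this somehow and restart the thread? Because of the retention policy, it is very undesirable for the consumers to get stuck, so how can I make the Kafka Streams application more reliable?

Thanks!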

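Best answer

It is hard to tell from the logs. Maybe debug-level logging would reveal more...

My only "blind guess" is that an error occurred while initializing the processor nodes of the topology. But if there were an exception, it should actually show up in the logs. It could also be a bug in the library.

As for monitoring your application, you have several options: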

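You can register a handler with KafkaStreams#setUncaughtExceptionHandler() to see whether a StreamThread throws an exception that causes the thread to die.
You can register a listener with KafkaStreams#setStateListener() to see whether the app goes into the NOT_RUNNING state. (By the way, there is a known issue with the NOT_RUNNING state in 0.10.2 and 0.11.0, just fixed in trunk: the state might still be RUNNING even if all threads have died, so you should manually monitor the number of threads that are still running.)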
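
The manual thread monitoring suggested in the second option could look roughly like the sketch below. It is plain Java rather than part of the Kafka Streams API, and it rests on one assumption taken from the log excerpt: that stream threads are named with the prefix `StreamThread-`. The class name `StreamThreadWatchdog`, its methods, and the polling period are hypothetical names chosen for illustration, and the restart action is left to the caller.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not a Kafka class: watches JVM threads by name prefix.
class StreamThreadWatchdog {

    // Counts live JVM threads whose name starts with the given prefix.
    // Assumption: stream threads are named "StreamThread-<n>", as in the log above.
    static long countLiveThreads(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(namePrefix))
                .count();
    }

    // Polls every periodMs milliseconds. The first time no matching thread is
    // alive, runs onAllDead once and then stops polling.
    static ScheduledExecutorService watch(String namePrefix, long periodMs, Runnable onAllDead) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stream-watchdog");
            t.setDaemon(true); // do not keep the JVM alive just for monitoring
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            if (countLiveThreads(namePrefix) == 0) {
                try {
                    onAllDead.run();
                } finally {
                    scheduler.shutdown();
                }
            }
        }, periodMs, periodMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a stream thread that has already died.
        Thread fake = new Thread(() -> { }, "StreamThread-1");
        fake.start();
        fake.join();

        // In a real application the callback might close the old KafkaStreams
        // instance and create and start a new one (an assumption, not shown here).
        watch("StreamThread-", 100, () -> System.out.println("No live stream threads left"));
        Thread.sleep(500); // give the watchdog time to fire in this demo
    }
}
```

A watchdog like this only covers the gap described above, where the reported state may stay RUNNING after every thread has died; it would normally be used alongside the exception handler and state listener, not instead of them.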

By the way, I recommend upgrading to 0.10.2.1, which contains several important bug fixes.

Regarding "java - Restarting a Kafka Streams application that shut down without an exception", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45005851/
