spring-boot - Kubernetes sends traffic to the pod even after sending SIGTERM

Reposted by: 行者123 | Updated: 2023-12-02 11:53:42

I have a Spring Boot project configured with graceful shutdown, deployed on Kubernetes 1.12.7. Here are the logs:

2019-07-20 10:23:16.180 INFO [service,,,] 1 --- [ Thread-7] com.jay.util.GracefulShutdown : Received shutdown event
2019-07-20 10:23:16.180 INFO [service,,,] 1 --- [ Thread-7] com.jay.util.GracefulShutdown : Waiting for 30s to finish
2019-07-20 10:23:16.273 INFO [service,fd964ebaa631a860,75a07c123397e4ff,false] 1 --- [io-8080-exec-10] com.jay.resource.ProductResource : GET /products?id=59
2019-07-20 10:23:16.374 INFO [service,9a569ecd8c448e98,00bc11ef2776d7fb,false] 1 --- [nio-8080-exec-1] com.jay.resource.ProductResource : GET /products?id=68
...
2019-07-20 10:23:33.711 INFO [service,1532d6298acce718,08cfb8085553b02e,false] 1 --- [nio-8080-exec-9] com.jay.resource.ProductResource : GET /products?id=209
2019-07-20 10:23:46.181 INFO [service,,,] 1 --- [ Thread-7] com.jay.util.GracefulShutdown : Resumed after hibernation
2019-07-20 10:23:46.216 INFO [service,,,] 1 --- [ Thread-7] o.s.s.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'applicationTaskExecutor'

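The application received SIGTERM from Kubernetes at 10:23:16.180. According to point #5 of Termination of Pods, a terminating pod is removed from the endpoints list of its service. That contradicts what the logs show: requests kept being forwarded to the pod for 17 seconds after SIGTERM was sent (until 10:23:33.711). Is any configuration missing?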

Dockerfile

FROM openjdk:8-jre-slim
MAINTAINER Jay

RUN apt update && apt install -y curl libtcnative-1 gcc && apt clean

ADD build/libs/sample-service.jar /

CMD ["java", "-jar" , "sample-service.jar"]

Graceful shutdown

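// https://github.com/spring-projects/spring-boot/issues/4657
import org.apache.catalina.connector.Connector
import org.slf4j.LoggerFactory
import org.springframework.boot.web.embedded.tomcat.TomcatConnectorCustomizer
import org.springframework.context.ApplicationListener
import org.springframework.context.event.ContextClosedEvent
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

class GracefulShutdown(val waitTime: Long, val timeout: Long) : TomcatConnectorCustomizer, ApplicationListener<ContextClosedEvent> {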

    @Volatile
    private var connector: Connector? = null

    override fun customize(connector: Connector) {
        this.connector = connector
    }

    override fun onApplicationEvent(event: ContextClosedEvent) {

        log.info("Received shutdown event")

        val executor = this.connector?.protocolHandler?.executor
        if (executor is ThreadPoolExecutor) {
            try {
                val threadPoolExecutor: ThreadPoolExecutor = executor

                log.info("Waiting for ${waitTime}s to finish")
                hibernate(waitTime * 1000)

                log.info("Resumed after hibernation")
                this.connector?.pause()

                threadPoolExecutor.shutdown()
                if (!threadPoolExecutor.awaitTermination(timeout, TimeUnit.SECONDS)) {
                    log.warn("Tomcat thread pool did not shut down gracefully within $timeout seconds. Proceeding with forceful shutdown")

                    threadPoolExecutor.shutdownNow()

                    if (!threadPoolExecutor.awaitTermination(timeout, TimeUnit.SECONDS)) {
                        log.error("Tomcat thread pool did not terminate")
                    }
                }
            } catch (ex: InterruptedException) {
                log.info("Interrupted")
                Thread.currentThread().interrupt()
            }
        } else {
            this.connector?.pause()
        }
    }

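    private fun hibernate(time: Long) {
        try {
            Thread.sleep(time)
        } catch (ex: InterruptedException) {
            // Restore the interrupt flag instead of silently swallowing it
            Thread.currentThread().interrupt()
        }
    }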

    companion object {
        private val log = LoggerFactory.getLogger(GracefulShutdown::class.java)
    }
}

@Configuration
class GracefulShutdownConfig(@Value("\${app.shutdown.graceful.wait-time:30}") val waitTime: Long,
                             @Value("\${app.shutdown.graceful.timeout:30}") val timeout: Long) {

    companion object {
        private val log = LoggerFactory.getLogger(GracefulShutdownConfig::class.java)
    }

    @Bean
    fun gracefulShutdown(): GracefulShutdown {

        return GracefulShutdown(waitTime, timeout)
    }

    @Bean
    fun webServerFactory(gracefulShutdown: GracefulShutdown): ConfigurableServletWebServerFactory {

        log.info("GracefulShutdown configured with wait: ${waitTime}s and timeout: ${timeout}s")

        val factory = TomcatServletWebServerFactory()
        factory.addConnectorCustomizers(gracefulShutdown)
        return factory
    }
}

Deployment file

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: service
  name: service
spec:
  progressDeadlineSeconds: 420
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      k8s-app: service
  strategy:
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: service
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - env:
        - name: SPRING_PROFILES_ACTIVE
          value: dev
        image: service:2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 20
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
        name: service
        ports:
        - containerPort: 8080
          protocol: TCP
        readinessProbe:
          failureThreshold: 60
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 100
          periodSeconds: 10
          timeoutSeconds: 5

Update:

I added a custom health check endpoint:

@RestControllerEndpoint(id = "live")
@Component
class LiveEndpoint {

    companion object {
        private val log = LoggerFactory.getLogger(LiveEndpoint::class.java)
    }

    @Autowired
    private lateinit var gracefulShutdownStatus: GracefulShutdownStatus

    @GetMapping
    fun live(): ResponseEntity<Any> {

        val status = if(gracefulShutdownStatus.isTerminating())
            HttpStatus.INTERNAL_SERVER_ERROR.value()
        else
            HttpStatus.OK.value()

        log.info("Status: $status")
        return ResponseEntity.status(status).build()
    }
}
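
The LiveEndpoint above depends on a GracefulShutdownStatus bean that the question does not show. A minimal sketch of what it might look like, assuming it is nothing more than a thread-safe flag that is flipped when the shutdown event arrives. The class body and the markTerminating name are hypothetical; only isTerminating() appears in the original code, and in the real application the class would presumably be registered as a Spring bean.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for the GracefulShutdownStatus bean used by LiveEndpoint.
// Only isTerminating() is taken from the question; everything else is an assumption.
class GracefulShutdownStatus {
    private val terminating = AtomicBoolean(false)

    // Assumed to be called once when shutdown begins,
    // e.g. at the top of GracefulShutdown.onApplicationEvent.
    fun markTerminating() {
        terminating.set(true)
    }

    // Read by the health endpoint on every probe request.
    fun isTerminating(): Boolean = terminating.get()
}

fun main() {
    val status = GracefulShutdownStatus()
    println(status.isTerminating()) // false
    status.markTerminating()
    println(status.isTerminating()) // true
}
```

An AtomicBoolean is used because the flag is written from the shutdown thread and read from Tomcat request threads.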

I changed the livenessProbe:

  livenessProbe:
    httpGet:
      path: /actuator/live
      port: 8080
    initialDelaySeconds: 100
    periodSeconds: 5
    timeoutSeconds: 5
    failureThreshold: 3

Here are the logs after the change:

2019-07-21 14:13:01.431  INFO [service,9b65b26907f2cf8f,9b65b26907f2cf8f,false] 1 --- [nio-8080-exec-2] com.jay.util.LiveEndpoint          : Status: 200
2019-07-21 14:13:01.444  INFO [service,3da259976f9c286c,64b0d5973fddd577,false] 1 --- [nio-8080-exec-3] com.jay.resource.ProductResource   : GET /products?id=52
2019-07-21 14:13:01.609  INFO [service,,,] 1 --- [       Thread-7] com.jay.util.GracefulShutdown      : Received shutdown event
2019-07-21 14:13:01.610  INFO [service,,,] 1 --- [       Thread-7] com.jay.util.GracefulShutdown      : Waiting for 30s to finish
...
2019-07-21 14:13:06.431  INFO [service,002c0da2133cf3b0,002c0da2133cf3b0,false] 1 --- [nio-8080-exec-3] com.jay.util.LiveEndpoint          : Status: 500
2019-07-21 14:13:06.433  INFO [service,072abbd7275103ce,d1ead06b4abf2a34,false] 1 --- [nio-8080-exec-4] com.jay.resource.ProductResource   : GET /products?id=96
...
2019-07-21 14:13:11.431  INFO [service,35aa09a8aea64ae6,35aa09a8aea64ae6,false] 1 --- [io-8080-exec-10] com.jay.util.LiveEndpoint          : Status: 500
2019-07-21 14:13:11.508  INFO [service,a78c924f75538a50,0314f77f21076313,false] 1 --- [nio-8080-exec-2] com.jay.resource.ProductResource   : GET /products?id=110
...
2019-07-21 14:13:16.431  INFO [service,38a940dfda03956b,38a940dfda03956b,false] 1 --- [nio-8080-exec-9] com.jay.util.LiveEndpoint          : Status: 500
2019-07-21 14:13:16.593  INFO [service,d76e81012934805f,b61cb062154bb7f0,false] 1 --- [io-8080-exec-10] com.jay.resource.ProductResource   : GET /products?id=152
...
2019-07-21 14:13:29.634  INFO [service,38a32a20358a7cc4,2029de1ed90e9539,false] 1 --- [nio-8080-exec-6] com.jay.resource.ProductResource   : GET /products?id=191
2019-07-21 14:13:31.610  INFO [service,,,] 1 --- [       Thread-7] com.jay.util.GracefulShutdown      : Resumed after hibernation
2019-07-21 14:13:31.692  INFO [service,,,] 1 --- [       Thread-7] o.s.s.concurrent.ThreadPoolTaskExecutor  : Shutting down ExecutorService 'applicationTaskExecutor'

After 3 failed livenessProbes, Kubernetes kept serving traffic for another 13 seconds, i.e. from 14:13:16.431 to 14:13:29.634.
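
The 13-second figure can be checked directly from the two log timestamps. A small sketch, assuming both timestamps fall on the same day; gapMillis is a hypothetical helper name, not something from the application.

```kotlin
import java.time.Duration
import java.time.LocalTime

// Milliseconds between two HH:mm:ss.SSS log timestamps on the same day.
fun gapMillis(from: String, to: String): Long =
    Duration.between(LocalTime.parse(from), LocalTime.parse(to)).toMillis()

fun main() {
    // Third failed liveness probe -> last request served before shutdown resumed
    println(gapMillis("14:13:16.431", "14:13:29.634")) // 13203
}
```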

Update 2: sequence of events (thanks to Eamonn McEvoy)

seconds | healthy | events
   0    |    ✔    |   * liveness probe healthy
   1    |    ✔    |   - SIGTERM
   2    |    ✔    |   
   3    |    ✔    |   
   4    |    ✔    |   
   5    |    ✔    |   * liveness probe unhealthy (1/3)
   6    |    ✔    |   
   7    |    ✔    |   
   8    |    ✔    |   
   9    |    ✔    |   
   10   |    ✔    |   * liveness probe unhealthy (2/3)
   11   |    ✔    |   
   12   |    ✔    |   
   13   |    ✔    |   
   14   |    ✔    |   
   15   |    ✘    |   * liveness probe unhealthy (3/3)
   ..   |    ✔    |   * traffic is served       
   28   |    ✔    |   * traffic is served
   29   |    ✘    |   * pod restarts

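Best answer

SIGTERM does not put the pod into the terminating state immediately. You can see in the logs that your application begins graceful shutdown at 10:23:16.180 and takes more than 20 seconds to complete. Only at that point does the container stop, and the pod can enter the terminating state.

As far as Kubernetes is concerned, the pod looks fine during the graceful shutdown period. You need to add a liveness probe to your deployment; when it becomes unhealthy, the traffic will stop.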

livenessProbe:
  httpGet:
    path: /actuator/health
    port: 8080
  initialDelaySeconds: 100
  periodSeconds: 10
  timeoutSeconds: 5

Update:

This is because you have a failure threshold of 3, so you are allowing traffic for up to 15 seconds after the SIGTERM;

e.g.

seconds | healthy | events
   0    |    ✔    |   * liveness probe healthy
   1    |    ✔    |   - SIGTERM
   2    |    ✔    |   
   3    |    ✔    |   
   4    |    ✔    |   
   5    |    ✔    |   * liveness probe issued
   6    |    ✔    |       .
   7    |    ✔    |       .
   8    |    ✔    |       .
   9    |    ✔    |       .
   10   |    ✔    |   * liveness probe timeout - unhealthy (1/3)
   11   |    ✔    |   
   12   |    ✔    |   
   13   |    ✔    |   
   14   |    ✔    |   
   15   |    ✔    |   * liveness probe issued
   16   |    ✔    |       .
   17   |    ✔    |       .
   18   |    ✔    |       .
   19   |    ✔    |       .
   20   |    ✔    |   * liveness probe timeout - unhealthy (2/3)
   21   |    ✔    |   
   22   |    ✔    |   
   23   |    ✔    |   
   24   |    ✔    |   
   25   |    ✔    |   * liveness probe issued
   26   |    ✔    |       .
   27   |    ✔    |       .
   28   |    ✔    |       .
   29   |    ✔    |       .
   30   |    ✘    |   * liveness probe timeout - unhealthy (3/3)
        |         |   * pod restarts

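This assumes that the endpoint returns an unhealthy response during the graceful shutdown. Since you have timeoutSeconds: 5, it will take much longer if the probe simply times out, with a 5-second delay between issuing a liveness probe request and receiving its response. It could be that the container actually dies before the liveness failure threshold is reached, and you are still seeing the original behaviour.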
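
The two cases described in this answer reduce to simple arithmetic on the probe settings. The sketch below models them under stated assumptions: SIGTERM arrives just after a successful probe, and in the timeout case the next probe is issued one full period after the previous one timed out, as in the timeline above. It mirrors this answer's reasoning rather than the kubelet's exact scheduling, and the Probe class and function names are hypothetical.

```kotlin
// Field names mirror the livenessProbe settings from the question.
data class Probe(val periodSeconds: Int, val timeoutSeconds: Int, val failureThreshold: Int)

// Endpoint answers immediately with an error status:
// one failure per period until the threshold is reached.
fun fastFailWindowSeconds(p: Probe): Int =
    p.periodSeconds * p.failureThreshold

// Endpoint hangs until the probe times out, and the next probe is assumed
// to start one period after that timeout (the model used in the timeline).
fun timeoutWindowSeconds(p: Probe): Int =
    (p.periodSeconds + p.timeoutSeconds) * p.failureThreshold

fun main() {
    val probe = Probe(periodSeconds = 5, timeoutSeconds = 5, failureThreshold = 3)
    println(fastFailWindowSeconds(probe)) // 15
    println(timeoutWindowSeconds(probe))  // 30
}
```

Under this model, lowering failureThreshold or periodSeconds shortens the window in which traffic can still reach the pod.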

Regarding "spring-boot - Kubernetes sends traffic to the pod even after sending SIGTERM", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57122081/
