K8s: Keeping Applications Stable and Always Online

2026/03/09

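Some user-facing applications must stay online at all times, even during a release or maintenance.

Deployment strategy

The RollingUpdate strategy makes sure new pods are ready before old pods are terminated, so the application stays available throughout the deployment.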

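kind: Deployment
spec:
    replicas: 2
    strategy:
        type: RollingUpdate
        rollingUpdate:
            maxSurge: 1 # allow one extra pod temporarily, which speeds up the rollout while keeping resource usage bounded
            maxUnavailable: 0 # never take a pod down before its replacement is ready, so traffic keeps being served during the update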
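
To make the effect of these two settings concrete, here is a minimal Java sketch of the arithmetic behind a rolling update, assuming maxSurge and maxUnavailable are given as absolute numbers rather than percentages. The class and method names (RollingUpdateBounds, maxTotalPods, minAvailablePods) are illustrative and not part of any Kubernetes API.

```java
// Illustrative helper, not a Kubernetes API: computes the pod-count bounds
// that apply during a RollingUpdate when maxSurge and maxUnavailable are
// absolute numbers.
final class RollingUpdateBounds {

    // Upper bound on the number of pods (old + new) that may exist at once.
    static int maxTotalPods(int replicas, int maxSurge) {
        return replicas + maxSurge;
    }

    // Lower bound on the number of pods that must stay available.
    static int minAvailablePods(int replicas, int maxUnavailable) {
        return Math.max(0, replicas - maxUnavailable);
    }

    public static void main(String[] args) {
        // Values from the Deployment above: replicas 2, maxSurge 1, maxUnavailable 0.
        System.out.println("max total pods:     " + maxTotalPods(2, 1));
        System.out.println("min available pods: " + minAvailablePods(2, 0));
    }
}
```

With the values used here, at most 3 pods exist at the same time and at least 2 stay available, which is why the rollout never drops below full serving capacity.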

.....................

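Best practices and strategies

Do not interrupt application traffic while a new version is being deployed.
Use readiness and liveness probes, so that Kubernetes only routes traffic to healthy pods.
Configure graceful shutdown, so that a pod exits only after its in-flight requests have finished.
Use load testing to verify the settings, and measure how long the application is unavailable during a deployment.

Readiness and liveness settings

Probes let Kubernetes monitor pod availability, so that requests are not sent to a server that cannot handle them promptly.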

kind: Deployment
spec:
    template:
        spec:
            containers:
                - name: appname
                  # ..........
                  livenessProbe:
                      failureThreshold: 3
                      httpGet:
                          path: /actuator/health/liveness
                          port: 8080
                          scheme: HTTP
                      initialDelaySeconds: 100
                      periodSeconds: 10
                      successThreshold: 1
                      timeoutSeconds: 30
                  readinessProbe:
                      failureThreshold: 3
                      httpGet:
                          path: /actuator/health/readiness
                          port: 8080
                          scheme: HTTP
                      initialDelaySeconds: 30
                      periodSeconds: 10
                      successThreshold: 1
                      timeoutSeconds: 20

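Key settings:

The readiness probe removes pods that are not ready from the Service endpoints, so user requests never land on a pod that cannot handle them.
The liveness probe automatically restarts containers that hang or stop responding, so the running instances stay healthy.
Tune initialDelaySeconds to your application's startup time: set the readiness probe delay to match the typical startup duration, and make the liveness probe delay slightly longer, so the container is not restarted while it is still initializing.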
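
The difference between the two probes can be sketched with a tiny Java program that exposes one endpoint per probe. This is only an illustration built on the JDK's bundled com.sun.net.httpserver server, not how Spring Boot implements its health groups; the class name ProbeEndpoints, the paths /health/liveness and /health/readiness, and the markReady() method are assumptions made for the example.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical minimal application, used only to illustrate the two probe types.
final class ProbeEndpoints {
    private final AtomicBoolean ready = new AtomicBoolean(false);
    private final HttpServer server;

    // Passing port 0 lets the operating system pick a free port.
    ProbeEndpoints(int port) {
        try {
            server = HttpServer.create(new InetSocketAddress(port), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Liveness: answers 200 as long as the process can respond at all.
        server.createContext("/health/liveness", ex -> respond(ex, 200));
        // Readiness: answers 503 until initialization has finished.
        server.createContext("/health/readiness", ex -> respond(ex, ready.get() ? 200 : 503));
        server.start();
    }

    void markReady() { ready.set(true); }
    int port() { return server.getAddress().getPort(); }
    void stop() { server.stop(0); }

    private static void respond(HttpExchange ex, int status) throws IOException {
        ex.sendResponseHeaders(status, -1); // -1 means "no response body"
        ex.close();
    }

    // Returns the HTTP status of a GET request, or -1 if the request failed.
    static int status(int port, String path) {
        try {
            HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + path)).GET().build();
            return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (IOException | InterruptedException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        ProbeEndpoints app = new ProbeEndpoints(0);
        System.out.println("liveness while starting:  " + status(app.port(), "/health/liveness"));
        System.out.println("readiness while starting: " + status(app.port(), "/health/readiness"));
        app.markReady(); // for example once caches are warm and connections are open
        System.out.println("readiness once ready:     " + status(app.port(), "/health/readiness"));
        app.stop();
    }
}
```

While the application is starting, the liveness endpoint already answers 200 but the readiness endpoint answers 503, so Kubernetes would keep the container running without sending it traffic yet.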

If you run a Spring Boot application, you can use the actuator health endpoints, as shown above:

path = /actuator/health/readiness (readiness probe)
path = /actuator/health/liveness (liveness probe)

Maven dependency

groupId = org.springframework.boot, artifactId = spring-boot-starter-actuator

Required settings in application.properties:
management.endpoint.health.probes.enabled = true
management.health.livenessState.enabled = true
management.health.readinessState.enabled = true

These properties separate liveness from readiness, which lets Kubernetes distinguish "the application has started" from "the application can handle traffic".

If you serve a ReactJS application with nginx, you can use the home page as the health endpoint:
path = /

Graceful shutdown

Graceful shutdown lets in-flight requests finish before the pod is terminated, so active users do not run into connection errors.

kind: Deployment
spec:
    template:
        spec:
            terminationGracePeriodSeconds: 60
            containers:
                - name: appname
                  # .............
                  lifecycle:
                      preStop:
                          exec:
                              command: ["/bin/sh", "-c", "sleep 10"]

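Key points:

The preStop hook delays termination, which gives the load balancer and the Service time to deregister the pod.
terminationGracePeriodSeconds defines the maximum time allowed for graceful shutdown before the pod is forcibly killed.
The application should handle the SIGTERM signal: stop accepting new requests while finishing the ones already in progress.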
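
For an application without a framework that does this for you, SIGTERM handling can be sketched in plain Java as follows. The JVM runs registered shutdown hooks when it receives SIGTERM, so the hook is the place to stop accepting work and wait for in-flight requests. The GracefulDrain class and its methods are hypothetical names for this example, and the 30 second timeout is an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: tracks in-flight requests so shutdown can wait for them.
final class GracefulDrain {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final AtomicInteger inFlight = new AtomicInteger(0);

    // Call when a request arrives. Returns false once shutdown has started,
    // in which case the caller should reject the request (for example with 503).
    boolean tryBegin() {
        inFlight.incrementAndGet();
        if (shuttingDown.get()) {
            inFlight.decrementAndGet();
            return false;
        }
        return true;
    }

    // Call when a request that was accepted by tryBegin() has finished.
    void end() {
        inFlight.decrementAndGet();
    }

    // Stops accepting new requests, then waits until all in-flight requests
    // have finished or the timeout elapses. Returns true if fully drained.
    boolean shutdown(long timeoutMillis) {
        shuttingDown.set(true);
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (inFlight.get() > 0 && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return inFlight.get() == 0;
    }

    public static void main(String[] args) {
        GracefulDrain drain = new GracefulDrain();
        // The JVM runs shutdown hooks when the process receives SIGTERM.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            System.out.println("drained: " + drain.shutdown(30_000))));

        // Simulate one request that takes 200 ms.
        if (drain.tryBegin()) {
            new Thread(() -> {
                try { Thread.sleep(200); } catch (InterruptedException ignored) { }
                drain.end();
            }).start();
        }
        // System.exit stands in for SIGTERM here: both start the JVM shutdown
        // sequence, which runs the hook while the request is still in progress.
        System.exit(0);
    }
}
```

tryBegin() increments the counter before checking the flag, so a request that races with the start of shutdown is either counted and waited for, or rejected; it cannot slip through unnoticed.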

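For Spring Boot applications, graceful shutdown has been a built-in feature since version 2.3. Enable it in application.properties:
server.shutdown = graceful
spring.lifecycle.timeout-per-shutdown-phase = 30s
With this, Spring Boot waits for active requests to complete before shutting down, up to the configured timeout.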
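
The timings have to fit together: the grace period countdown starts when termination begins, so it also covers the time spent in the preStop hook. The following Java sketch checks that budget for the values used in this section (60 second grace period, 10 second preStop sleep, 30 second Spring shutdown phase). ShutdownBudget is an illustrative name, not a Kubernetes or Spring API.

```java
// Illustrative check, not a Kubernetes or Spring API.
final class ShutdownBudget {

    // Seconds left before the pod is forcibly killed, after the preStop hook
    // and the application's own shutdown phase have used their time.
    static int slackSeconds(int terminationGracePeriod, int preStopSleep, int appShutdownTimeout) {
        return terminationGracePeriod - (preStopSleep + appShutdownTimeout);
    }

    // True if graceful shutdown can complete within the grace period.
    static boolean fits(int terminationGracePeriod, int preStopSleep, int appShutdownTimeout) {
        return slackSeconds(terminationGracePeriod, preStopSleep, appShutdownTimeout) >= 0;
    }

    public static void main(String[] args) {
        // terminationGracePeriodSeconds: 60, preStop "sleep 10", timeout-per-shutdown-phase 30s
        System.out.println("fits:  " + fits(60, 10, 30));
        System.out.println("slack: " + slackSeconds(60, 10, 30) + "s");
    }
}
```

With these values 20 seconds of slack remain. If the sum exceeded the grace period, the pod could be killed while requests are still being processed.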

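Observing downtime during a deployment

Load testing during a deployment verifies your zero-downtime settings under realistic conditions. You can use Gatling or any other tool of your choice.

Key points:

Start the load test before triggering the deployment, and keep monitoring throughout the deployment.
Poll each application's readiness probe endpoint every second to observe pod availability.
Set up scenarios that simulate real user behavior.
Monitor HTTP status codes to assess downtime.
Watch response time percentiles to assess the performance impact.
A successful zero-downtime deployment shows no errors at all during the rollout.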
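
As a rough illustration of how polled status codes translate into a downtime figure, the sketch below summarizes a list of samples taken once per second. It assumes a sample counts as "up" only for a 2xx status and that failed connections are recorded as -1; the DowntimeReport class and the sample data are made up for the example.

```java
import java.util.List;

// Hypothetical helper: summarizes status codes collected by polling an
// endpoint once per second during a deployment.
final class DowntimeReport {

    // A sample counts as "up" only for a 2xx status; -1 marks a failed connection.
    static boolean isUp(int status) {
        return status >= 200 && status < 300;
    }

    // Number of samples in which the endpoint was not available.
    static int failedSamples(List<Integer> statuses) {
        int failed = 0;
        for (int status : statuses) {
            if (!isUp(status)) failed++;
        }
        return failed;
    }

    // Longest run of consecutive failed samples. With one sample per second
    // this approximates the longest outage in seconds.
    static int longestOutage(List<Integer> statuses) {
        int longest = 0;
        int current = 0;
        for (int status : statuses) {
            current = isUp(status) ? 0 : current + 1;
            longest = Math.max(longest, current);
        }
        return longest;
    }

    public static void main(String[] args) {
        // Made-up samples: two consecutive failures, then one isolated failure.
        List<Integer> samples = List.of(200, 200, 503, -1, 200, 503, 200);
        System.out.println("failed samples: " + failedSamples(samples));
        System.out.println("longest outage: " + longestOutage(samples) + "s");
    }
}
```

For a zero-downtime deployment both numbers should be 0; any other value points at a window in which no pod could serve the request.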

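Test procedure:

Start the Gatling test and keep monitoring the pod status.
Trigger a deployment, for example by updating the image version or applying a new configuration.
Keep monitoring, and watch the pod status and response times.
Confirm that no errors occurred during the deployment and that response times stayed within an acceptable range.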
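
"Within an acceptable range" is usually judged on a percentile rather than the average. Load testing tools report percentiles themselves; the sketch below only shows the idea, using the nearest-rank method on made-up response times. The ResponseTimePercentile class and the 500 ms threshold are assumptions for the example.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentile over recorded response times.
final class ResponseTimePercentile {

    // Returns the smallest recorded value such that at least p percent of
    // the samples are less than or equal to it (p between 1 and 100).
    static long percentile(long[] responseTimesMs, int p) {
        if (responseTimesMs.length == 0) throw new IllegalArgumentException("no samples");
        if (p < 1 || p > 100) throw new IllegalArgumentException("p must be in 1..100");
        long[] sorted = responseTimesMs.clone();
        Arrays.sort(sorted);
        int rank = (p * sorted.length + 99) / 100; // ceil(p * n / 100) in integer arithmetic
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up response times in milliseconds, including one slow outlier.
        long[] samples = {120, 80, 95, 300, 110, 90, 100, 105, 85, 130};
        long p95 = percentile(samples, 95);
        long thresholdMs = 500; // assumed acceptable limit
        System.out.println("p50 = " + percentile(samples, 50) + " ms");
        System.out.println("p95 = " + p95 + " ms, acceptable: " + (p95 <= thresholdMs));
    }
}
```

Comparing the percentile measured during the rollout with the one measured under the same load before the rollout shows whether the deployment itself slowed requests down.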

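Tip: you can force a rollout with kubectl rollout restart to test the zero-downtime settings quickly, without rebuilding or redeploying the whole application. This simulates the pod replacement that happens during a deployment and verifies that your settings are correct. A platform with two backends and one frontend makes a good example: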

cd ~/ucp bundle-<user_id>-kube-dev
source ./env.sh

kubectl rollout restart deployment/backend-deployment -n namespace
kubectl rollout restart deployment/backend2-deployment -n namespace
kubectl rollout restart deployment/frontend-deployment -n namespace

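A successful deployment shows no errors during the rollout, with response times staying within an acceptable range: the Gatling report lists zero failed requests and every scenario completes successfully. A failed deployment shows errors, a significant increase in response times, or failed scenarios. Analyzing the report helps you identify what went wrong and adjust the configuration so that future deployments achieve true zero downtime. Typical causes are missing readiness probes, a missing preStop hook, or maxUnavailable > 0.

Gatling simulation reference:
Below is a simple example for Gatling 3.10.5 that simulates user behavior to test availability during a deployment: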

import io.gatling.javaapi.core.*;
import io.gatling.javaapi.http.HttpProtocolBuilder;

import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

public class PingSimulation extends Simulation {

    private static final String CONNECT_CLIENT_ID = "xxx";
    private static final String CONNECT_CLIENT_SECRET = "yyy";
    private static final int DURATION = 300; // test duration in seconds

    // Requests an access token and stores it in the session as "accessToken".
    static ChainBuilder connect() {
        return exec(http("Connect")
            .post("/auth/token") // replace with your authentication endpoint
            .formParam("grant_type", "client_credentials")
            .formParam("scope", "id profile scope.v1")
            .basicAuth(CONNECT_CLIENT_ID, CONNECT_CLIENT_SECRET)
            .header("Accept", "application/json")
            .check(status().is(200), jsonPath("$.access_token").saveAs("accessToken")));
    }

    // Builds an HTTP protocol configuration for the given base URL.
    static HttpProtocolBuilder protocol(String baseUrl) {
        return http.baseUrl(baseUrl)
            .acceptHeader("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
            .acceptLanguageHeader("en-US,en;q=0.5")
            .acceptEncodingHeader("gzip, deflate")
            .userAgentHeader("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/58.0.3029.110");
    }

    {
        var backendProtocol = protocol("http://backend-service:8080");
        var backend2Protocol = protocol("http://backend2-service:8080");
        var frontendProtocol = protocol("http://frontend-service:8080");
        // Inject DURATION users spread evenly over DURATION seconds,
        // which is about one new user per second for the whole test.
        var injectStep = rampUsers(DURATION).during(DURATION);

        setUp(
            scenario("Ping Backend") // replace with your scenario name
                .exec(
                    connect(),
                    exec(http("Ping Backend")
                        .get("/ping") // replace with your ping endpoint
                        .header("Authorization", "Bearer #{accessToken}")))
                .injectOpen(injectStep)
                .protocols(backendProtocol),
            scenario("Backend1 - readiness") // replace with your scenario name
                .exec(http("Backend1 - health check")
                    .get("/actuator/health/readiness")) // replace with your readiness endpoint
                .injectOpen(injectStep)
                .protocols(backendProtocol),
            scenario("Backend2 - readiness") // replace with your scenario name
                .exec(http("Backend2 - health check")
                    .get("/actuator/health/readiness")) // replace with your readiness endpoint
                .injectOpen(injectStep)
                .protocols(backend2Protocol), // backend2 has its own base URL
            scenario("Frontend - Home Page") // replace with your scenario name
                .exec(
                    connect(),
                    exec(http("Frontend - Home Page")
                        .get("/"))) // replace with your home page endpoint
                .injectOpen(injectStep)
                .protocols(frontendProtocol) // the frontend has its own base URL
        ).assertions(
            global().failedRequests().count().is(0L) // no request may fail during the deployment
        );
    }
}

Kubernetes, K8s, RollingUpdate, Pod, zero-downtime deployment, high availability