
kubernetes - NEG says Pods are 'unhealthy', but the Pods are actually healthy

I am trying to set up gRPC load balancing with Ingress on GCP, following this example. The example shows gRPC load balancing working in two ways (one with an Envoy sidecar, the other with an HTTP mux that serves gRPC and the HTTP health check from the same pod). However, the Envoy proxy variant does not work for me.

What confuses me is that the pods are running and healthy (confirmed by kubectl describe and kubectl logs):

$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
fe-deployment-757ffcbd57-4w446   2/2     Running   0          4m22s
fe-deployment-757ffcbd57-xrrm9   2/2     Running   0          4m22s

$ kubectl describe pod fe-deployment-757ffcbd57-4w446
Name:               fe-deployment-757ffcbd57-4w446
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc/10.128.0.64
Start Time:         Thu, 26 Sep 2019 16:15:18 +0900
Labels:             app=fe
                    pod-template-hash=757ffcbd57
Annotations:        kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container fe-envoy; cpu request for container fe-container
Status:             Running
IP:                 10.56.1.29
Controlled By:      ReplicaSet/fe-deployment-757ffcbd57
Containers:
  fe-envoy:
    Container ID:  docker://b4789909494f7eeb8d3af66cb59168e009c582d412d8ca683a7f435559989421
    Image:         envoyproxy/envoy:latest
    Image ID:      docker-pullable://envoyproxy/envoy@sha256:9ef9c4fd6189fdb903929dc5aa0492a51d6783777de65e567382ac7d9a28106b
    Port:          8080/TCP
    Host Port:     0/TCP
    Command:
      /usr/local/bin/envoy
    Args:
      -c
      /data/config/envoy.yaml
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:19 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Liveness:     http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data/certs from certs-volume (rw)
      /data/config from envoy-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
  fe-container:
    Container ID:  docker://a533224d3ea8b5e4d5e268a616d73762b37df69f434342459f35caa8fac32dab
    Image:         salrashid123/grpc_only_backend
    Image ID:      docker-pullable://salrashid123/grpc_only_backend@sha256:ebfac594116445dd67aff7c9e7a619d73222b60947e46ef65ee6d918db3e1f4b
    Port:          50051/TCP
    Host Port:     0/TCP
    Command:
      /grpc_server
    Args:
      --grpcport
      :50051
      --insecure
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:20 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  certs-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fe-secret
    Optional:    false
  envoy-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      envoy-configmap
    Optional:  false
  default-token-c7nqc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-c7nqc
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                                                          Message
  ----     ------     ----                   ----                                                          -------
  Normal   Scheduled  4m25s                  default-scheduler                                             Successfully assigned default/fe-deployment-757ffcbd57-4w446 to gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc
  Normal   Pulled     4m25s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Container image "envoyproxy/envoy:latest" already present on machine
  Normal   Created    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Created container
  Normal   Started    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Started container
  Normal   Pulling    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  pulling image "salrashid123/grpc_only_backend"
  Normal   Pulled     4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Successfully pulled image "salrashid123/grpc_only_backend"
  Normal   Created    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Created container
  Normal   Started    4m23s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Started container
  Warning  Unhealthy  4m10s (x2 over 4m20s)  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Readiness probe failed: HTTP probe failed with statuscode: 503
  Warning  Unhealthy  4m9s (x2 over 4m19s)   kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Liveness probe failed: HTTP probe failed with statuscode: 503


$ kubectl describe pod fe-deployment-757ffcbd57-xrrm9
Name:               fe-deployment-757ffcbd57-xrrm9
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9/10.128.0.22
Start Time:         Thu, 26 Sep 2019 16:15:18 +0900
Labels:             app=fe
                    pod-template-hash=757ffcbd57
Annotations:        kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container fe-envoy; cpu request for container fe-container
Status:             Running
IP:                 10.56.0.23
Controlled By:      ReplicaSet/fe-deployment-757ffcbd57
Containers:
  fe-envoy:
    Container ID:  docker://255dd6cab1e681e30ccfe158f7d72540576788dbf6be60b703982a7ecbb310b1
    Image:         envoyproxy/envoy:latest
    Image ID:      docker-pullable://envoyproxy/envoy@sha256:9ef9c4fd6189fdb903929dc5aa0492a51d6783777de65e567382ac7d9a28106b
    Port:          8080/TCP
    Host Port:     0/TCP
    Command:
      /usr/local/bin/envoy
    Args:
      -c
      /data/config/envoy.yaml
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:19 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Liveness:     http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data/certs from certs-volume (rw)
      /data/config from envoy-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
  fe-container:
    Container ID:  docker://f6a0246129cc89da846c473daaa1c1770d2b5419b6015098b0d4f35782b0a9da
    Image:         salrashid123/grpc_only_backend
    Image ID:      docker-pullable://salrashid123/grpc_only_backend@sha256:ebfac594116445dd67aff7c9e7a619d73222b60947e46ef65ee6d918db3e1f4b
    Port:          50051/TCP
    Host Port:     0/TCP
    Command:
      /grpc_server
    Args:
      --grpcport
      :50051
      --insecure
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:20 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  certs-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fe-secret
    Optional:    false
  envoy-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      envoy-configmap
    Optional:  false
  default-token-c7nqc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-c7nqc
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From                                                          Message
  ----     ------     ----                  ----                                                          -------
  Normal   Scheduled  5m8s                  default-scheduler                                             Successfully assigned default/fe-deployment-757ffcbd57-xrrm9 to gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9
  Normal   Pulled     5m8s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Container image "envoyproxy/envoy:latest" already present on machine
  Normal   Created    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Created container
  Normal   Started    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Started container
  Normal   Pulling    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  pulling image "salrashid123/grpc_only_backend"
  Normal   Pulled     5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Successfully pulled image "salrashid123/grpc_only_backend"
  Normal   Created    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Created container
  Normal   Started    5m6s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Started container
  Warning  Unhealthy  4m53s (x2 over 5m3s)  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Readiness probe failed: HTTP probe failed with statuscode: 503
  Warning  Unhealthy  4m52s (x2 over 5m2s)  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Liveness probe failed: HTTP probe failed with statuscode: 503


$ kubectl get services
NAME             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)           AGE
fe-srv-ingress   NodePort       10.123.5.165   <none>         8080:30816/TCP    6m43s
fe-srv-lb        LoadBalancer   10.123.15.36   35.224.69.60   50051:30592/TCP   6m42s
kubernetes       ClusterIP      10.123.0.1     <none>         443/TCP           2d2h


$ kubectl describe service fe-srv-ingress
Name:                     fe-srv-ingress
Namespace:                default
Labels:                   type=fe-srv
Annotations:              cloud.google.com/neg: {"ingress": true}
                          cloud.google.com/neg-status:
                            {"network_endpoint_groups":{"8080":"k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2"},"zones":["us-central1-a"]}
                          kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"cloud.google.com/neg":"{\"ingress\": true}","service.alpha.kubernetes.io/a...
                          service.alpha.kubernetes.io/app-protocols: {"fe":"HTTP2"}
Selector:                 app=fe
Type:                     NodePort
IP:                       10.123.5.165
Port:                     fe  8080/TCP
TargetPort:               8080/TCP
NodePort:                 fe  30816/TCP
Endpoints:                10.56.0.23:8080,10.56.1.29:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason  Age    From            Message
  ----    ------  ----   ----            -------
  Normal  Create  6m47s  neg-controller  Created NEG "k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2" for default/fe-srv-ingress-8080/8080 in "us-central1-a".
  Normal  Attach  6m40s  neg-controller  Attach 2 network endpoint(s) (NEG "k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2" in zone "us-central1-a")

But the NEGs report the endpoints as unhealthy (and consequently the Ingress reports the backends as unhealthy too).
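
What the load balancer side actually sees can be inspected directly with gcloud (a sketch; the NEG name is taken from the neg-status annotation above, and the name of the backend service that the Ingress created has to be looked up first):

    # List the endpoints the NEG controller attached (name from the neg-status annotation)
    gcloud compute network-endpoint-groups list-network-endpoints \
        k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2 --zone=us-central1-a

    # Find the backend service the GKE Ingress created, then query its health
    gcloud compute backend-services list
    gcloud compute backend-services get-health <backend-service-name> --global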

I cannot find what is causing this. Does anyone know how to fix it?

Test environment:
  • GKE 1.13.7-gke.8 (VPC-native enabled)
  • Default HTTP(S) load balancer on Ingress

  • The YAML files I used (identical to those in the example mentioned above):

envoy-configmap.yaml

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: envoy-configmap
      labels:
        app: fe
    data:
      config: |-
        ---
        admin:
          access_log_path: /dev/null
          address:
            socket_address:
              address: 127.0.0.1
              port_value: 9000
        node:
          cluster: service_greeter
          id: test-id
        static_resources:
          listeners:
          - name: listener_0
            address:
              socket_address: { address: 0.0.0.0, port_value: 8080 }
            filter_chains:
            - filters:
              - name: envoy.http_connection_manager
                config:
                  stat_prefix: ingress_http
                  codec_type: AUTO
                  route_config:
                    name: local_route
                    virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                      - match:
                          path: "/echo.EchoServer/SayHello"
                        route: { cluster: local_grpc_endpoint }
                  http_filters:
                  - name: envoy.lua
                    config:
                      inline_code: |
                        package.path = "/etc/envoy/lua/?.lua;/usr/share/lua/5.1/nginx/?.lua;/etc/envoy/lua/" .. package.path
                        function envoy_on_request(request_handle)
                          if request_handle:headers():get(":path") == "/_ah/health" then
                            local headers, body = request_handle:httpCall(
                              "local_admin",
                              {
                                [":method"] = "GET",
                                [":path"] = "/clusters",
                                [":authority"] = "local_admin"
                              },"", 50)
                            str = "local_grpc_endpoint::127.0.0.1:50051::health_flags::healthy"
                            if string.match(body, str) then
                              request_handle:respond({[":status"] = "200"},"ok")
                            else
                              request_handle:logWarn("Envoy healthcheck failed")
                              request_handle:respond({[":status"] = "503"},"unavailable")
                            end
                          end
                        end
                  - name: envoy.router
                    typed_config: {}
              tls_context:
                common_tls_context:
                  tls_certificates:
                  - certificate_chain:
                      filename: "/data/certs/tls.crt"
                    private_key:
                      filename: "/data/certs/tls.key"
          clusters:
          - name: local_grpc_endpoint
            connect_timeout: 0.05s
            type: STATIC
            http2_protocol_options: {}
            lb_policy: ROUND_ROBIN
            common_lb_config:
              healthy_panic_threshold:
                value: 50.0
            health_checks:
            - timeout: 1s
              interval: 5s
              interval_jitter: 1s
              no_traffic_interval: 5s
              unhealthy_threshold: 1
              healthy_threshold: 3
              grpc_health_check:
                service_name: "echo.EchoServer"
                authority: "server.domain.com"
            hosts:
            - socket_address:
                address: 127.0.0.1
                port_value: 50051
          - name: local_admin
            connect_timeout: 0.05s
            type: STATIC
            lb_policy: ROUND_ROBIN
            hosts:
            - socket_address:
                address: 127.0.0.1
                port_value: 9000
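
The Lua filter in this config implements /_ah/health by calling Envoy's own admin endpoint (/clusters on 127.0.0.1:9000) and checking that the local_grpc_endpoint cluster carries the healthy flag. That check can be reproduced by hand from inside the pod (a sketch; it assumes curl is available in the Envoy image and reuses one of the pod names from above):

    # Query the Envoy admin API for cluster health, exactly as the Lua filter does
    kubectl exec fe-deployment-757ffcbd57-4w446 -c fe-envoy -- \
        curl -s http://127.0.0.1:9000/clusters | grep health_flags

    # Hit the health route the kubelet probes; -k skips certificate verification
    kubectl exec fe-deployment-757ffcbd57-4w446 -c fe-envoy -- \
        curl -ks https://127.0.0.1:8080/_ah/health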

fe-deployment.yaml

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: fe-deployment
      labels:
        app: fe
    spec:
      replicas: 2
      template:
        metadata:
          labels:
            app: fe
        spec:
          containers:
          - name: fe-envoy
            image: envoyproxy/envoy:latest
            imagePullPolicy: IfNotPresent
            livenessProbe:
              httpGet:
                path: /_ah/health
                scheme: HTTPS
                port: fe
            readinessProbe:
              httpGet:
                path: /_ah/health
                scheme: HTTPS
                port: fe
            ports:
            - name: fe
              containerPort: 8080
              protocol: TCP
            command: ["/usr/local/bin/envoy"]
            args: ["-c", "/data/config/envoy.yaml"]
            volumeMounts:
            - name: certs-volume
              mountPath: /data/certs
            - name: envoy-config-volume
              mountPath: /data/config
          - name: fe-container
            # This runs a gRPC secure/insecure server using the port argument (:50051).
            # Port 50051 is also exposed in the Dockerfile.
            image: salrashid123/grpc_only_backend
            imagePullPolicy: Always
            ports:
            - containerPort: 50051
              protocol: TCP
            command: ["/grpc_server"]
            args: ["--grpcport", ":50051", "--insecure"]
          volumes:
          - name: certs-volume
            secret:
              secretName: fe-secret
          - name: envoy-config-volume
            configMap:
              name: envoy-configmap
              items:
              - key: config
                path: envoy.yaml
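
Because the local_grpc_endpoint cluster uses grpc_health_check, the backend container is expected to implement the standard grpc.health.v1.Health service. That can be verified independently of Envoy (a sketch; it assumes grpc_health_probe is installed locally and reuses a pod name from above):

    # Forward the backend port, then probe the standard gRPC health service
    kubectl port-forward fe-deployment-757ffcbd57-4w446 50051:50051 &
    grpc_health_probe -addr=127.0.0.1:50051 -service=echo.EchoServer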

fe-srv-ingress.yaml

    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: fe-srv-ingress
      labels:
        type: fe-srv
      annotations:
        service.alpha.kubernetes.io/app-protocols: '{"fe":"HTTP2"}'
        cloud.google.com/neg: '{"ingress": true}'
    spec:
      type: NodePort
      ports:
      - name: fe
        port: 8080
        protocol: TCP
        targetPort: 8080
      selector:
        app: fe

fe-ingress.yaml

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: fe-ingress
      annotations:
        kubernetes.io/ingress.allow-http: "false"
    spec:
      tls:
      - hosts:
        - server.domain.com
        secretName: fe-secret
      rules:
      - host: server.domain.com
        http:
          paths:
          - path: /echo.EchoServer/*
            backend:
              serviceName: fe-srv-ingress
              servicePort: 8080

Best Answer

I had to allow any traffic from the IP ranges specified as health-check sources in the documentation, 130.211.0.0/22 and 35.191.0.0/16, as seen here: https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg
I had to allow it both for the default network and for the new (regional) network my cluster lives in.
Once I added these firewall rules, the health checks could reach the pods exposed through the NEG, which serve as zonal backends in the backend service of our HTTP(S) load balancer.

A stricter firewall setup is probably possible, but I simply cut corners and allowed everything from the IP ranges declared as health-check source ranges on the page referenced above.
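
In gcloud terms the rule looks roughly like this (a sketch; the rule name is illustrative, tcp:8080 matches the NEG serving port used here, and the command has to be repeated with --network set to each network involved):

    gcloud compute firewall-rules create allow-lb-health-checks \
        --network=default \
        --direction=INGRESS \
        --action=ALLOW \
        --rules=tcp:8080 \
        --source-ranges=130.211.0.0/22,35.191.0.0/16

A tighter variant would also set --target-tags to the node pool's network tag so the rule applies only to the cluster's nodes.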

For more on kubernetes - NEG says Pods are 'unhealthy' but the Pods are actually healthy, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/58110208/
