
kubernetes - K8s NodePort service is “unreachable by IP” only on 2/4 slaves in the cluster

Reposted. Author: 行者123. Updated: 2023-12-04 04:42:44

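I created a K8s cluster of 5 VMs (1 master and 4 slaves, all running Ubuntu 16.04.3 LTS) using kubeadm. I used flannel to set up networking in the cluster. I was able to deploy an application successfully. I then exposed it via a NodePort service. From here on, things got complicated for me.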

Before starting, I disabled the default firewalld service on the master and on the nodes.

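As I understand from the K8s Services doc, the NodePort type exposes the service on all nodes in the cluster. However, when I created it, the service was exposed on only 2 of the 4 nodes in the cluster. I am guessing this is not the expected behavior (right?).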
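One way to check that expectation is to request `<NodeIP>:<NodePort>` on every node in turn. The following is only a minimal sketch: the function name `probe_nodeport` is made up for illustration, and it assumes the service answers plain HTTP, as the curl output in this question suggests.

```shell
# Hypothetical helper: probe <NodeIP>:<NodePort> on each address given.
# Usage: probe_nodeport PORT IP [IP...]
probe_nodeport() {
  port=$1
  shift
  for ip in "$@"; do
    # Any HTTP response counts as reachable; only connection-level
    # failures (refused, timed out) make curl return non-zero here.
    if curl --silent --output /dev/null --connect-timeout 3 "http://$ip:$port/"; then
      echo "$ip:$port reachable"
    else
      echo "$ip:$port unreachable"
    fi
  done
}

# Example call (assumes kubectl is configured; collects each node's InternalIP):
#   probe_nodeport 30847 $(kubectl get nodes \
#     -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
```

Run from a machine outside the cluster, this shows which nodes answer on the NodePort. Run from each node, it reproduces the per-host checks listed further down.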

For troubleshooting, here are some resource specs:

root@vm-vivekse-003:~# kubectl get nodes
NAME              STATUS    AGE       VERSION
vm-deepejai-00b   Ready     5m        v1.7.3
vm-plashkar-006   Ready     4d        v1.7.3
vm-rosnthom-00f   Ready     4d        v1.7.3
vm-vivekse-003    Ready     4d        v1.7.3    //the master
vm-vivekse-004    Ready     16h       v1.7.3

root@vm-vivekse-003:~# kubectl get pods -o wide -n playground
NAME                                     READY     STATUS    RESTARTS   AGE       IP           NODE
kubernetes-bootcamp-2457653786-9qk80     1/1       Running   0          2d        10.244.3.6   vm-rosnthom-00f
springboot-helloworld-2842952983-rw0gc   1/1       Running   0          1d        10.244.3.7   vm-rosnthom-00f

root@vm-vivekse-003:~# kubectl get svc -o wide -n playground
NAME        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE       SELECTOR
sb-hw-svc   10.101.180.19   <nodes>       9000:30847/TCP   5h        run=springboot-helloworld

root@vm-vivekse-003:~# kubectl describe svc sb-hw-svc -n playground
Name:              sb-hw-svc
Namespace:         playground
Labels:            <none>
Annotations:       <none>
Selector:          run=springboot-helloworld
Type:              NodePort
IP:                10.101.180.19
Port:              <unset>  9000/TCP
NodePort:          <unset>  30847/TCP
Endpoints:         10.244.3.7:9000
Session Affinity:  None
Events:            <none>

root@vm-vivekse-003:~# kubectl get endpoints sb-hw-svc -n playground -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-08-09T06:28:06Z
  name: sb-hw-svc
  namespace: playground
  resourceVersion: "588958"
  selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc
  uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b
subsets:
- addresses:
  - ip: 10.244.3.7
    nodeName: vm-rosnthom-00f
    targetRef:
      kind: Pod
      name: springboot-helloworld-2842952983-rw0gc
      namespace: playground
      resourceVersion: "473859"
      uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b
  ports:
  - port: 9000
    protocol: TCP

After some tinkering, I realized that on those 2 "faulty" nodes the service is not reachable even from within the hosts themselves.

Node01 (working):
root@vm-vivekse-004:~# curl 127.0.0.1:30847      //<localhost>:<nodeport>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.101.180.19:9000 //<cluster-ip>:<port>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.244.3.7:9000 //<pod-ip>:<port>
Hello Docker World!!

Node02 (working):
root@vm-rosnthom-00f:~# curl 127.0.0.1:30847
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.101.180.19:9000
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.244.3.7:9000
Hello Docker World!!

Node03 (not working):
root@vm-plashkar-006:~# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-plashkar-006:~# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-plashkar-006:~# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

Node04 (not working):
root@vm-deepejai-00b:/# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-deepejai-00b:/# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-deepejai-00b:/# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

I tried netstat and telnet on all 4 slaves. Here is the output:

Node01 (the working host):
root@vm-vivekse-004:~# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 27808/kube-proxy
root@vm-vivekse-004:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node02 (the working host):
root@vm-rosnthom-00f:~# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 11842/kube-proxy
root@vm-rosnthom-00f:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node03 (the non-working host):
root@vm-plashkar-006:~# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 7791/kube-proxy
root@vm-plashkar-006:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

Node04 (the non-working host):
root@vm-deepejai-00b:/# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 689/kube-proxy
root@vm-deepejai-00b:/# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out
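In the output above, kube-proxy is listening on every host, yet on two of them the connection times out instead of being refused. A small probe that reports the two cases separately can make this easier to spot across nodes. This is a rough sketch rather than a definitive tool: `probe_port` is a hypothetical name, and it assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available.

```shell
# Hypothetical helper: classify a TCP connection attempt.
# Usage: probe_port HOST PORT [SECONDS]
probe_port() {
  host=$1
  port=$2
  secs=${3:-3}
  # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
  # timeout kills the attempt and exits with 124 if it takes too long.
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
  case $? in
    0)   echo "open" ;;
    124) echo "timed out" ;;
    *)   echo "refused or unreachable" ;;
  esac
}

# Example: probe_port 127.0.0.1 30847
```

A refusal usually means nothing is accepting connections on that port. A timeout with a listener present, as seen here, suggests packets are being dropped or not routed somewhere along the path.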

Additional information:

From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I am able to ping this host from all 5 VMs, and curl vm-rosnthom-00f:30847 also works from all the VMs.

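I can clearly see that the internal cluster networking is messed up, but I am unsure how to resolve it! The iptables -L output is identical on all the slaves, and even the local loopback (ifconfig lo) is up and running on all of them. I'm completely clueless as to how to fix it!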

Best answer

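If you want to reach the service from any node in the cluster, you need to define the service type as ClusterIP. Since you defined the service type as NodePort, you can connect from the node where the service is running.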

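My answer above is not correct. According to the documentation, we should be able to connect from any NodeIP:NodePort. But it was not working in my cluster either.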

https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types

NodePort: Exposes the service on each Node’s IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You’ll be able to contact the NodePort service, from outside the cluster, by requesting <NodeIP>:<NodePort>.



IP forwarding was not enabled on one of my nodes. After setting it, I was able to connect to my service using NodeIP:nodePort:
sysctl -w net.ipv4.ip_forward=1
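
To find out which nodes are missing this setting before changing anything, the kernel flag can be read from procfs on each host. This is a minimal sketch: `ip_forward_status` is a hypothetical name, and its optional path argument exists only so the function can be exercised without touching the real flag.

```shell
# Hypothetical helper: report whether IPv4 forwarding is enabled.
# Usage: ip_forward_status [FLAG_FILE]
ip_forward_status() {
  flag_file=${1:-/proc/sys/net/ipv4/ip_forward}
  if [ "$(cat "$flag_file")" = "1" ]; then
    echo "enabled"
  else
    echo "disabled"
  fi
}

# sysctl -w does not survive a reboot. One common way to persist the value
# (the file name below is an arbitrary choice, not taken from the answer):
#   echo 'net.ipv4.ip_forward = 1' > /etc/sysctl.d/99-ip-forward.conf
#   sysctl --system
```

Running `ip_forward_status` on each of the four slaves should show whether the two failing hosts differ from the working ones in this respect.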

Regarding kubernetes - K8s NodePort service is “unreachable by IP” only on 2/4 slaves in the cluster, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45595662/
