
Kubernetes - kube-system pods on the master node keep restarting after a worker node joins

Reposted. Author: 行者123. Updated: 2023-12-05 00:13:08

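I have followed this tutorial and this one, but I have been facing the same problem for the past 3 days.

I can set up the master node correctly with the following steps: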

kubeadm init

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

export kubever=$(kubectl version | base64 | tr -d '\n')
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
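
A note on the kubever line: base64 wraps long output across several lines (GNU coreutils wraps at 76 characters by default), and tr -d '\n' joins it back into a single string so it can be embedded in the URL on the next line. A minimal sketch of that step in isolation, assuming GNU base64; the sample text and the function name encode_version are made up for illustration, and the real input is the output of kubectl version:

```shell
# encode_version is a hypothetical helper name, not part of kubectl or kubeadm.
# It base64-encodes stdin and removes the line breaks that base64 inserts.
encode_version() {
  base64 | tr -d '\n'
}

# Made-up sample text standing in for the output of `kubectl version`;
# it is long enough that GNU base64 would wrap the encoded form.
sample='Client Version: v1.9.3 (illustrative sample, padded so the encoded form exceeds one base64 line)'

kubever=$(printf '%s\n' "$sample" | encode_version)
printf '%s\n' "$kubever"
```

Without the tr step the variable would contain embedded newlines, and the URL built from it would be broken across lines.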

Everything seems fine:
kubectl get all --namespace=kube-system

Then, on the worker node:
kubeadm join --token 864655.fdf6d0b389867b79 192.168.100.17:6443 --discovery-token-ca-cert-hash sha256:a2d840808b17b53b9612e6271ccde489f13dbede7d354f97188d0faa9e210af2

The output looks fine, as shown below:
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "192.168.100.17:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.100.17:6443"
[discovery] Requesting info from "https://192.168.100.17:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.100.17:6443"
[discovery] Successfully established connection with API Server "192.168.100.17:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

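But as soon as I run this command, everything falls apart. The command
kubectl get all --namespace=kube-system

starts showing that all the pods keep restarting. The status keeps changing between Pending and Running, and sometimes some pods even disappear or show other states such as ContainerCreating.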
NAME                                READY     STATUS    RESTARTS   AGE
po/etcd-ubuntu                      0/1       Pending   0          0s
po/kube-controller-manager-ubuntu   0/1       Pending   0          0s
po/kube-dns-6f4fd4bdf-cmcfk         3/3       Running   0          13m
po/kube-proxy-2chb6                 1/1       Running   0          13m
po/kube-scheduler-ubuntu            0/1       Pending   0          0s
po/weave-net-ptdxr                  2/2       Running   0          11m

I also tried the second tutorial, which uses flannel, and got exactly the same problem.

My setup

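I created two new VMs with a fresh install of Ubuntu 17.10 on VMware, each with 2 processors / 2 cores, 6 GB of RAM and a 50 GB hard disk. My physical machine is an i7-6700k with 32 GB of RAM.
I installed kubeadm, kubelet and docker on both of them and then followed the steps mentioned above.

I also tried switching between NAT and Bridge networking on VMware, but nothing changed.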

The initial IPs of the two VMs with bridged networking are 192.168.100.12 and 192.168.100.17.
hostname -I on the master:
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2
hostname -I on the worker node:
192.168.100.12 172.17.0.1 10.44.0.0 10.32.0.1
journalctl -xeu kubelet shows the following:

https://gist.github.com/saad749/9a771a3460bf88c274498b5bc4b7fd84

While trying flannel (still the same problem), this is the output of
kubectl describe nodes



https://gist.github.com/saad749/d24c453c8b4e663e9abf572a0fb38bf4

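Did I miss any step before kubeadm init? Should I change the IP addresses (and to what)? Are there any specific logs I should look at? Is there a more comprehensive tutorial?
All the problems start after kubeadm join on the worker node. I can deploy Kubernetes on the master, or anything else, and it works fine.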


Update:

Even after applying errordeveloper's suggestions, the same problem persists.

I added the following flag to kubeadm init:
--apiserver-advertise-address 192.168.100.17

I updated kubeadm.conf to the following, then reloaded and restarted:
https://gist.github.com/saad749/c7149c87ec3e75a40586f626cf04279a

I also tried changing the cluster DNS:
https://gist.github.com/saad749/5fa66bebc22841e58119333e75600e40

Here is the output after initializing the master:
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE   IP               NODE
kube-system   etcd-ubuntu                      1/1       Running   0          22s   192.168.100.17   ubuntu
kube-system   kube-apiserver-ubuntu            1/1       Running   0          29s   192.168.100.17   ubuntu
kube-system   kube-controller-manager-ubuntu   1/1       Running   0          13s   192.168.100.17   ubuntu
kube-system   kube-dns-6f4fd4bdf-wfqhb         3/3       Running   0          1m    10.32.0.7        ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running   0          1m    192.168.100.17   ubuntu
kube-system   kube-scheduler-ubuntu            1/1       Running   0          34s   192.168.100.17   ubuntu
kube-system   weave-net-fkgnh                  2/2       Running   0          32s   192.168.100.17   ubuntu

hostname -i results:
kube-master@ubuntu:~$ hostname -I
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2 10.32.0.3 10.32.0.4 10.32.0.5 10.32.0.6 10.244.0.0 10.244.0.1
kube-master@ubuntu:~$ hostname -i
192.168.100.17

Output of:
kubectl describe nodes

https://gist.github.com/saad749/8f460650182a04d0ddf3158a52761a9a

The InternalIP now seems to be correct.

After joining from the second node, this happens:
kube-master@ubuntu:~$ kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
ubuntu    Ready     master    49m       v1.9.3
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS              RESTARTS   AGE   IP               NODE
kube-system   kube-controller-manager-ubuntu   0/1       Pending             0          0s    <none>           ubuntu
kube-system   kube-dns-6f4fd4bdf-wfqhb         0/3       ContainerCreating   0          49m   <none>           ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running             0          49m   192.168.100.17   ubuntu
kube-system   kube-scheduler-ubuntu            1/1       Running             0          1s    192.168.100.17   ubuntu
kube-system   weave-net-fkgnh                  2/2       Running             0          48m   192.168.100.17   ubuntu

ifconfig -a results:

https://gist.github.com/saad749/63a5a52bd3246ff72477b2aca7d158d0

journalctl -xeu kubelet results:

https://gist.github.com/saad749/8a60870b35f93df8565e66cb208aff32

Sometimes the pod IPs show up as 192.168.100.12, which is the IP of the second, non-master node.
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE   IP               NODE
kube-system   etcd-ubuntu                      0/1       Pending   0          0s    <none>           ubuntu
kube-system   kube-apiserver-ubuntu            0/1       Pending   0          0s    <none>           ubuntu
kube-system   kube-controller-manager-ubuntu   1/1       Running   0          0s    192.168.100.12   ubuntu
kube-system   kube-dns-6f4fd4bdf-wfqhb         2/3       Running   0          3h    10.32.0.7        ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running   0          3h    192.168.100.12   ubuntu
kube-system   kube-scheduler-ubuntu            0/1       Pending   0          0s    <none>           ubuntu
kube-system   weave-net-fkgnh                  2/2       Running   1          3h    192.168.100.17   ubuntu

kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE   IP               NODE
kube-system   kube-dns-6f4fd4bdf-wfqhb         3/3       Running   0          3h    10.32.0.7        ubuntu
kube-system   kube-proxy-h4hz9                 1/1       Running   0          3h    192.168.100.12   ubuntu
kube-system   weave-net-fkgnh                  2/2       Running   0          3h    192.168.100.12   ubuntu


kubectl describe nodes
Name: ubuntu
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=ubuntu
node-role.kubernetes.io/master=
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: node-role.kubernetes.io/master:NoSchedule
CreationTimestamp: Fri, 02 Mar 2018 08:21:47 -0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 11:28:25 -0800 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.100.12
Hostname: ubuntu
Capacity:
cpu: 4
memory: 6080832Ki
pods: 110
Allocatable:
cpu: 4
memory: 5978432Ki
pods: 110
System Info:
Machine ID: 59bf65b835b242a3aa182f4b8a542219
System UUID: 0C3C4D56-4747-D59E-EE09-F16F2793677E
Boot ID: 658b4a08-d724-425e-9246-2b41995ecc46
Kernel Version: 4.13.0-36-generic
OS Image: Ubuntu 17.10
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.13.1
Kubelet Version: v1.9.3
Kube-Proxy Version: v1.9.3
ExternalID: ubuntu
Non-terminated Pods: (3 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system kube-dns-6f4fd4bdf-wfqhb 260m (6%) 0 (0%) 110Mi (1%) 170Mi (2%)
kube-system kube-proxy-h4hz9 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system weave-net-fkgnh 20m (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
280m (7%) 0 (0%) 110Mi (1%) 170Mi (2%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Rebooted 12m (x814 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
Normal NodeHasNoDiskPressure 10m kubelet, ubuntu Node ubuntu status is now: NodeHasNoDiskPressure
Normal Starting 10m kubelet, ubuntu Starting kubelet.
Normal NodeAllocatableEnforced 10m kubelet, ubuntu Updated Node Allocatable limit across pods
Normal NodeHasSufficientDisk 10m kubelet, ubuntu Node ubuntu status is now: NodeHasSufficientDisk
Normal NodeHasSufficientMemory 10m kubelet, ubuntu Node ubuntu status is now: NodeHasSufficientMemory
Normal NodeNotReady 10m kubelet, ubuntu Node ubuntu status is now: NodeNotReady
Warning Rebooted 2m (x870 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 658b4a08-d724-425e-9246-2b41995ecc46
Warning Rebooted 15s (x60 over 10m) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
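
A note on the events above: the Rebooted warnings alternate between two boot IDs (16efd500-... and 658b4a08-...). A machine keeps one boot ID for as long as it stays up, so this pattern is consistent with two different machines reporting status under the same node name. A minimal sketch for counting the distinct boot IDs in kubectl describe nodes output; the function name distinct_boot_ids is made up for illustration, and the pattern assumes the event message format shown above:

```shell
# distinct_boot_ids is a hypothetical helper, not a kubectl feature.
# It reads `kubectl describe nodes` output on stdin and prints how many
# different boot IDs appear in the "boot id: ..." event messages.
distinct_boot_ids() {
  grep -o 'boot id: [0-9a-f-]*' | sort -u | wc -l | tr -d ' '
}

# Intended usage on the master (not run here):
#   kubectl describe nodes | distinct_boot_ids
```

A result greater than 1 for a node that was not actually rebooted is a hint worth following up, not proof of any particular cause.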

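What am I doing wrong?

Best answer

So, after following @errordeveloper's suggestions and still hitting a wall, I was able to solve the problem, and the cause turned out to be very simple.

Both of my VMs had the same hostname.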

hostname -f

returned
ubuntu

on both machines, which evidently causes problems for Kubernetes.
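
By default the kubelet registers its node under the machine's hostname, so two machines that both report ubuntu end up sharing a single node entry, which matches the kubectl get nodes output in the question listing only one node. A minimal sketch for spotting such a clash before running kubeadm join; duplicate_hostnames is a hypothetical helper, and it assumes the hostname -f output of each machine has been collected by hand, one name per line:

```shell
# duplicate_hostnames is a hypothetical helper name.
# It reads one hostname per line on stdin and prints each name that
# occurs more than once.
duplicate_hostnames() {
  sort | uniq -d
}

# The two names reported by `hostname -f` in this question:
printf '%s\n' ubuntu ubuntu | duplicate_hostnames
```

Any line of output means at least two machines share a name and one of them should be renamed first.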

I changed the name on the non-master node with
hostnamectl set-hostname kminion

and in the following files:
/etc/hostname
/etc/hosts

After that, everything worked!
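
The edit to /etc/hosts can also be scripted. A minimal sketch, assuming GNU sed (as shipped with Ubuntu) and an old hostname that contains no regular-expression metacharacters; rename_host_in_file is a hypothetical helper, and it takes the file path as an argument so that it can be tried on a copy before touching the real file:

```shell
# rename_host_in_file is a hypothetical helper name.
# Arguments: old hostname, new hostname, path to a hosts-style file.
# It replaces whole-word occurrences of the old name in place
# (\b and -i without a suffix are GNU sed features).
rename_host_in_file() {
  old=$1
  new=$2
  file=$3
  sed -i "s/\b${old}\b/${new}/g" "$file"
}

# Intended usage on a copy (not run here):
#   cp /etc/hosts ./hosts.copy
#   rename_host_in_file ubuntu kminion ./hosts.copy
```

The word boundaries keep the substitution from touching longer names that merely contain the old hostname.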

Regarding "Kubernetes - kube-system pods on the master node keep restarting after a worker node joins", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49017719/
