gpt4 book ai didi

networking - Prometheus Pod无法调用apiserver端点

转载 作者:行者123 更新时间:2023-12-02 12:28:10 24 4
gpt4 key购买 nike

我正在尝试通过helm install stable/prometheus将监视堆栈(prometheus + alertmanager + node_exporter等)设置到我设置的树莓派k8s集群(1个主节点+ 3个工作节点)上。

设法使所有必需的Pod运行。

pi-monitoring-prometheus-alertmanager-767cd8bc65-89hxt   2/2     Running            0          131m    10.17.2.56      kube2   <none>           <none>
pi-monitoring-prometheus-node-exporter-h86gt 1/1 Running 0 131m 192.168.1.212 kube2 <none> <none>
pi-monitoring-prometheus-node-exporter-kg957 1/1 Running 0 131m 192.168.1.211 kube1 <none> <none>
pi-monitoring-prometheus-node-exporter-x9wgb 1/1 Running 0 131m 192.168.1.213 kube3 <none> <none>
pi-monitoring-prometheus-pushgateway-799d4ff9d6-rdpkf 1/1 Running 0 131m 10.17.3.36 kube1 <none> <none>
pi-monitoring-prometheus-server-5d989754b6-gp69j 2/2 Running 0 98m 10.17.1.60 kube3 <none> <none>

但是,在将端口转发到Prometheus服务器端口9090并导航到 Targets页面之后,我意识到没有任何node_exporters被注册。

浏览日志,我发现了这个
evel=error ts=2020-04-12T05:15:05.083Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:333: Failed to list *v1.Node: Get https://10.18.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.18.0.1:443: i/o timeout"
level=error ts=2020-04-12T05:15:05.084Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:299: Failed to list *v1.Service: Get https://10.18.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.18.0.1:443: i/o timeout"
level=error ts=2020-04-12T05:15:05.084Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:261: Failed to list *v1.Endpoints: Get https://10.18.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.18.0.1:443: i/o timeout"
level=error ts=2020-04-12T05:15:05.085Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:262: Failed to list *v1.Service: Get https://10.18.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.18.0.1:443: i/o timeout"

问题:为什么Prometheus Pod无法调用apiserver端点?不太确定配置在哪里做错了

跟随 debug guide,实现的单个节点无法解析其他节点上的服务。

在过去1天的故障排除过程中,阅读了各种资料,但老实说,我什至不知道从哪里开始。

这些是在 kube-system命名空间中运行的Pod。希望这可以更好地了解我的系统的设置方式。
pi@kube4:~ $ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-66bff467f8-nzvq8 1/1 Running 0 13d 10.17.0.2 kube4 <none> <none>
coredns-66bff467f8-z7wdb 1/1 Running 0 13d 10.17.0.3 kube4 <none> <none>
etcd-kube4 1/1 Running 0 13d 192.168.1.214 kube4 <none> <none>
kube-apiserver-kube4 1/1 Running 2 13d 192.168.1.214 kube4 <none> <none>
kube-controller-manager-kube4 1/1 Running 2 13d 192.168.1.214 kube4 <none> <none>
kube-flannel-ds-arm-8g9fb 1/1 Running 1 13d 192.168.1.212 kube2 <none> <none>
kube-flannel-ds-arm-c5qt9 1/1 Running 0 13d 192.168.1.214 kube4 <none> <none>
kube-flannel-ds-arm-q5pln 1/1 Running 1 13d 192.168.1.211 kube1 <none> <none>
kube-flannel-ds-arm-tkmn6 1/1 Running 1 13d 192.168.1.213 kube3 <none> <none>
kube-proxy-4zjjh 1/1 Running 0 13d 192.168.1.213 kube3 <none> <none>
kube-proxy-6mk2z 1/1 Running 0 13d 192.168.1.211 kube1 <none> <none>
kube-proxy-bbr8v 1/1 Running 0 13d 192.168.1.212 kube2 <none> <none>
kube-proxy-wfsbm 1/1 Running 0 13d 192.168.1.214 kube4 <none> <none>
kube-scheduler-kube4 1/1 Running 3 13d 192.168.1.214 kube4 <none> <none>

最佳答案

Flannel documentation状态:

NOTE: If kubeadm is used, then pass --pod-network-cidr=10.244.0.0/16 to kubeadm init to ensure that the podCIDR is set.



这是因为默认情况下,法兰绒ConfigMap配置为可在 "Network": "10.244.0.0/16"上使用

您已经使用 --pod-network-cidr=10.17.0.0/16配置了kubeadm,现在需要在法兰绒ConfigMap kube-flannel-cfg中对其进行配置,如下所示:

kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.17.0.0/16",
"Backend": {
"Type": "vxlan"
}
}

感谢 @kitt的调试帮助。

关于networking - Prometheus Pod无法调用apiserver端点,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61168194/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com