
jenkins - What could cause a Kubernetes Jenkins agent (slave) pod to launch and then suspend

Reposted. Author: 行者123. Updated: 2023-12-02 12:00:11

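I am building projects with Jenkins on Kubernetes, but sometimes when Jenkins launches an agent pod, the agent shows as launching and then becomes suspended. When I check the agent's log output, I get a 404.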

HTTP ERROR 404 Not Found
URI: /computer/default-j07v7/log
STATUS: 404
MESSAGE: Not Found
SERVLET: Stapler
Powered by Jetty:// 9.4.27.v20200227

The error looks like this:

Image1

The pod hangs and restarts, over and over. The pod creation events look normal:

Normal  Scheduled               default-scheduler   Successfully assigned infrastructure/default-v7m44 to k8sslave3
Normal Pulled 1 2020-08-16T08:29:36Z 2020-08-16T08:29:36Z kubelet Container image "jenkins/jnlp-slave:3.27-1" already present on machine
Normal Created 1 2020-08-16T08:29:36Z 2020-08-16T08:29:36Z kubelet Created container jnlp
Normal Started 1 2020-08-16T08:29:36Z 2020-08-16T08:29:36Z kubelet Started container jnlp

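What should I do to fix this? After several days of trying, I found that if I change any parameter of the pod template, the agent immediately goes into the suspended state. If I keep the defaults, the agent starts normally. This is a weird problem and it has me confused. Here is my Jenkins master deployment YAML: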

kind: Deployment
apiVersion: apps/v1
metadata:
  name: jenkins
  namespace: infrastructure
  selfLink: /apis/apps/v1/namespaces/infrastructure/deployments/jenkins
  uid: 3df24fd6-ffaf-4f17-8b02-a2904cabbf95
  resourceVersion: '1707498'
  generation: 38
  creationTimestamp: '2020-07-18T14:48:47Z'
  labels:
    app.kubernetes.io/component: jenkins-master
    app.kubernetes.io/instance: jenkins
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jenkins
    helm.sh/chart: jenkins-2.4.1
  annotations:
    deployment.kubernetes.io/revision: '10'
    meta.helm.sh/release-name: jenkins
    meta.helm.sh/release-namespace: infrastructure
  managedFields:
    - manager: Go-http-client
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-02T10:08:04Z'
      fieldsType: FieldsV1

    - manager: dashboard
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-17T14:27:59Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          'f:template':
            'f:spec':
              'f:containers':
                'k:{"name":"jenkins"}':
                  'f:volumeMounts':
                    'k:{"mountPath":"/usr/bin/docker"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                    'k:{"mountPath":"/var/run/docker.sock"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
              'f:securityContext':
                'f:runAsUser': {}
              'f:volumes':
                'k:{"name":"docker"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
                'k:{"name":"dockersock"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-18T16:14:00Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:deployment.kubernetes.io/revision': {}
        'f:status':
          'f:availableReplicas': {}
          'f:conditions':
            .: {}
            'k:{"type":"Available"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
            'k:{"type":"Progressing"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
          'f:observedGeneration': {}
          'f:readyReplicas': {}
          'f:replicas': {}
          'f:updatedReplicas': {}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: jenkins-master
      app.kubernetes.io/instance: jenkins
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: jenkins-master
        app.kubernetes.io/instance: jenkins
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: jenkins
        helm.sh/chart: jenkins-2.4.1
      annotations:
        checksum/config: 60990c68bb90ec59c79d56498da29d250d8da13cfbb9c35cad53f0cd789f318b
    spec:
      volumes:
        - name: plugins
          emptyDir: {}
        - name: tmp
          emptyDir: {}
        - name: jenkins-config
          configMap:
            name: jenkins
            defaultMode: 420
        - name: secrets-dir
          emptyDir: {}
        - name: plugin-dir
          emptyDir: {}
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins
        - name: sc-config-volume
          emptyDir: {}
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
            type: ''
        - name: docker
          hostPath:
            path: /usr/bin/docker
            type: ''
      initContainers:
        - name: copy-default-config
          image: 'jenkins/jenkins:lts'
          command:
            - sh
            - /var/jenkins_config/apply_config.sh
          env:
            - name: ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-password
            - name: ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-user
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: 50m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: jenkins-config
              mountPath: /var/jenkins_config
            - name: secrets-dir
              mountPath: /usr/share/jenkins/ref/secrets/
            - name: plugins
              mountPath: /usr/share/jenkins/ref/plugins
            - name: plugin-dir
              mountPath: /var/jenkins_plugins
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
      containers:
        - name: jenkins
          image: 'jenkins/jenkins:lts'
          args:
            - '--argumentsRealm.passwd.$(ADMIN_USER)=$(ADMIN_PASSWORD)'
            - '--argumentsRealm.roles.$(ADMIN_USER)=admin'
            - '--httpPort=8080'
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: slavelistener
              containerPort: 50000
              protocol: TCP
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: JAVA_OPTS
              value: |

                -Dcasc.reload.token=$(POD_NAME)
            - name: JENKINS_OPTS
            - name: JENKINS_SLAVE_AGENT_PORT
              value: '50000'
            - name: ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-password
            - name: ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-user
            - name: CASC_JENKINS_CONFIG
              value: /var/jenkins_home/casc_configs
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: 50m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: jenkins-config
              readOnly: true
              mountPath: /var/jenkins_config
            - name: secrets-dir
              mountPath: /usr/share/jenkins/ref/secrets/
            - name: plugin-dir
              mountPath: /usr/share/jenkins/ref/plugins/
            - name: sc-config-volume
              mountPath: /var/jenkins_home/casc_configs
            - name: dockersock
              mountPath: /var/run/docker.sock
            - name: docker
              mountPath: /usr/bin/docker
          livenessProbe:
            httpGet:
              path: /login
              port: http
              scheme: HTTP
            initialDelaySeconds: 90
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /login
              port: http
              scheme: HTTP
            initialDelaySeconds: 60
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
        - name: jenkins-sc-config
          image: 'kiwigrid/k8s-sidecar:0.1.144'
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: LABEL
              value: jenkins-jenkins-config
            - name: FOLDER
              value: /var/jenkins_home/casc_configs
            - name: NAMESPACE
              value: infrastructure
            - name: REQ_URL
              value: >-
                http://localhost:8080/reload-configuration-as-code/?casc-reload-token=$(POD_NAME)
            - name: REQ_METHOD
              value: POST
            - name: REQ_RETRY_CONNECT
              value: '10'
          resources: {}
          volumeMounts:
            - name: sc-config-volume
              mountPath: /var/jenkins_home/casc_configs
            - name: jenkins-home
              mountPath: /var/jenkins_home
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: jenkins
      serviceAccount: jenkins
      securityContext:
        runAsUser: 0
        fsGroup: 976
      schedulerName: default-scheduler
  strategy:
    type: Recreate
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 38
  replicas: 1
  updatedReplicas: 1
  readyReplicas: 1
  availableReplicas: 1
  conditions:
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2020-08-17T14:45:20Z'
      lastTransitionTime: '2020-08-17T14:45:20Z'
      reason: NewReplicaSetAvailable
      message: ReplicaSet "jenkins-7454db64f6" has successfully progressed.
    - type: Available
      status: 'True'
      lastUpdateTime: '2020-08-18T16:14:00Z'
      lastTransitionTime: '2020-08-18T16:14:00Z'
      reason: MinimumReplicasAvailable
      message: Deployment has minimum availability.

Here is part of the log output from the master pod:

2020-08-21 16:44:40.381+0000 [id=955]   WARNING i.f.k.c.d.i.WatchConnectionManager$1#onFailure: Exec Failure
java.util.concurrent.RejectedExecutionException: Task okhttp3.RealCall$AsyncCall@2fb3e877 rejected from java.util.concurrent.ThreadPoolExecutor@9ce8b47[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 18]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:183)
Caused: java.io.InterruptedIOException: executor rejected
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:186)
at okhttp3.Dispatcher.promoteAndExecute(Dispatcher.java:186)
at okhttp3.Dispatcher.enqueue(Dispatcher.java:137)
at okhttp3.RealCall.enqueue(RealCall.java:127)
at okhttp3.internal.ws.RealWebSocket.connect(RealWebSocket.java:193)
at okhttp3.OkHttpClient.newWebSocket(OkHttpClient.java:435)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.runWatch(WatchConnectionManager.java:158)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$1200(WatchConnectionManager.java:50)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2$1.execute(WatchConnectionManager.java:321)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$NamedRunnable.run(WatchConnectionManager.java:410)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:44:45.239+0000 [id=33] INFO hudson.slaves.NodeProvisioner#lambda$update$6: default-3393d provisioning successfully completed. We have now 3 computer(s)
2020-08-21 16:44:45.241+0000 [id=2765] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/default-3393d
2020-08-21 16:44:45.302+0000 [id=2826] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:44:45.350+0000 [id=2765] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:44:55.363+0000 [id=2765] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-3393d, template=PodTemplate{inheritFrom='', name='default', namespace='', hostNetwork=false, activeDeadlineSeconds=10, label='jenkins-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins', command='/bin/sh -c', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceLimitCpu='512m', resourceLimitMemory='512Mi', envVars=[ContainerEnvVar [getValue()=http://jenkins.infrastructure.svc.cluster.local:8080, getKey()=JENKINS_URL]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@5187faf3}]}
java.lang.IllegalStateException: Pod has terminated containers: infrastructure/default-3393d (jnlp)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:133)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:154)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:94)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:140)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:296)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:44:55.363+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-3393d
Terminated Kubernetes instance for agent infrastructure/default-3393d
Disconnected computer default-3393d
2020-08-21 16:44:55.383+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent infrastructure/default-3393d
2020-08-21 16:44:55.383+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer default-3393d
2020-08-21 16:45:05.198+0000 [id=42] INFO o.c.j.p.k.KubernetesCloud#provision: Excess workload after pending Kubernetes agents: 1
2020-08-21 16:45:05.198+0000 [id=42] INFO o.c.j.p.k.KubernetesCloud#provision: Template for label null: default
2020-08-21 16:45:12.383+0000 [id=955] WARNING i.f.k.c.d.i.WatchConnectionManager$1#onFailure: Exec Failure
java.util.concurrent.RejectedExecutionException: Task okhttp3.RealCall$AsyncCall@6c6c7a45 rejected from java.util.concurrent.ThreadPoolExecutor@9ce8b47[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 18]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:183)
Caused: java.io.InterruptedIOException: executor rejected
at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:186)
at okhttp3.Dispatcher.promoteAndExecute(Dispatcher.java:186)
at okhttp3.Dispatcher.enqueue(Dispatcher.java:137)
at okhttp3.RealCall.enqueue(RealCall.java:127)
at okhttp3.internal.ws.RealWebSocket.connect(RealWebSocket.java:193)
at okhttp3.OkHttpClient.newWebSocket(OkHttpClient.java:435)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.runWatch(WatchConnectionManager.java:158)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$1200(WatchConnectionManager.java:50)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2$1.execute(WatchConnectionManager.java:321)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$NamedRunnable.run(WatchConnectionManager.java:410)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:45:15.236+0000 [id=2765] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/default-03q6x
2020-08-21 16:45:15.252+0000 [id=36] INFO hudson.slaves.NodeProvisioner#lambda$update$6: default-03q6x provisioning successfully completed. We have now 3 computer(s)
2020-08-21 16:45:15.314+0000 [id=2824] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:45:15.381+0000 [id=2765] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:45:25.390+0000 [id=2765] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-03q6x, template=PodTemplate{inheritFrom='', name='default', namespace='', hostNetwork=false, activeDeadlineSeconds=10, label='jenkins-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins', command='/bin/sh -c', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceLimitCpu='512m', resourceLimitMemory='512Mi', envVars=[ContainerEnvVar [getValue()=http://jenkins.infrastructure.svc.cluster.local:8080, getKey()=JENKINS_URL]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@5187faf3}]}
java.lang.IllegalStateException: Pod has terminated containers: infrastructure/default-03q6x (jnlp)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:133)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:154)
at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:94)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:140)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:296)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:45:25.391+0000 [id=2765] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-03q6x
Terminated Kubernetes instance for agent infrastructure/default-03q6x
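
One pattern in the excerpt above can be checked with a few lines of Python: how long each agent pod lived between "Created Pod" and "Error in provisioning". The sketch below only does arithmetic on timestamps copied from the log; the names ATTEMPTS, parse and seconds_until_failure are made up for illustration. Both attempts fail roughly 10 seconds after creation, and the template dump in the same log lines shows activeDeadlineSeconds=10. Whether that setting is the actual cause is an assumption that the log alone does not confirm, but it may be worth comparing against the pod template.

```python
from datetime import datetime

# Timestamps copied from the log excerpt: pod name -> (pod created, provisioning error).
ATTEMPTS = {
    "default-3393d": ("2020-08-21 16:44:45.241+0000", "2020-08-21 16:44:55.363+0000"),
    "default-03q6x": ("2020-08-21 16:45:15.236+0000", "2020-08-21 16:45:25.390+0000"),
}


def parse(ts: str) -> datetime:
    """Parse a Jenkins log timestamp such as '2020-08-21 16:44:45.241+0000'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f%z")


def seconds_until_failure(created: str, failed: str) -> float:
    """Seconds elapsed between pod creation and the provisioning error."""
    return (parse(failed) - parse(created)).total_seconds()


if __name__ == "__main__":
    for pod, (created, failed) in ATTEMPTS.items():
        print(f"{pod}: failed {seconds_until_failure(created, failed):.3f}s after creation")
```
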

Here is a snapshot of my Kubernetes cloud configuration:

enter image description here

This is the pod template configuration:

enter image description here

Best answer

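I would suggest making a few changes:

  1. Leave the Jenkins tunnel field blank. Jenkins will pick it up automatically.

  2. If this Jenkins instance is deployed inside the Kubernetes cluster, use an internal address for the Jenkins URL, for example http://jenkins.infrastructure.svc. I am assuming your Jenkins service is named jenkins and is of type ClusterIP.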
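
As a small illustration of the second point, the sketch below composes the fully qualified in-cluster address of a Service from its name and namespace. The helper name in_cluster_url is hypothetical. The service name jenkins, the namespace infrastructure and port 8080 come from the deployment and log above, and cluster.local is assumed to be the cluster domain (the Kubernetes default).

```python
def in_cluster_url(service: str, namespace: str, port: int = 8080,
                   cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster HTTP URL of a Kubernetes Service.

    Follows the standard Service DNS pattern
    <service>.<namespace>.svc.<cluster-domain>.
    """
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}"


if __name__ == "__main__":
    # Assumed values: a Service named "jenkins" in the "infrastructure" namespace.
    print(in_cluster_url("jenkins", "infrastructure"))
```

With those assumed values the result matches the JENKINS_URL that appears in the pod template dump in the log, http://jenkins.infrastructure.svc.cluster.local:8080.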

Regarding "jenkins - What could cause a Kubernetes Jenkins agent (slave) pod to launch and then suspend", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63434779/
