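I noticed that a system pod in our GKE cluster (gke-metrics-agent) is running out of memory. I tried editing the DaemonSet YAML to raise both the memory request and the memory limit to 200Mi, but the change does not stick: the DaemonSet is recreated with the default value of 50Mi, as before.
Please help me increase the memory resources of gke-metrics-agent.
Best answer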
In general, CrashLoopBackOff means that a container keeps crashing after it is restarted. You can follow the documentation to troubleshoot CrashLoopBackOff.
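Before changing anything, it can help to confirm that the restarts really come from the memory limit. The sketch below wraps a `kubectl` query that prints each pod's last termination reason. The function name `last_termination_reasons` is made up for illustration, and the namespace `kube-system` and label `k8s-app=gke-metrics-agent` in the example call are assumptions (the label matches the manifest later in this answer), so adjust them to your cluster.

```shell
#!/usr/bin/env bash
# Print "<pod name><TAB><last termination reason>" for every matching pod.
# Assumption: the caller passes the namespace and the label selector.
last_termination_reasons() {
  local namespace="$1" selector="$2"
  kubectl get pods -n "$namespace" -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}

# Example call (needs access to the cluster):
#   last_termination_reasons kube-system k8s-app=gke-metrics-agent
```

A reason of `OOMKilled` indicates that the container was terminated for exceeding its memory limit.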
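A possible workaround for the OOM kills is to raise the memory limit of the gke-metrics-agent pods. This can be done by disabling GKE monitoring and deploying gke-metrics-agent to the cluster yourself from a custom metrics agent manifest. That lets you adjust the agent's memory resources so that it stops being killed.
To do this, follow these steps:
# Set the variables for your cluster.
CLUSTER=<cluster_name>
PROJECT=<project>
LOCATION=<location>
# Disable the managed GKE monitoring (logging stays enabled).
gcloud container clusters update $CLUSTER --zone=$LOCATION --project=$PROJECT --monitoring-service=none --logging-service=logging.googleapis.com/kubernetes
# Save the manifest below as metrics-agent.yaml, then fill in its placeholders and apply it.
sed -u -e's/{{.ClusterName}}/'${CLUSTER}'/g' -e's/{{.Location}}/'${LOCATION}'/g' metrics-agent.yaml | kubectl apply -f -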
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: gke-metrics-agent-conf
  namespace: default
data:
  gke-metrics-agent-config: |
    receivers:
      prometheus:
        use_start_time_metric: true
        config:
          scrape_configs:
          - job_name: "kubelet"
            scrape_interval: 60s
            static_configs:
            - targets: ["$KUBELET_HOST:10255"]
            metric_relabel_configs:
            - source_labels: [ __name__ ]
              target_label: gke_component_name
              replacement: "nodes/kubelet"
          - job_name: "kubelet-prober"
            scrape_interval: 60s
            static_configs:
            - targets: ["$KUBELET_HOST:10255"]
            metrics_path: /metrics/probes
            metric_relabel_configs:
            - source_labels: [__name__]
              regex: "prober_probe_total|process_start_time_seconds"
              action: keep
            - source_labels: [ __name__ ]
              target_label: gke_component_name
              replacement: "nodes/kubelet"
          - job_name: "addons"
            scrape_interval: 60s
            kubernetes_sd_configs:
            - role: pod
              namespaces:
                names:
                - kube-system
              selectors:
              - role: pod
                field: "spec.nodeName=$NODE_NAME"
            relabel_configs:
            - source_labels: [ __meta_kubernetes_pod_container_port_name ]
              regex: ".*metrics"
              action: keep
            - source_labels: [ __meta_kubernetes_pod_annotationpresent_components_gke_io_component_name ]
              regex: true
              action: keep
            - source_labels: [ __meta_kubernetes_pod_annotationpresent_monitoring_gke_io_path, __meta_kubernetes_pod_annotation_monitoring_gke_io_path ]
              regex: "true;(.*)"
              target_label: __metrics_path__
            - source_labels: [ __meta_kubernetes_pod_name ]
              target_label: pod
            - source_labels: [ __meta_kubernetes_pod_container_name ]
              target_label: container
            - source_labels: [ __meta_kubernetes_namespace ]
              target_label: namespace
            - source_labels: [ __meta_kubernetes_pod_annotation_components_gke_io_component_name ]
              target_label: gke_component_name
              replacement: "addons/${ARG1}"
            - source_labels: [ gke_component_name ]
              target_label: gke_component_name
              regex: "(.*)-(.*)"
              replacement: "${ARG1}_${ARG2}"
            - source_labels: [ gke_component_name ]
              target_label: gke_component_name
              regex: "(.*)-(.*)"
              replacement: "${ARG1}_${ARG2}"
          - job_name: "coredns"
            scrape_interval: 60s
            static_configs:
            - targets: ["$KUBELET_HOST:9253"]
            metric_relabel_configs:
            - source_labels: [ __name__ ]
              target_label: gke_component_name
              replacement: "nodes/coredns"
          - job_name: "coredns-nodecache"
            scrape_interval: 60s
            static_configs:
            - targets: ["$KUBELET_HOST:9353"]
            metric_relabel_configs:
            - source_labels: [ __name__ ]
              target_label: gke_component_name
              replacement: "nodes/coredns"
          - job_name: "node"
            scrape_interval: 60s
            static_configs:
            - targets: ["$KUBELET_HOST:10231"]
            metric_relabel_configs:
            - source_labels: [ __name__ ]
              target_label: gke_component_name
              replacement: "net/cluster/node"
      kubenode:
        endpoint: "http://$KUBELET_HOST:10255"
        scrape_interval: 60s
        cluster_name: {{.ClusterName}}
        location: {{.Location}}
        node_name: "$NODE_NAME"
        kubernetes_service_host: "$KUBERNETES_SERVICE_HOST"
    exporters:
      stackdriver:
        endpoint: monitoring.googleapis.com:443
        skip_create_metric_descriptor: true
    processors:
      resource:
        type: "host"
        labels:
          cloud.zone: {{.Location}}
          host.name: "$NODE_NAME"
          k8s.cluster.name: {{.ClusterName}}
      metrics_export:
        common_prefix: "kubernetes.io/internal"
        detect_container_metrics: true
        allowed_labels:
        - "project"
        - "location"
        - "cluster_name"
        - "node_name"
        - "namespace"
        - "pod"
        - "container"
        export_map:
        - "kubernetes.io/internal/nodes/kubelet/process_start_time_seconds":
            drop: true
        - "kubernetes.io/internal/nodes/kubelet/kubelet_docker_operations_total":
            allowed_labels:
            - "operation_type"
            export_name: "kubernetes.io/internal/nodes/kubelet/docker_operations_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/kubelet_docker_operations_errors_total":
            allowed_labels:
            - "operation_type"
            export_name: "kubernetes.io/internal/nodes/kubelet/docker_operations_errors_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/kubelet_runtime_operations_total":
            allowed_labels:
            - "operation_type"
            export_name: "kubernetes.io/internal/nodes/kubelet/runtime_operations_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/kubelet_runtime_operations_errors_total":
            allowed_labels:
            - "operation_type"
            export_name: "kubernetes.io/internal/nodes/kubelet/runtime_operations_errors_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/rest_client_requests_total":
            allowed_labels:
            - "code"
            - "method"
            - "host"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/storage_operation_duration_seconds":
            allowed_labels:
            - "volume_plugin"
            - "operation_name"
        - "kubernetes.io/internal/nodes/kubelet/kubelet_network_plugin_operations_duration_seconds":
            allowed_labels:
            - "operation_type"
            export_name: "kubernetes.io/internal/nodes/kubelet/network_plugin_operations_duration_seconds"
        - "kubernetes.io/internal/nodes/kubelet/storage_operation_errors_total":
            allowed_labels:
            - "volume_plugin"
            - "operation_name"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/storage_operation_status_count":
            allowed_labels:
            - "volume_plugin"
            - "operation_name"
            - "status"
            export_as_int: true
        - "kubernetes.io/internal/nodes/kubelet/prober_probe_total":
            allowed_labels:
            - "container"
            - "namespace"
            - "pod"
            - "pod_uid"
            - "result"
            - "probe_type"
            export_as_int: true
            is_container_metric: true
        - "kubernetes.io/internal/nodes/coredns/process_start_time_seconds":
            drop: true
        - "kubernetes.io/internal/nodes/coredns/coredns_cache_drops_total":
            allowed_labels:
            - "server"
            export_name: "kubernetes.io/internal/nodes/coredns/cache_drops_total"
        - "kubernetes.io/internal/nodes/coredns/coredns_cache_hits_total":
            allowed_labels:
            - "server"
            - "type"
            export_name: "kubernetes.io/internal/nodes/coredns/cache_hits_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_cache_misses_total":
            allowed_labels:
            - "server"
            export_name: "kubernetes.io/internal/nodes/coredns/cache_misses_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_cache_prefetch_total":
            allowed_labels:
            - "server"
            export_name: "kubernetes.io/internal/nodes/coredns/cache_prefetch_total"
        - "kubernetes.io/internal/nodes/coredns/coredns_cache_size":
            allowed_labels:
            - "server"
            - "type"
            export_name: "kubernetes.io/internal/nodes/coredns/cache_size"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_dns_request_count_total":
            allowed_labels:
            - "family"
            - "proto"
            - "server"
            - "zone"
            export_name: "kubernetes.io/internal/nodes/coredns/dns_request_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_dns_request_duration_seconds":
            allowed_labels:
            - "server"
            - "zone"
            export_name: "kubernetes.io/internal/nodes/coredns/dns_request_duration_seconds"
        - "kubernetes.io/internal/nodes/coredns/coredns_dns_request_type_count_total":
            allowed_labels:
            - "server"
            - "type"
            - "zone"
            export_name: "kubernetes.io/internal/nodes/coredns/dns_request_type_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_dns_response_rcode_count_total":
            allowed_labels:
            - "rcode"
            - "server"
            - "zone"
            export_name: "kubernetes.io/internal/nodes/coredns/dns_response_rcode_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_forward_healthcheck_failure_count_total":
            allowed_labels:
            - "to"
            export_name: "kubernetes.io/internal/nodes/coredns/forward_healthcheck_failure_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_forward_request_count_total":
            allowed_labels:
            - "to"
            export_name: "kubernetes.io/internal/nodes/coredns/forward_request_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_forward_request_duration_seconds":
            allowed_labels:
            - "to"
            export_name: "kubernetes.io/internal/nodes/coredns/forward_request_duration_seconds"
        - "kubernetes.io/internal/nodes/coredns/coredns_forward_response_rcode_count_total":
            allowed_labels:
            - "rcode"
            - "to"
            export_name: "kubernetes.io/internal/nodes/coredns/forward_response_rcode_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_forward_sockets_open":
            allowed_labels:
            - "to"
            export_name: "kubernetes.io/internal/nodes/coredns/forward_sockets_open"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_health_request_duration_seconds":
            allowed_labels: []
            export_name: "kubernetes.io/internal/nodes/coredns/health_request_duration_seconds"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/coredns_panic_count_total":
            allowed_labels: []
            export_name: "kubernetes.io/internal/nodes/coredns/dns_panic_count_total"
            export_as_int: true
        - "kubernetes.io/internal/nodes/coredns/nodecache_setup_errors_total":
            allowed_labels:
            - "errortype"
            export_name: "kubernetes.io/internal/nodes/coredns/nodecache_setup_errors_total"
        - "kubernetes.io/internal/net/cluster/node/process_start_time_seconds":
            drop: true
        - "kubernetes.io/internal/net/cluster/node/conntrack_entries":
            allowed_labels: []
            export_as_int: true
        - "kubernetes.io/internal/net/cluster/node/conntrack_error_count":
            allowed_labels:
            - "type"
            export_as_int: true
        - "kubernetes.io/internal/net/cluster/node/num_inuse_sockets":
            allowed_labels:
            - "protocol"
            export_as_int: true
        - "kubernetes.io/internal/net/cluster/node/num_tw_sockets":
            allowed_labels: []
            export_as_int: true
        - "kubernetes.io/internal/net/cluster/node/socket_memory":
            allowed_labels: []
            export_as_int: true
        - "kubernetes.io/internal/addons/kubedns/process_start_time_seconds":
            drop: true
        - "kubernetes.io/internal/addons/kubedns/skydns_skydns_dns_request_count_total":
            allowed_labels:
            - "system"
            export_name: "kubernetes.io/internal/addons/kubedns/skydns_dns_request_count_total"
            export_as_int: true
        - "kubernetes.io/internal/addons/kubedns/skydns_skydns_dns_request_duration_seconds":
            allowed_labels:
            - "system"
            export_name: "kubernetes.io/internal/addons/kubedns/skydns_dns_request_duration_seconds"
        - "kubernetes.io/internal/addons/kubedns/skydns_skydns_dns_response_size_bytes":
            allowed_labels:
            - "system"
            export_name: "kubernetes.io/internal/addons/kubedns/skydns_dns_response_size_bytes"
        - "kubernetes.io/internal/addons/kubedns/skydns_skydns_dns_error_count_total":
            allowed_labels:
            - "system"
            - "cause"
            export_name: "kubernetes.io/internal/addons/kubedns/skydns_dns_error_count_total"
            export_as_int: true
        - "kubernetes.io/internal/addons/kubedns/skydns_skydns_dns_cachemiss_count_total":
            allowed_labels:
            - "cache"
            export_name: "kubernetes.io/internal/addons/kubedns/skydns_dns_cachemiss_count_total"
            export_as_int: true
    extensions:
      observability:
        endpoint: monitoring.googleapis.com:443
        prefix: "kubernetes.io/internal/addons/gke_otelsvc"
        resource:
          type: "k8s_container"
          labels:
            location: {{.Location}}
            cluster_name: {{.ClusterName}}
            pod_name: "$POD_NAME"
            namespace_name: "$POD_NAMESPACE"
            container_name: "gke-metrics-agent"
    service:
      extensions:
      - observability
      pipelines:
        metrics/kube:
          receivers:
          - kubenode
          exporters:
          - stackdriver
        metrics/prom:
          receivers:
          - prometheus
          processors:
          - resource
          - metrics_export
          exporters:
          - stackdriver
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gke-metrics-agent
  namespace: default
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
    kubernetes.io/description: Policy used by the gke-metrics-agent addon.
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default,docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
  name: gce.gke-metrics-agent
  labels:
    kubernetes.io/cluster-service: 'true'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  volumes:
  - 'hostPath'
  - 'secret'
  - 'configMap'
  allowedHostPaths:
  - pathPrefix: /etc/ssl/certs
  hostNetwork: true
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: false
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: gke-metrics-agent
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
  - watch
- apiGroups:
  - policy
  resourceNames:
  - gce.gke-metrics-agent
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: gke-metrics-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gke-metrics-agent
subjects:
- kind: ServiceAccount
  name: gke-metrics-agent
  namespace: default
---
# linux deployment
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gke-metrics-agent
  namespace: default
  labels:
    k8s-app: gke-metrics-agent
    component: gke-metrics-agent
spec:
  selector:
    matchLabels:
      k8s-app: gke-metrics-agent
      component: gke-metrics-agent
  template:
    metadata:
      labels:
        k8s-app: gke-metrics-agent
        component: gke-metrics-agent
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      hostNetwork: true
      serviceAccount: gke-metrics-agent
      containers:
      - name: gke-metrics-agent
        image: "gcr.io/gke-release/gke-metrics-agent:0.1.3-gke.0"
        resources:
          requests:
            memory: 50Mi
            cpu: 3m
          limits:
            memory: 70Mi
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KUBELET_HOST
          value: "127.0.0.1"
        - name: ARG1
          value: "${1}"
        - name: ARG2
          value: "${2}"
        - name: WINDOWS_JOB_ACTION
          value: "drop"
        command:
        - "/otelsvc"
        - "--config=/conf/gke-metrics-agent-config.yaml"
        - "--metrics-level=NONE"
        volumeMounts:
        - name: gke-metrics-agent-config-vol
          mountPath: /conf
        - name: ssl-certs
          mountPath: /etc/ssl/certs
          readOnly: true
      volumes:
      - configMap:
          name: gke-metrics-agent-conf
          items:
          - key: gke-metrics-agent-config
            path: gke-metrics-agent-config.yaml
        name: gke-metrics-agent-config-vol
      - name: ssl-certs
        hostPath:
          path: /etc/ssl/certs
---
# windows deployment
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gke-metrics-agent-windows
  namespace: default
  labels:
    k8s-app: gke-metrics-agent
    component: gke-metrics-agent
spec:
  selector:
    matchLabels:
      k8s-app: gke-metrics-agent
      component: gke-metrics-agent
  template:
    metadata:
      labels:
        k8s-app: gke-metrics-agent
        component: gke-metrics-agent
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 300
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 300
      - effect: NoSchedule
        key: node.kubernetes.io/os
        operator: Equal
        value: windows
      serviceAccount: gke-metrics-agent
      containers:
      - name: gke-metrics-agent
        image: "gke.io/gke-release/gke-metrics-agent-windows:0.3.1-gke.2"
        resources:
          requests:
            cpu: 5m
            memory: 200Mi
          limits:
            memory: 200Mi
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KUBELET_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: KUBERNETES_SERVICE_HOST
          value: "kubernetes.default.svc.cluster.local"
        - name: ARG1
          value: "${1}"
        - name: ARG2
          value: "${2}"
        - name: WINDOWS_JOB_ACTION
          value: "keep"
        command:
        - "c:\\otelsvc.exe"
        - "--config=/conf/gke-metrics-agent-config.yaml"
        - "--metrics-level=NONE"
        volumeMounts:
        - name: gke-metrics-agent-config-vol
          mountPath: /conf
      volumes:
      - configMap:
          name: gke-metrics-agent-conf
          items:
          - key: gke-metrics-agent-config
            path: gke-metrics-agent-config.yaml
        name: gke-metrics-agent-config-vol
Note: you can edit the memory limit of the Linux deployment as needed.
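One way to make that edit without changing the file by hand is to rewrite the two memory values with `sed` while filling in the placeholders. This is only a sketch: the function name `render_manifest` is made up for illustration, and it assumes that `memory: 50Mi` and `memory: 70Mi` occur only in the Linux DaemonSet, as they do in the manifest above.

```shell
#!/usr/bin/env bash
# render_manifest FILE CLUSTER LOCATION REQUEST LIMIT
# Writes the manifest to stdout with the Linux memory request/limit replaced
# and the {{.ClusterName}} / {{.Location}} placeholders filled in.
# Assumption: "memory: 50Mi" and "memory: 70Mi" occur only in the Linux DaemonSet.
render_manifest() {
  local file="$1" cluster="$2" location="$3" request="$4" limit="$5"
  # The bare "t" ends processing of a line as soon as a substitution matched,
  # so a freshly written request value is never rewritten by the limit rule.
  sed -e "s/memory: 50Mi/memory: ${request}/" -e t \
      -e "s/memory: 70Mi/memory: ${limit}/" -e t \
      -e "s/{{.ClusterName}}/${cluster}/g" \
      -e "s/{{.Location}}/${location}/g" \
      "$file"
}

# Example: preview the result first, then pipe it to kubectl once it looks right.
#   render_manifest metrics-agent.yaml "$CLUSTER" "$LOCATION" 200Mi 200Mi | less
#   render_manifest metrics-agent.yaml "$CLUSTER" "$LOCATION" 200Mi 200Mi | kubectl apply -f -
```

Writing to stdout keeps metrics-agent.yaml unchanged, so the same file can still be used with the original `sed` command.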
To revert, delete the custom resources:
sed -u -e's/{{.ClusterName}}/'${CLUSTER}'/g' -e's/{{.Location}}/'${LOCATION}'/g' metrics-agent.yaml | kubectl delete -f -
or
kubectl delete ds gke-metrics-agent
kubectl delete ds gke-metrics-agent-windows
kubectl delete cm gke-metrics-agent-conf
kubectl delete sa gke-metrics-agent
Then re-enable GKE monitoring:
gcloud container clusters update $CLUSTER --zone=$LOCATION --project=$PROJECT --monitoring-service=monitoring.googleapis.com/kubernetes --logging-service=logging.googleapis.com/kubernetes
This answer comes from a similar question on Stack Overflow about google-kubernetes-engine, "GKE system pod gke-metrics-agent OOMKilled": https://stackoverflow.com/questions/67668808/