
kubernetes - Analyzing a Kubernetes pod OOMKilled event


We had an OOMKilled event on one of our K8s pods. When such an event happens, we would like to run a native memory analysis command before the pod is evicted. Is it possible to add such a hook?

More specifically: we run the JVM with the -XX:NativeMemoryTracking=summary flag. We would like to run jcmd <pid> VM.native_memory summary.diff just before the pod is evicted, to see what is causing the OOM.
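For reference, summary.diff only reports changes relative to a previously recorded baseline, so the baseline step has to happen while the JVM is still healthy. A minimal sketch of that workflow (the app.jar name and the <pid> placeholder are illustrative only):

    # Start the JVM with native memory tracking enabled
    java -XX:NativeMemoryTracking=summary -jar app.jar

    # Record a baseline once the application has warmed up
    jcmd <pid> VM.native_memory baseline

    # Later, print native memory growth relative to that baseline
    jcmd <pid> VM.native_memory summary.diff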

Best Answer

It looks nearly impossible to handle this.

Based on an answer on Github about gracefully stopping on OOM Kill:

It is not possible to change OOM behavior currently. Kubernetes (or runtime) could provide your container a signal whenever your container is close to its memory limit. This will be on a best effort basis though because memory spikes might not be handled on time.

And here is a quote from the official documentation:

If the node experiences a system OOM (out of memory) event prior to the kubelet being able to reclaim memory, the node depends on the oom_killer to respond. The kubelet sets an oom_score_adj value for each container based on the quality of service for the Pod.
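If you want to inspect the value the kubelet assigned to your container, you can read it from /proc inside the container. The command below is a sketch that assumes the main process is PID 1 and uses placeholder pod and container names:

    kubectl exec <pod-name> -c <container-name> -- cat /proc/1/oom_score_adj

Guaranteed pods end up with a strongly negative value, while BestEffort pods get 1000, so the latter are killed first when the node runs out of memory.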

So, as you can see, you do not have many options to handle it. Here is a large article about OOM handling; I will quote only a small part of it, about out-of-memory handling in the memory controller:

Unfortunately, there may not be much else that this process can do to respond to an OOM situation. If it has locked its text into memory with mlock() or mlockall(), or it is already resident in memory, it is now aware that the memory controller is out of memory. It can't do much of anything else, though, because most operations of interest require the allocation of more memory.

The only thing I can offer is to gather data from cAdvisor (where you can get OOM Killer events) or from the Kubernetes API, and run your command when you see, from the metrics, that memory is about to run out. I am not sure you will have time to do anything after you receive an OOM Killer event, though.
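As a rough sketch of that idea, a wrapper script running alongside the JVM could poll the container's cgroup memory usage and dump the NMT diff before the limit is reached. The snippet below is illustrative only: it assumes cgroup v1 paths, that the JVM runs as PID 1 in the same container, and an arbitrary 90% threshold (on cgroup v2 the files are memory.max and memory.current instead), and there is no guarantee it fires before the OOM killer does:

    #!/bin/sh
    # Illustrative watcher loop; sudden memory spikes may still outrun it.
    LIMIT=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
    THRESHOLD=$((LIMIT * 90 / 100))

    # Record the NMT baseline once, so summary.diff has a point of comparison
    jcmd 1 VM.native_memory baseline

    while true; do
      USAGE=$(cat /sys/fs/cgroup/memory/memory.usage_in_bytes)
      if [ "$USAGE" -gt "$THRESHOLD" ]; then
        # Dump the native memory diff to stdout (the pod log) while we still can
        jcmd 1 VM.native_memory summary.diff
        sleep 60
      fi
      sleep 5
    done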

Regarding kubernetes - Analyzing a Kubernetes pod OOMKilled event, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49364568/
