The Linux cgroup-aware out-of-memory (OOM) killer accounts for RSS, kernel memory (kmem), and page cache when calculating memory usage for a cgroup. A process running in a cgroup cannot directly control its cache usage. Setting a memory limit for containers is good practice in Kubernetes, but even if your program never allocates more than the limit, the OOM killer can still kill your container when the total usage (RSS + cache) exceeds it.

To address this issue, I have created a simple tool called cgroup-memory-manager. You can run it as a standalone process on a Kubernetes worker node, or as a DaemonSet inside Kubernetes itself. The program periodically scans all child cgroups of a specified parent cgroup and analyzes their memory consumption. When a cgroup's cache usage exceeds the specified threshold, the tool triggers a forced page reclaim for that cgroup, at most once per specified time frame.

https://github.com/linchpiner/cgroup-memory-manager
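The core logic can be sketched in a few lines. The snippet below is a minimal illustration, not the tool's actual implementation: it assumes cgroup v1 paths (where cache usage is reported under the `cache` key in `memory.stat`, and writing to `memory.force_empty` triggers a forced page reclaim), and the function names and the 80% threshold are hypothetical choices for this example.

```python
def parse_cache_bytes(memory_stat: str) -> int:
    """Extract the page-cache size (in bytes) from the contents of
    a cgroup v1 memory.stat file, e.g. /sys/fs/cgroup/memory/<cgroup>/memory.stat."""
    for line in memory_stat.splitlines():
        key, _, value = line.partition(" ")
        if key == "cache":
            return int(value)
    return 0

def needs_reclaim(cache_bytes: int, limit_bytes: int, threshold: float = 0.8) -> bool:
    """Decide whether cache alone has grown past threshold * limit.
    The 0.8 default here is an arbitrary example value."""
    return cache_bytes > threshold * limit_bytes

# When needs_reclaim() is true, a periodic scanner would trigger the reclaim
# itself with a single write (cgroup v1):
#   echo 0 > /sys/fs/cgroup/memory/<cgroup>/memory.force_empty
```

On hosts running the unified cgroup v2 hierarchy, the equivalent knob is `memory.reclaim` (available in recent kernels), which accepts the number of bytes to attempt to reclaim instead of an all-or-nothing flush.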