• Aleksa Sarai's avatar
    cgroup: implement the PIDs subsystem · 49b786ea
    Aleksa Sarai authored
    Adds a new single-purpose PIDs subsystem to limit the number of
    tasks that can be forked inside a cgroup. Essentially this is an
    implementation of RLIMIT_NPROC that applies to a cgroup rather than a
    process tree.
    However, it should be noted that organisational operations (adding and
    removing tasks from a PIDs hierarchy) will *not* be prevented. Rather,
    the number of tasks in the hierarchy cannot exceed the limit through
    forking. This is due to the fact that, in the unified hierarchy, attach
    cannot fail (and it is not possible for a task to overcome its PIDs
    cgroup policy limit by attaching to a child cgroup -- even if migrating
    mid-fork it must be able to fork in the parent first).
    PIDs are fundamentally a global resource, and it is possible to reach
    PID exhaustion inside a cgroup without hitting any reasonable kmemcg
    policy. Once you've hit PID exhaustion, you're only in a marginally
    better state than OOM. This subsystem allows PID exhaustion inside a
    cgroup to be prevented.
    Signed-off-by: default avatarAleksa Sarai <cyphar@cyphar.com>
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>