======================
No New Privileges Flag
======================

The execve system call can grant a newly-started program privileges that
its parent did not have.  The most obvious examples are setuid/setgid
programs and file capabilities.  To prevent the parent program from
gaining these privileges as well, the kernel and user code must be
careful to prevent the parent from doing anything that could subvert the
child.  For example:

 - The dynamic loader handles ``LD_*`` environment variables differently if
   a program is setuid.

 - ``chroot`` is disallowed for unprivileged processes, since it would allow
   ``/etc/passwd`` to be replaced from the point of view of a process that
   inherited chroot.

 - The exec code has special handling for ptrace.

These are all ad-hoc fixes.  The ``no_new_privs`` bit (since Linux 3.5) is a
new, generic mechanism to make it safe for a process to modify its
execution environment in a manner that persists across execve.  Any task
can set ``no_new_privs``.  Once the bit is set, it is inherited across fork,
clone, and execve and cannot be unset.  With ``no_new_privs`` set, ``execve()``
promises not to grant the privilege to do anything that could not have
been done without the execve call.  For example, the setuid and setgid
bits will no longer change the uid or gid; file capabilities will not
add to the permitted set, and LSMs will not relax constraints after
execve.

To set ``no_new_privs``, use::

    prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);

Be careful, though: LSMs might also not tighten constraints on exec
in ``no_new_privs`` mode.  (This means that setting up a general-purpose
service launcher to set ``no_new_privs`` before execing daemons may
interfere with LSM-based sandboxing.)
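
As a minimal sketch of the behavior described above, the following C helpers
set the bit, read it back with ``PR_GET_NO_NEW_PRIVS``, attempt to clear it,
and check that a forked child inherits it.  The helper names (``nnp_set``,
``nnp_get``, ``nnp_try_clear``, ``nnp_get_in_child``) are hypothetical and
chosen only for illustration; only the ``prctl`` options come from the kernel
interface.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if no_new_privs is set for the caller,
 * 0 if it is not, or -1 on error. */
int nnp_get(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/* Hypothetical helper: sets no_new_privs; returns 0 on success or -1 with
 * errno set on failure. */
int nnp_set(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Hypothetical helper: tries to clear the bit by passing 0.  Returns the
 * errno from the failed call, or 0 if the call unexpectedly succeeded. */
int nnp_try_clear(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) == 0)
        return 0;
    return errno;
}

/* Hypothetical helper: forks and reports the bit as seen by the child.
 * Returns 1 or 0 for the child's view, or -1 on error. */
int nnp_get_in_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int v = nnp_get();
        /* Encode the result in the exit status; 2 signals an error. */
        _exit(v == 1 ? 1 : (v == 0 ? 0 : 2));
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

Called in that order, ``nnp_set()`` is expected to return 0, ``nnp_get()`` to
return 1, ``nnp_try_clear()`` to return ``EINVAL`` (the kernel rejects any
second argument other than 1), and ``nnp_get_in_child()`` to return 1.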

Note that ``no_new_privs`` does not prevent privilege changes that do not
involve ``execve()``.  An appropriately privileged task can still call
``setuid(2)`` and receive ``SCM_RIGHTS`` datagrams.

There are two main use cases for ``no_new_privs`` so far:

 - Filters installed for the seccomp mode 2 sandbox persist across
   execve and can change the behavior of newly-executed programs.
   Unprivileged users are therefore only allowed to install such filters
   if ``no_new_privs`` is set.

 - By itself, ``no_new_privs`` can be used to reduce the attack surface
   available to an unprivileged user.  If everything running with a
   given uid has ``no_new_privs`` set, then that uid will be unable to
   escalate its privileges by directly attacking setuid, setgid, and
   fcap-using binaries; it will need to compromise something without the
   ``no_new_privs`` bit set first.
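
The seccomp use case can be sketched as follows.  The function below
(``install_allow_all_filter``, a hypothetical name) sets ``no_new_privs`` and
then installs a trivial seccomp mode 2 filter that allows every system call.
It is an illustration that assumes a kernel built with seccomp filter
support, not a useful sandbox policy.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>

/* Hypothetical helper: sets no_new_privs, then installs a one-instruction
 * BPF program that returns SECCOMP_RET_ALLOW for every system call.
 * Returns 0 on success or -1 with errno set on failure. */
int install_allow_all_filter(void)
{
    struct sock_filter insns[] = {
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };

    /* Required first for a task that lacks CAP_SYS_ADMIN. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;

    /* The filter persists across execve, like no_new_privs itself. */
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0);
}
```

Without the preceding ``PR_SET_NO_NEW_PRIVS`` call, a task lacking
``CAP_SYS_ADMIN`` would see ``PR_SET_SECCOMP`` fail with ``EACCES``.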

In the future, other potentially dangerous kernel features could become
available to unprivileged tasks if ``no_new_privs`` is set.  In principle,
several options to ``unshare(2)`` and ``clone(2)`` would be safe when
``no_new_privs`` is set, and ``no_new_privs`` + ``chroot`` is considerably less
dangerous than chroot by itself.