
How do I enable CLONE_NEWUSER in a more fine-grained fashion compared to just kernel.unprivileged_userns_clone?

I want to keep the kernel API attack surface manageable by keeping new and complicated features such as non-root CAP_SYS_ADMIN or BPF disabled, while still selectively allowing them for some specific programs.

For example, chrome-sandbox wants either CLONE_NEWUSER or suid-root for proper operation, but I don't want every program to be able to use such complicated tricks, only a handful of approved ones.

Vi.
  • Update: I believe there is a way to do this with AppArmor now. I'll edit my answer later to give details. – forest Jun 06 '23 at 10:48

1 Answer


Without a custom kernel patch, this isn't possible. Note that the Debian-specific sysctl kernel.unprivileged_userns_clone is deprecated. The way to disable user namespaces globally is to set user.max_user_namespaces = 0.
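
To check what the limit is currently set to, the sysctl can be read from /proc/sys/user/max_user_namespaces. The following is a minimal sketch in C; the function name read_max_user_namespaces is made up for illustration, and the error handling is deliberately simple.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of any library): reads the numeric value
 * of a sysctl file such as /proc/sys/user/max_user_namespaces.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 */
long read_max_user_namespaces(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Calling read_max_user_namespaces("/proc/sys/user/max_user_namespaces") and getting 0 back would indicate that user namespace creation is disabled for everyone; any positive value is a global limit, not a per-program one.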

A new user namespace is created by kernel/user_namespace.c:create_user_ns(). Several checks occur before a new namespace is allowed to be created, but none of them provides a way to control this on a per-file or per-user basis. It's unfortunate, but many kernel developers don't understand the risk behind enabling unprivileged user namespaces on a global basis.

A sample (untested!) patch that allows only UID 1234 to create a new user namespace in kernel 6.0:

--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -86,6 +86,10 @@ int create_user_ns(struct cred *new)
    struct ucounts *ucounts;
    int ret, i;
 
+   ret = -EPERM;
+   if (!uid_eq(current_uid(), KUIDT_INIT(1234)))
+       goto fail;
+
    ret = -ENOSPC;
    if (parent_ns->level > 32)
        goto fail;
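
To see how a given kernel responds for the current user, a small userspace probe can attempt unshare(CLONE_NEWUSER) in a child process and report the resulting errno. This is only a sketch: probe_userns is a hypothetical name, and the probe runs in a forked child so the calling process stays in its original namespace.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: tries to create a user namespace in a child process.
 * Returns 0 if unshare(CLONE_NEWUSER) succeeded, the child's errno value
 * (a positive number) if it failed, or -1 if the probe itself could not run.
 */
int probe_userns(void)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: attempt the namespace creation and pass errno back
         * to the parent through the exit status. */
        if (unshare(CLONE_NEWUSER) == 0)
            _exit(0);
        _exit(errno & 0xff);
    }

    /* Parent: collect the child's verdict. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With user.max_user_namespaces = 0 the expected result is ENOSPC. With a check like the one in the patch above, a caller with a UID other than 1234 should instead get EPERM, assuming the patch behaves as intended.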
forest