Re: [PATCHv3 0/2] capability controlled user-namespaces

On Wed, Dec 27, 2017 at 12:23 PM, Michael Kerrisk (man-pages)
<mtk.manpages@xxxxxxxxx> wrote:
> Hello Mahesh,
>
> On 27 December 2017 at 18:09, Mahesh Bandewar (महेश बंडेवार)
> <maheshb@xxxxxxxxxx> wrote:
>> Hello James,
>>
>> It seems I forgot to add you to the reviewers of this patch series.
>> Would you be willing to pull this into the security tree? Serge
>> Hallyn has already ACKed it.
>
> We seem to have no formal documentation/specification of this feature.
> I think that should be written up before this patch goes into
> mainline...
>
Absolutely. I have added enough information about this feature to the
Documentation dir (please look at the individual patches) that it
could be used as a starting point. I can help if needed.

thanks,
--mahesh..

> Cheers,
>
> Michael
>
>
>>
>> On Tue, Dec 5, 2017 at 2:30 PM, Mahesh Bandewar <mahesh@xxxxxxxxxxxx> wrote:
>>> From: Mahesh Bandewar <maheshb@xxxxxxxxxx>
>>>
>>> TL;DR version
>>> -------------
>>> Creating a sandbox environment with namespaces is challenging
>>> given what sandboxed processes are able to exploit, e.g.
>>> CVE-2017-6074, CVE-2017-7184, CVE-2017-7308, just to name a few.
>>> However, with a small change, user-namespaces in their current form
>>> can let us create a sandbox environment without locking down
>>> user-namespaces.
>>>
>>> Detailed version
>>> ----------------
>>>
>>> Problem
>>> -------
>>> User-namespaces in their current form have increased the attack surface,
>>> as any process can acquire capabilities that are not available to it by
>>> default by performing a combination of clone()/unshare()/setns() syscalls.
>>>
>>>     #define _GNU_SOURCE
>>>     #include <stdio.h>
>>>     #include <sched.h>
>>>     #include <unistd.h>      /* close() */
>>>     #include <sys/socket.h>  /* socket() */
>>>     #include <netinet/in.h>
>>>
>>>     int main(void)
>>>     {
>>>         int sock = -1;
>>>
>>>         /* As an unprivileged user this is expected to fail. */
>>>         printf("Attempting to open RAW socket before unshare()...\n");
>>>         sock = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);
>>>         if (sock < 0) {
>>>             perror("socket() SOCK_RAW failed");
>>>         } else {
>>>             printf("Successfully opened RAW-Sock before unshare().\n");
>>>             close(sock);
>>>             sock = -1;
>>>         }
>>>
>>>         /* Enter a new user namespace and a new network namespace. */
>>>         if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {
>>>             perror("unshare() failed");
>>>             return 1;
>>>         }
>>>
>>>         /* Same call as before, now made inside the new namespaces. */
>>>         printf("Attempting to open RAW socket after unshare()...\n");
>>>         sock = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);
>>>         if (sock < 0) {
>>>             perror("socket() SOCK_RAW failed");
>>>         } else {
>>>             printf("Successfully opened RAW-Sock after unshare().\n");
>>>             close(sock);
>>>             sock = -1;
>>>         }
>>>
>>>         return 0;
>>>     }
>>>
>>> The above example shows how easy it is to acquire the NET_RAW capability.
>>> Once it is acquired, a process with malicious intent could exploit the
>>> issues mentioned above, or similar ones, discovered or not. Note
>>> that this is just an example and the problem/solution is not limited
>>> to the NET_RAW capability *only*.
>>>
>>> The easiest fix one can apply here is to lock down user-namespaces, which
>>> many distros do (i.e. don't allow users to create user namespaces),
>>> but unfortunately that prevents everyone from using them.
>>>
>>> Approach
>>> --------
>>> Introduce a notion of 'controlled' user-namespaces. Every process on
>>> the host is allowed to create user-namespaces (governed by the limit
>>> imposed by the per-ns sysctl); however, user-namespaces created by
>>> sandboxed processes are marked as 'controlled'. This mark is used at
>>> capability-check time in conjunction with a global capability whitelist.
>>> If a capability is not whitelisted, processes that belong to
>>> controlled user-namespaces will not be allowed to use it.
>>>
>>> Once a user-ns is marked as 'controlled', all its child
>>> user-namespaces are marked as 'controlled' too.
>>>
>>> The global whitelist is a list of capabilities governed by a sysctl.
>>> Only a (privileged) user in init-ns can modify it, and it applies
>>> to all controlled user-namespaces on the host.
>>>
>>> Marking user-namespaces controlled without modifying the whitelist is
>>> equivalent to the current behavior. The default value of the whitelist
>>> includes all capabilities so that compatibility is maintained. However, it
>>> gives admins a fine-grained ability to control various capabilities
>>> system-wide without locking down user-namespaces.
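
To make the check described above concrete, here is a minimal userspace
sketch in C of how the 'controlled' mark and the global whitelist could
combine. It is only an illustration under stated assumptions, not the code
from the patches: every name prefixed with sk_/SK_ is hypothetical, the
whitelist is modelled as a plain 64-bit mask, and "created by a sandboxed
process" is reduced to a boolean passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Capability numbers as found in linux/capability.h; redefined here under
 * hypothetical SK_ names so the sketch builds without kernel headers. */
#define SK_CAP_NET_ADMIN 12
#define SK_CAP_NET_RAW   13
#define SK_CAP_COUNT     64

/* Hypothetical stand-in for a user namespace. */
struct sk_userns {
    const struct sk_userns *parent;
    bool controlled;
};

/* Global whitelist, one bit per capability. All bits set by default,
 * which models "equivalent to the current behavior". */
static uint64_t sk_whitelist = UINT64_MAX;

/* A namespace is controlled if its creator is sandboxed or if its
 * parent is already controlled (the mark is inherited). */
static void sk_userns_init(struct sk_userns *ns,
                           const struct sk_userns *parent,
                           bool creator_sandboxed)
{
    ns->parent = parent;
    ns->controlled = creator_sandboxed ||
                     (parent != NULL && parent->controlled);
}

/* Remove one capability from the global whitelist. */
static void sk_whitelist_drop(int cap)
{
    assert(cap >= 0 && cap < SK_CAP_COUNT);
    sk_whitelist &= ~(UINT64_C(1) << cap);
}

static bool sk_cap_whitelisted(int cap)
{
    assert(cap >= 0 && cap < SK_CAP_COUNT);
    return (sk_whitelist >> cap) & 1;
}

/* has_cap_in_ns models the outcome of the usual capability check.
 * The whitelist only ever restricts controlled namespaces further. */
static bool sk_ns_capable(const struct sk_userns *ns, int cap,
                          bool has_cap_in_ns)
{
    if (!has_cap_in_ns)
        return false;
    if (ns->controlled && !sk_cap_whitelisted(cap))
        return false;
    return true;
}
```

In this model, calling sk_whitelist_drop(SK_CAP_NET_RAW) denies NET_RAW in
every controlled namespace and its descendants, while uncontrolled
namespaces and other capabilities are unaffected.
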
>>>
>>> Please see individual patches in this series.
>>>
>>> Mahesh Bandewar (2):
>>>   capability: introduce sysctl for controlled user-ns capability whitelist
>>>   userns: control capabilities of some user namespaces
>>>
>>>  Documentation/sysctl/kernel.txt | 21 +++++++++++++++++
>>>  include/linux/capability.h      |  7 ++++++
>>>  include/linux/user_namespace.h  | 25 ++++++++++++++++++++
>>>  kernel/capability.c             | 52 +++++++++++++++++++++++++++++++++++++++++
>>>  kernel/sysctl.c                 |  5 ++++
>>>  kernel/user_namespace.c         |  4 ++++
>>>  security/commoncap.c            |  8 +++++++
>>>  7 files changed, 122 insertions(+)
>>>
>>> --
>>> 2.15.0.531.g2ccb3012c9-goog
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-api" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/