Hello Eric,

On 02/11/2015 02:51 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes:
>
>> Hi Eric,
>>
>> Ping!
>>
>> Cheers,
>>
>> Michael
>
> My apologies.  Your description wasn't wrong but it may be a bit
> misleading, explanation below.  You will have to figure out how to work
> that into your proposed text.
>
>> On 2 February 2015 at 16:36, Michael Kerrisk (man-pages)
>> <mtk.manpages@xxxxxxxxx> wrote:
>>> [Adding Josh to CC in case he has anything to add.]
>>>
>>> On 12/12/2014 10:54 PM, Eric W. Biederman wrote:
>>>>
>>>> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
>>>> ---
>>>>  man5/proc.5 | 15 +++++++++++++++
>>>>  1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/man5/proc.5 b/man5/proc.5
>>>> index 96077d0dd195..d661e8cfeac9 100644
>>>> --- a/man5/proc.5
>>>> +++ b/man5/proc.5
>>>> @@ -1097,6 +1097,21 @@ are not available if the main thread has already terminated
>>>>  .\" Added in 2.6.9
>>>>  .\" CONFIG_SCHEDSTATS
>>>>  .TP
>>>> +.IR /proc/[pid]/setgroups " (since Linux 3.19-rc1)"
>>>> +This file reports
>>>> +.BR allow
>>>> +if the setgroups system call is allowed in the current user namespace.
>>>> +This file reports
>>>> +.BR deny
>>>> +if the setgroups system call is not allowed in the current user namespace.
>>>> +This file may be written to with values of
>>>> +.BR allow
>>>> +and
>>>> +.BR deny
>>>> +before
>>>> +.IR /proc/[pid]/gid_map
>>>> +is written to (enabling setgroups) in a user namespace.
>>>> +.TP
>>>>  .IR /proc/[pid]/smaps " (since Linux 2.6.14)"
>>>>  This file shows memory consumption for each of the process's mappings.
>>>>  (The
>>>
>>> Hi Eric,
>>>
>>> Thanks for this patch. I applied it, and then tried to work in
>>> quite a few other details gleaned from the source code and commit
>>> message, and Jon Corbet's article at http://lwn.net/Articles/626665/.
>>> Could you please let me know if the following is correct:
>
> It is close but it may be misleading.
>
>>>        /proc/[pid]/setgroups (since Linux 3.19)
>>>               This file displays the string "allow" if processes in
>>>               the user namespace that contains the process pid are
>>>               permitted to employ the setgroups(2) system call, and
>>>               "deny" if setgroups(2) is not permitted in that user
>>>               namespace.
>
> With the caveat that when gid_map is not set that setgroups is also not
> allowed.

Okay -- I added that point.

>>>               A privileged process (one with the CAP_SYS_ADMIN
>>>               capability in the namespace) may write either of the
>>>               strings "allow" or "deny" to this file before writing a
>>>               group ID mapping for this user namespace to the file
>>>               /proc/[pid]/gid_map.  Writing the string "deny" prevents
>>>               any process in the user namespace from employing
>>>               setgroups(2).
>
> Or more succinctly.  You are allowed to write to /proc/[pid]/setgroups
> when calling setgroups is not allowed because gid_map is unset.  This
> ensures we do not have any transitions from a state where setgroups
> is allowed to a state where setgroups is denied.  There are only
> transitions from setgroups not-allowed to setgroups allowed.

And I've worked in the above point, rewording a bit along the way.

So, how does the following look (only the first two paragraphs
have changed)?

       /proc/[pid]/setgroups (since Linux 3.19)
              This file displays the string "allow" if processes in
              the user namespace that contains the process pid are
              permitted to employ the setgroups(2) system call, and
              "deny" if setgroups(2) is not permitted in that user
              namespace.  (Note, however, that calls to setgroups(2)
              are also not permitted if /proc/[pid]/gid_map has not
              yet been set.)

              A privileged process (one with the CAP_SYS_ADMIN
              capability in the namespace) may write either of the
              strings "allow" or "deny" to this file before writing a
              group ID mapping for this user namespace to the file
              /proc/[pid]/gid_map.  Writing the string "deny" prevents
              any process in the user namespace from employing
              setgroups(2).
              In other words, it is permitted to write to
              /proc/[pid]/setgroups only so long as calling
              setgroups(2) is not allowed because /proc/[pid]/gid_map
              has not been set.  This ensures that a process cannot
              transition from a state where setgroups(2) is allowed to
              a state where setgroups(2) is denied; a process can only
              transition from setgroups(2) being disallowed to
              setgroups(2) being allowed.

              The default value of this file in the initial user
              namespace is "allow".

              Once /proc/[pid]/gid_map has been written to (which has
              the effect of enabling setgroups(2) in the user
              namespace), it is no longer possible to deny
              setgroups(2) by writing to /proc/[pid]/setgroups.

              A child user namespace inherits the
              /proc/[pid]/setgroups setting from its parent.

              If the setgroups file has the value "deny", then the
              setgroups(2) system call can't subsequently be reenabled
              (by writing "allow" to the file) in this user namespace.
              This restriction also propagates down to all child user
              namespaces of this user namespace.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
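
As a cross-check of the rules in the proposed text, here is a small toy
model in Python.  It is only a sketch of my reading of the draft, not
kernel code: the class UserNamespaceModel and its method names are
invented for illustration, and it assumes that writing "deny" is what is
refused once gid_map has been written, and that writing "allow" is what
is refused once the setting is "deny".

```python
class UserNamespaceModel:
    """Toy model of the draft's /proc/[pid]/setgroups rules (hypothetical)."""

    def __init__(self, parent=None):
        # A child namespace inherits the setgroups setting of its parent.
        # With no parent, assume the initial namespace's default, "allow".
        self.setgroups = parent.setgroups if parent is not None else "allow"
        # In this model every new namespace starts with gid_map unset.
        self.gid_map_written = False

    def write_setgroups(self, value):
        """Model a write of "allow" or "deny" to the setgroups file."""
        if value not in ("allow", "deny"):
            raise ValueError("value must be 'allow' or 'deny'")
        if value == "allow" and self.setgroups == "deny":
            # Once denied, setgroups(2) can't be reenabled.
            raise PermissionError("cannot reenable setgroups after deny")
        if value == "deny" and self.gid_map_written:
            # No transition from setgroups allowed to setgroups denied.
            raise PermissionError("cannot deny after gid_map is written")
        self.setgroups = value

    def write_gid_map(self):
        """Model writing a group ID mapping to gid_map."""
        self.gid_map_written = True

    def setgroups_permitted(self):
        """setgroups(2) needs both "allow" and a gid_map that has been set."""
        return self.setgroups == "allow" and self.gid_map_written


if __name__ == "__main__":
    ns = UserNamespaceModel()
    print(ns.setgroups, ns.setgroups_permitted())
    ns.write_setgroups("deny")
    ns.write_gid_map()
    child = UserNamespaceModel(parent=ns)
    print(child.setgroups, child.setgroups_permitted())
```

In the model, setgroups_permitted() can only ever go from False to True
within one namespace, which is the one-way transition described above.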