Re: [PATCH v2] man2/shmget2: Add details about EPERM error

Hi Michael
Your patch is clearer; it looks good to me.

Best Regards
Yang Xu
> Hello Yang Xu,
>
> On 5/12/21 10:53 PM, Yang Xu wrote:
>> hugetlb_shm_group contains group id that is allowed to create SysV shared
>> memory segment using hugetlb page. To meet EPERM error, we also
>> need to make group id be not in this proc file.
>>
>> Signed-off-by: Yang Xu <xuyang2018.jy@xxxxxxxxxxx>
>> ---
>>   man2/shmget.2 | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/man2/shmget.2 b/man2/shmget.2
>> index 757b7b7f1..29799b9b8 100644
>> --- a/man2/shmget.2
>> +++ b/man2/shmget.2
>> @@ -273,7 +273,7 @@ The
>>   .B SHM_HUGETLB
>>   flag was specified, but the caller was not privileged (did not have the
>>   .B CAP_IPC_LOCK
>> -capability).
>> +capability and group id doesn't be contained in hugetlb_shm_group proc file).
>>   .SH CONFORMING TO
>>   POSIX.1-2001, POSIX.1-2008, SVr4.
>>   .\" SVr4 documents an additional error condition EEXIST.
>
> Thanks for spotting this. The story is more complex, as far as I can
> tell. For example, the same error also occurs for mmap(2) and
> memfd_create(2)
>
> Instead of your patch, I applied the diff below (not yet pushed),
> based on my reading of fs/hugetlbfs/inode.c, in particular:
>
>      static int can_do_hugetlb_shm(void)
>      {
>              kgid_t shm_group;
>              shm_group = make_kgid(&init_user_ns, sysctl_hugetlb_shm_group);
>              return capable(CAP_IPC_LOCK) || in_group_p(shm_group);
>      }
>
>      ...
>
>      struct file *hugetlb_file_setup(const char *name, size_t size,
>                                      vm_flags_t acctflag, struct user_struct **user,
>                                      int creat_flags, int page_size_log)
>      {
>              ...
>              if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
>                      *user = current_user();
>                      if (user_shm_lock(size, *user)) {
>                              task_lock(current);
>                              pr_warn_once("%s (%d): Using mlock ulimits for SHM_HUGETLB is deprecated\n",
>                                      current->comm, current->pid);
>                              task_unlock(current);
>                      } else {
>                              *user = NULL;
>                              return ERR_PTR(-EPERM);
>                      }
>              }
>              ...
>      }
>
> As a deprecated feature, it appears that the RLIMIT_MEMLOCK limit
> can also be used to permit huge page allocation, but I have
> chosen not to document that for now.
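
For readers following the kernel excerpt above, here is a minimal userspace sketch of the decision it makes on the SHM_HUGETLB path. It is only a model of the quoted logic, not kernel code: the function names (model_in_group, model_can_do_hugetlb_shm, model_shm_hugetlb_check) and the boolean inputs are hypothetical, and memlock_ok merely stands in for a successful user_shm_lock() call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

/* Models in_group_p(): match on the filesystem GID or any supplementary GID. */
bool model_in_group(gid_t group, gid_t fsgid, const gid_t *supp, size_t nsupp)
{
    assert(supp != NULL || nsupp == 0);
    if (fsgid == group)
        return true;
    for (size_t i = 0; i < nsupp; i++) {
        if (supp[i] == group)
            return true;
    }
    return false;
}

/* Models can_do_hugetlb_shm(): the capability or group membership suffices. */
bool model_can_do_hugetlb_shm(bool has_cap_ipc_lock, gid_t shm_group,
                              gid_t fsgid, const gid_t *supp, size_t nsupp)
{
    return has_cap_ipc_lock ||
           model_in_group(shm_group, fsgid, supp, nsupp);
}

/*
 * Models the quoted branch of hugetlb_file_setup(): returns 0 when the
 * request may proceed and -EPERM otherwise. memlock_ok represents the
 * deprecated fallback in which user_shm_lock() succeeds.
 */
int model_shm_hugetlb_check(bool can_do, bool memlock_ok)
{
    if (!can_do && !memlock_ok)
        return -EPERM;
    return 0;
}
```

In this model, a caller without CAP_IPC_LOCK whose filesystem GID or supplementary GIDs include the configured group passes the check, and -EPERM results only when the capability, the group match, and the deprecated mlock-limit fallback all fail.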
>
> Please let me know if the patch makes sense to you.
>
> With best regards,
>
> Michael
>
> --- a/man2/memfd_create.2
> +++ b/man2/memfd_create.2
> @@ -201,6 +201,19 @@ The
>   .BR memfd_create ()
>   system call first appeared in Linux 3.17;
>   glibc support was added in version 2.27.
> +.TP
> +.B EPERM
> +The
> +.B MFD_HUGETLB
> +flag was specified, but the caller was not privileged (did not have the
> +.B CAP_IPC_LOCK
> +capability)
> +and is not a member of the
> +.I sysctl_hugetlb_shm_group
> +group; see the description of
> +.I /proc/sys/vm/sysctl_hugetlb_shm_group
> +in
> +.BR proc (5).
>   .SH CONFORMING TO
>   The
>   .BR memfd_create ()
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 03f2eeb2c..4ee2f4f96 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -628,6 +628,19 @@ was mounted no-exec.
>   The operation was prevented by a file seal; see
>   .BR fcntl (2).
>   .TP
> +.B EPERM
> +The
> +.B MAP_HUGETLB
> +flag was specified, but the caller was not privileged (did not have the
> +.B CAP_IPC_LOCK
> +capability)
> +and is not a member of the
> +.I sysctl_hugetlb_shm_group
> +group; see the description of
> +.I /proc/sys/vm/sysctl_hugetlb_shm_group
> +in
> +.BR proc (5).
> +.TP
>   .B ETXTBSY
>   .B MAP_DENYWRITE
>   was set but the object specified by
> diff --git a/man2/shmget.2 b/man2/shmget.2
> index 757b7b7f1..6e9995e81 100644
> --- a/man2/shmget.2
> +++ b/man2/shmget.2
> @@ -273,7 +273,13 @@ The
>   .B SHM_HUGETLB
>   flag was specified, but the caller was not privileged (did not have the
>   .B CAP_IPC_LOCK
> -capability).
> +capability)
> +and is not a member of the
> +.I sysctl_hugetlb_shm_group
> +group; see the description of
> +.I /proc/sys/vm/sysctl_hugetlb_shm_group
> +in
> +.BR proc (5).
>   .SH CONFORMING TO
>   POSIX.1-2001, POSIX.1-2008, SVr4.
>   .\" SVr4 documents an additional error condition EEXIST.
> diff --git a/man5/proc.5 b/man5/proc.5
> index a28dbdcc7..888535449 100644
> --- a/man5/proc.5
> +++ b/man5/proc.5
> @@ -5603,6 +5603,19 @@ user should run
>   .BR sync (1)
>   first.
>   .TP
> +.IR  /proc/sys/vm/sysctl_hugetlb_shm_group " (since Linux 2.6.7)"
> +This writable file contains a group ID that is allowed
> +to allocate memory using huge pages.
> +If a process has a filesystem group ID or any supplementary group ID that
> +matches this group ID,
> +then it can make huge-page allocations without holding the
> +.BR CAP_IPC_LOCK
> +capability; see
> +.BR memfd_create (2),
> +.BR mmap (2),
> +and
> +.BR shmget (2).
> +.TP
>   .IR /proc/sys/vm/legacy_va_layout " (since Linux 2.6.9)"
>   .\" The following is from Documentation/filesystems/proc.txt
>   If nonzero, this disables the new 32-bit memory-mapping layout;
> diff --git a/man7/capabilities.7 b/man7/capabilities.7
> index 7e79b2fb6..cf9dc190f 100644
> --- a/man7/capabilities.7
> +++ b/man7/capabilities.7
> @@ -205,11 +205,21 @@ the filesystem or any of the supplementary GIDs of the calling process.
>   .B CAP_IPC_LOCK
>   .\" FIXME . As at Linux 3.2, there are some strange uses of this capability
>   .\" in other places; they probably should be replaced with something else.
> +.PD 0
> +.RS
> +.IP * 2
>   Lock memory
>   .RB ( mlock (2),
>   .BR mlockall (2),
>   .BR mmap (2),
> +.BR shmctl (2));
> +.IP *
> +Allocate memory using huge pages
> +.RB ( memfd_create (2),
> +.BR mmap (2),
>   .BR shmctl (2)).
> +.PD
> +.RE
>   .TP
>   .B CAP_IPC_OWNER
>   Bypass permission checks for operations on System V IPC objects.