Re: [PATCH v2] man2/shmget2: Add details about EPERM error

Hi Michael

It seems we all missed RLIMIT_MEMLOCK:
"this limit instead governs the amount of memory that an unprivileged
process may lock."

I found this because someone sent a patch to LTP to fix this
unexpected error:

https://patchwork.ozlabs.org/project/ltp/patch/20210706132114.204443-1-cascardo@xxxxxxxxxxxxx/
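
To make the case concrete, here is a minimal sketch of how the EPERM path could be probed from user space. It is only an illustration, not the LTP test: the helper names (try_hugetlb_shmget, explain_shmget_errno, size_within_memlock) are hypothetical, and a 2 MiB huge page size is assumed when picking a segment size. The RLIMIT_MEMLOCK helper is only a rough indicator, since the kernel check quoted below also accounts for memory the user has already locked.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/resource.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000   /* fallback; normally provided by <sys/shm.h> */
#endif

/*
 * Try to create a private SysV segment backed by huge pages.
 * Returns 0 on success (the segment is removed again) or the errno
 * value reported by shmget(2).
 */
static int try_hugetlb_shmget(size_t size)
{
    assert(size > 0);

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | SHM_HUGETLB | 0600);
    if (id == -1)
        return errno;

    shmctl(id, IPC_RMID, NULL);   /* do not leave the segment behind */
    return 0;
}

/*
 * Rough check: does "size" fit under the soft RLIMIT_MEMLOCK?
 * Returns 1 if it does, 0 if not, -1 if the limit cannot be read.
 * The kernel additionally counts memory already locked by the user.
 */
static int size_within_memlock(size_t size)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    return (rlim_t)size <= rl.rlim_cur ? 1 : 0;
}

/* Map the result of try_hugetlb_shmget() to a short explanation. */
static const char *explain_shmget_errno(int err)
{
    if (err == 0)
        return "segment created";
    if (err == EPERM)
        /* Conditions taken from the kernel code quoted in this thread. */
        return "EPERM: no CAP_IPC_LOCK, not in hugetlb_shm_group, "
               "and RLIMIT_MEMLOCK did not permit the allocation";
    return strerror(err);
}
```

Calling explain_shmget_errno(try_hugetlb_shmget(2 * 1024 * 1024)) from main() as an unprivileged user, once with a small RLIMIT_MEMLOCK and once with a limit larger than the segment, should show whether the limit alone changes the result.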

Best Regards
Yang Xu
> Hi Michael
> Your patch is clearer; it looks good to me.
>
> Best Regards
> Yang Xu
>> Hello Yang Xu,
>>
>> On 5/12/21 10:53 PM, Yang Xu wrote:
>>> hugetlb_shm_group contains the ID of the group that is allowed to
>>> create SysV shared memory segments using hugetlb pages. To trigger
>>> the EPERM error, the caller's group ID must also not match the one
>>> in this proc file.
>>>
>>> Signed-off-by: Yang Xu<xuyang2018.jy@xxxxxxxxxxx>
>>> ---
>>> man2/shmget.2 | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/man2/shmget.2 b/man2/shmget.2
>>> index 757b7b7f1..29799b9b8 100644
>>> --- a/man2/shmget.2
>>> +++ b/man2/shmget.2
>>> @@ -273,7 +273,7 @@ The
>>> .B SHM_HUGETLB
>>> flag was specified, but the caller was not privileged (did not have the
>>> .B CAP_IPC_LOCK
>>> -capability).
>>> +capability and group id doesn't be contained in hugetlb_shm_group
>>> proc file).
>>> .SH CONFORMING TO
>>> POSIX.1-2001, POSIX.1-2008, SVr4.
>>> .\" SVr4 documents an additional error condition EEXIST.
>>
>> Thanks for spotting this. The story is more complex, as far as I can
>> tell. For example, the same error also occurs for mmap(2) and
>> memfd_create(2).
>>
>> Instead of your patch, I applied the diff below (not yet pushed),
>> based on my reading of fs/hugetlbfs/inode.c, in particular:
>>
>> static int can_do_hugetlb_shm(void)
>> {
>> kgid_t shm_group;
>> shm_group = make_kgid(&init_user_ns, sysctl_hugetlb_shm_group);
>> return capable(CAP_IPC_LOCK) || in_group_p(shm_group);
>> }
>>
>> ...
>>
>> struct file *hugetlb_file_setup(const char *name, size_t size,
>> vm_flags_t acctflag, struct user_struct **user,
>> int creat_flags, int page_size_log)
>> {
>> ...
>> if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
>> *user = current_user();
>> if (user_shm_lock(size, *user)) {
>> task_lock(current);
>> pr_warn_once("%s (%d): Using mlock ulimits for SHM_HUGETLB is deprecated\n",
>> current->comm, current->pid);
>> task_unlock(current);
>> } else {
>> *user = NULL;
>> return ERR_PTR(-EPERM);
>> }
>> }
>> ...
>> }
>>
>> As a deprecated feature, it appears that the RLIMIT_MEMLOCK
>> limit can also be used to permit huge page allocation, but I have
>> chosen not to document that for now.
>>
>> Please let me know if the patch makes sense to you.
>>
>> With best regards,
>>
>> Michael
>>
>> --- a/man2/memfd_create.2
>> +++ b/man2/memfd_create.2
>> @@ -201,6 +201,19 @@ The
>> .BR memfd_create ()
>> system call first appeared in Linux 3.17;
>> glibc support was added in version 2.27.
>> +.TP
>> +.B EPERM
>> +The
>> +.B MFD_HUGETLB
>> +flag was specified, but the caller was not privileged (did not have the
>> +.B CAP_IPC_LOCK
>> +capability)
>> +and is not a member of the
>> +.I sysctl_hugetlb_shm_group
>> +group; see the description of
>> +.I /proc/sys/vm/sysctl_hugetlb_shm_group
>> +in
>> +.BR proc (5).
>> .SH CONFORMING TO
>> The
>> .BR memfd_create ()
>> diff --git a/man2/mmap.2 b/man2/mmap.2
>> index 03f2eeb2c..4ee2f4f96 100644
>> --- a/man2/mmap.2
>> +++ b/man2/mmap.2
>> @@ -628,6 +628,18 @@ was mounted no-exec.
>> The operation was prevented by a file seal; see
>> .BR fcntl (2).
>> .TP
>> +.B EPERM
>> +The
>> +.B MAP_HUGETLB
>> +flag was specified, but the caller was not privileged (did not have the
>> +.B CAP_IPC_LOCK
>> +capability)
>> +and is not a member of the
>> +.I sysctl_hugetlb_shm_group
>> +group; see the description of
>> +.I /proc/sys/vm/sysctl_hugetlb_shm_group
>> +in
>> +.TP
>> .B ETXTBSY
>> .B MAP_DENYWRITE
>> was set but the object specified by
>> diff --git a/man2/shmget.2 b/man2/shmget.2
>> index 757b7b7f1..6e9995e81 100644
>> --- a/man2/shmget.2
>> +++ b/man2/shmget.2
>> @@ -273,7 +273,13 @@ The
>> .B SHM_HUGETLB
>> flag was specified, but the caller was not privileged (did not have the
>> .B CAP_IPC_LOCK
>> -capability).
>> +capability)
>> +and is not a member of the
>> +.I sysctl_hugetlb_shm_group
>> +group; see the description of
>> +.I /proc/sys/vm/sysctl_hugetlb_shm_group
>> +in
>> +.BR proc (5).
>> .SH CONFORMING TO
>> POSIX.1-2001, POSIX.1-2008, SVr4.
>> .\" SVr4 documents an additional error condition EEXIST.
>> diff --git a/man5/proc.5 b/man5/proc.5
>> index a28dbdcc7..888535449 100644
>> --- a/man5/proc.5
>> +++ b/man5/proc.5
>> @@ -5603,6 +5603,19 @@ user should run
>> .BR sync (1)
>> first.
>> .TP
>> +.IR /proc/sys/vm/sysctl_hugetlb_shm_group " (since Linux 2.6.7)"
>> +This writable file contains a group ID that is allowed
>> +to allocate memory using huge pages.
>> +If a process has a filesystem group ID or any supplementary group ID that
>> +matches this group ID,
>> +then it can make huge-page allocations without holding the
>> +.BR CAP_IPC_LOCK
>> +capability; see
>> +.BR memfd_create (2),
>> +.BR mmap (2),
>> +and
>> +.BR shmget (2).
>> +.TP
>> .IR /proc/sys/vm/legacy_va_layout " (since Linux 2.6.9)"
>> .\" The following is from Documentation/filesystems/proc.txt
>> If nonzero, this disables the new 32-bit memory-mapping layout;
>> diff --git a/man7/capabilities.7 b/man7/capabilities.7
>> index 7e79b2fb6..cf9dc190f 100644
>> --- a/man7/capabilities.7
>> +++ b/man7/capabilities.7
>> @@ -205,11 +205,21 @@ the filesystem or any of the supplementary GIDs
>> of the calling process.
>> .B CAP_IPC_LOCK
>> .\" FIXME . As at Linux 3.2, there are some strange uses of this
>> capability
>> .\" in other places; they probably should be replaced with something
>> else.
>> +.PD 0
>> +.RS
>> +.IP * 2
>> Lock memory
>> .RB ( mlock (2),
>> .BR mlockall (2),
>> .BR mmap (2),
>> +.BR shmctl (2));
>> +.IP *
>> +Allocate memory using huge pages
>> +.RB ( memfd_create (2),
>> +.BR mmap (2),
>> .BR shmctl (2)).
>> +.PD 0
>> +.RE
>> .TP
>> .B CAP_IPC_OWNER
>> Bypass permission checks for operations on System V IPC objects.
>>
>>
>>
>



