Re: [Bug 217238] New: Creating shared read-only map is denied after add write seal to a memfd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


> On Mar 30, 2023, at 12:25 PM, Lorenzo Stoakes <lstoakes@xxxxxxxxx> wrote:
> On Sat, Mar 25, 2023 at 02:51:05PM +0000, Lorenzo Stoakes wrote:
>>> On Fri, Mar 24, 2023 at 01:36:46PM -0700, Andrew Morton wrote:
>>> (switched to email.  Please respond via emailed reply-to-all, not via the
>>> bugzilla web interface).
>>>> On Fri, 24 Mar 2023 03:34:23 +0000 bugzilla-daemon@xxxxxxxxxx wrote:
>>>>            Bug ID: 217238
>>>>           Summary: Creating shared read-only map is denied after add
>>>>                    write seal to a memfd
>>>>           Product: Memory Management
>>>>           Version: 2.5
>>>>    Kernel Version: 6.2.8
>>>>          Hardware: All
>>>>                OS: Linux
>>>>              Tree: Mainline
>>>>            Status: NEW
>>>>          Severity: normal
>>>>          Priority: P1
>>>>         Component: Other
>>>>          Assignee: akpm@xxxxxxxxxxxxxxxxxxxx
>>>>          Reporter: yshuiv7@xxxxxxxxx
>>>>        Regression: No
>>>> Test case:
>>>>    int main() {
>>>>      int fd = memfd_create("test", MFD_ALLOW_SEALING);
>>>>      write(fd, "test", 4);
>>>>      fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE);
>>>>      void *ret = mmap(NULL, 4, PROT_READ, MAP_SHARED, fd, 0);
>>>>    }
>>>> This fails with EPERM. This is in contradiction with what's described in the
>>>> documentation of F_SEAL_WRITE.
>>>> --
>>>> You may reply to this email to add a comment.
>>>> You are receiving this mail because:
>>>> You are the assignee for the bug.
>> This issue seems to be the result of the use of the memfd's shmem region's
>> page cache object (struct address_space)'s i_mmap_writable field to denote
>> whether it is write-sealed.
>> The kernel assumes that a VM_SHARED mapping might become writable at any
>> time via mprotect(), therefore treats VM_SHARED mappings as if they were
>> writable as far as i_mmap_writable is concerned (this field's primary use
>> is to determine whether, for architectures that require it, flushing must
>> occur if this is set to avoid aliasing, see filemap_read() for example).
>> In theory we could convert all such checks to VM_SHARED | VM_WRITE
>> (importantly including on fork) and then update mprotect() to check
>> mapping_map_writable() if a user tries to make unwritable memory
>> writable.

Unless I’m missing something, we have VM_MAYWRITE for almost exactly this purpose.  Can’t we just make a shared mapping with both of these bits clear?

>> I suspect however there are reasons relating to locking that make it
>> unreasonable to try to do this, but I may be mistaken (others might have
>> some insight on this). I also see some complexity around this in the
>> security checks on marking shared writable mappings executable (e.g. in
>> mmap_violation_check()).
>> In any case, it doesn't really make much sense to have a write-sealed
>> shared mapping, since you're essentially saying 'nothing _at all_ can write
>> to this' so it may as well be private. The semantics are unfortunate here,
>> the memory will still be shared read-only by MAP_PRIVATE mappings.
>> A better choice here might be F_SEAL_FUTURE_WRITE (available from kernel
>>> =5.1) which does permit shared read-only mappings as this is explicitly
>> checked for in seal_check_future_write() invoked from shmem_mmap().
>> Regardless, I think the conclusion is that this is not a bug, but rather
>> that the documentation needs to be updated.
> Adding docs people to cc list (sorry didn't think to do this in first
> reply).

[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux