Re: [Bug 217238] New: Creating shared read-only map is denied after add write seal to a memfd

On Fri, Mar 24, 2023 at 01:36:46PM -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Fri, 24 Mar 2023 03:34:23 +0000 bugzilla-daemon@xxxxxxxxxx wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=217238
> >
> >             Bug ID: 217238
> >            Summary: Creating shared read-only map is denied after add
> >                     write seal to a memfd
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 6.2.8
> >           Hardware: All
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Other
> >           Assignee: akpm@xxxxxxxxxxxxxxxxxxxx
> >           Reporter: yshuiv7@xxxxxxxxx
> >         Regression: No
> >
> > Test case:
> >
> >     int main() {
> >       int fd = memfd_create("test", MFD_ALLOW_SEALING);
> >       write(fd, "test", 4);
> >       fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE);
> >
> >       void *ret = mmap(NULL, 4, PROT_READ, MAP_SHARED, fd, 0);
> >     }
> >
> > This fails with EPERM. This is in contradiction with what's described in the
> > documentation of F_SEAL_WRITE.
> >
> > --
> > You may reply to this email to add a comment.
> >
> > You are receiving this mail because:
> > You are the assignee for the bug.
>

This issue appears to result from how write-sealing is tracked: the
i_mmap_writable field of the page cache object (struct address_space)
backing the memfd's shmem region is used to denote whether the memfd is
write-sealed.

The kernel assumes that a VM_SHARED mapping might become writable at any
time via mprotect(), and therefore treats all VM_SHARED mappings as if
they were writable as far as i_mmap_writable is concerned. (The field's
primary use is to determine whether, on architectures that require it, a
flush must occur to avoid cache aliasing when the file is writably
mapped; see filemap_read() for an example.)
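
To illustrate the above, here is a toy userspace model of the counter
logic as I understand it. It is a sketch only: the model_* names are made
up for illustration, the real helpers operate atomically on struct
address_space, and nothing here is kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct address_space; only the counter is modelled. */
struct model_mapping {
    int i_mmap_writable; /* > 0: shared mappings exist, < 0: writes denied */
};

/* Models mapping_map_writable(): increment unless negative. */
int model_map_writable(struct model_mapping *m)
{
    assert(m != NULL);
    if (m->i_mmap_writable < 0)
        return -EPERM;
    m->i_mmap_writable++;
    return 0;
}

/* Models mapping_deny_writable(), as used when F_SEAL_WRITE is added:
 * decrement unless positive. */
int model_deny_writable(struct model_mapping *m)
{
    assert(m != NULL);
    if (m->i_mmap_writable > 0)
        return -EBUSY;
    m->i_mmap_writable--;
    return 0;
}

/* Models the mmap() check: it keys off "shared", not "writable". */
int model_mmap(struct model_mapping *m, bool shared, bool writable)
{
    (void)writable; /* deliberately ignored, which is the point */
    return shared ? model_map_writable(m) : 0;
}
```

In this model, calling model_deny_writable() on a fresh mapping and then
model_mmap(&m, true, false) returns -EPERM, which mirrors the reported
behaviour, while a private mapping is unaffected.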

In theory we could convert all such checks to test for VM_SHARED |
VM_WRITE (importantly, including on fork) and then update mprotect() to
call mapping_map_writable() when a user tries to make a read-only shared
mapping writable.
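
A rough userspace sketch of what that alternative might look like, purely
to make the idea concrete. All sketch_* names are hypothetical, fork and
locking are not modelled at all, and this is not a proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; only the fields needed for the idea are modelled. */
struct sketch_mapping {
    int i_mmap_writable; /* < 0 means write-sealed */
};

struct sketch_vma {
    struct sketch_mapping *mapping;
    bool shared;
    bool writable;
};

/* mmap(): only account mappings that are shared AND writable right now. */
int sketch_mmap(struct sketch_vma *vma, struct sketch_mapping *m,
                bool shared, bool writable)
{
    assert(vma != NULL && m != NULL);
    if (shared && writable) {
        if (m->i_mmap_writable < 0)
            return -EPERM;
        m->i_mmap_writable++;
    }
    vma->mapping = m;
    vma->shared = shared;
    vma->writable = writable;
    return 0;
}

/* mprotect(PROT_WRITE): the writable check moves here for shared VMAs. */
int sketch_mprotect_write(struct sketch_vma *vma)
{
    assert(vma != NULL);
    if (vma->writable)
        return 0;
    if (vma->shared) {
        if (vma->mapping->i_mmap_writable < 0)
            return -EPERM;
        vma->mapping->i_mmap_writable++;
    }
    vma->writable = true;
    return 0;
}
```

Under this scheme a read-only shared mapping of a sealed object succeeds
at mmap() time, and the seal is enforced later if mprotect() is used to
request write access.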

I suspect however there are reasons relating to locking that make it
unreasonable to try to do this, but I may be mistaken (others might have
some insight on this). I also see some complexity around this in the
security checks on marking shared writable mappings executable (e.g. in
mmap_violation_check()).

In any case, it doesn't really make much sense to have a write-sealed
shared mapping: you're essentially saying 'nothing _at all_ can write to
this', so it may as well be private. The semantics are unfortunate here,
but the memory will still be shared read-only by MAP_PRIVATE mappings.
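
As a minimal sketch of that workaround, based on the reporter's test case
with MAP_SHARED swapped for MAP_PRIVATE. The function name is made up for
illustration and error handling is kept deliberately simple.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes "test" to a memfd, write-seals it, then reads it back through a
 * read-only MAP_PRIVATE mapping. Returns 0 on success, -1 on any failure. */
int read_write_sealed_private(char out[4])
{
    void *p;
    int ret = -1;
    int fd = memfd_create("test", MFD_ALLOW_SEALING);

    assert(out != NULL);
    if (fd < 0)
        return -1;
    if (write(fd, "test", 4) != 4)
        goto out;
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0)
        goto out;

    /* Private rather than shared: not subject to the writable check. */
    p = mmap(NULL, 4, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        goto out;

    memcpy(out, p, 4);
    munmap(p, 4);
    ret = 0;
out:
    close(fd);
    return ret;
}
```

Calling read_write_sealed_private() with a 4-byte buffer should leave the
bytes "test" in it.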

A better choice here might be F_SEAL_FUTURE_WRITE (available since kernel
5.1), which does permit shared read-only mappings, as this is explicitly
checked for in seal_check_future_write(), invoked from shmem_mmap().
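
A minimal sketch of the same test case using F_SEAL_FUTURE_WRITE instead.
The function name is hypothetical, and the fallback #define is only there
on the assumption that older libc headers may not provide the constant.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010 /* value from the uapi fcntl header */
#endif

/* Writes "test" to a memfd, adds F_SEAL_FUTURE_WRITE, then reads it back
 * through a read-only MAP_SHARED mapping. Returns 0 on success or a
 * negative errno from the first call that failed. */
int read_future_sealed_shared(char out[4])
{
    void *p;
    int ret;
    int fd = memfd_create("test", MFD_ALLOW_SEALING);

    assert(out != NULL);
    if (fd < 0)
        return -errno;
    if (write(fd, "test", 4) != 4) {
        ret = -EIO;
        goto out;
    }
    /* Kernels older than 5.1 are expected to reject this seal. */
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE) < 0) {
        ret = -errno;
        goto out;
    }

    p = mmap(NULL, 4, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        ret = -errno;
        goto out;
    }

    memcpy(out, p, 4);
    munmap(p, 4);
    ret = 0;
out:
    close(fd);
    return ret;
}
```

On a kernel that supports the seal, this should return 0 with "test" in
the buffer; a negative return value identifies the call that failed.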

Regardless, I think the conclusion is that this is not a bug, but rather
that the documentation needs to be updated.



