Re: [RFC PATCH 0/3] permit write-sealed memfd read-only shared mappings

On Fri, Apr 21, 2023 at 11:01:26AM +0200, Jan Kara wrote:
> Hi!
>
> On Mon 03-04-23 23:28:29, Lorenzo Stoakes wrote:
> > This patch series is in two parts:-
> >
> > 1. Currently there are a number of places in the kernel where we assume
> >    VM_SHARED implies that a mapping is writable. Let's be slightly less
> >    strict and relax this restriction in the case that VM_MAYWRITE is not
> >    set.
> >
> >    This should have no noticeable impact as the lack of VM_MAYWRITE implies
> >    that the mapping can not be made writable via mprotect() or any other
> >    means.
> >
> > 2. Align the behaviour of F_SEAL_WRITE and F_SEAL_FUTURE_WRITE on mmap().
> >    The latter already clears the VM_MAYWRITE flag for a sealed read-only
> >    mapping, we simply extend this to F_SEAL_WRITE too.
> >
> >    For this to have effect, we must also invoke call_mmap() before
> >    mapping_map_writable().
> >
> > As this is quite a fundamental change on the assumptions around VM_SHARED
> > and since this causes a visible change to userland (in permitting read-only
> > shared mappings on F_SEAL_WRITE mappings), I am putting forward as an RFC
> > to see if there is anything terribly wrong with it.
>
> So what I miss in this series is what the motivation is. Is it that you need
> to map F_SEAL_WRITE read-only? Why?
>

This originated from the discussion in [1], which refers to the bug
reported in [2]. Essentially the user is write-sealing a memfd then trying
to mmap it read-only, but receives an -EPERM error.

F_SEAL_FUTURE_WRITE _does_ explicitly permit this but F_SEAL_WRITE does not.
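
The difference between the two seals can be checked side by side. The
following is only a rough sketch, not part of the series: the helper name
ro_shared_mmap_after_seal() is made up for illustration, and the fallback
#define assumes older libc headers that lack F_SEAL_FUTURE_WRITE.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010 /* value from linux/fcntl.h, for older libc headers */
#endif

/*
 * Hypothetical helper: seal a fresh memfd with 'seal', then attempt a
 * read-only shared mapping of it. Returns 0 if mmap() succeeds, the mmap()
 * errno if it fails, or -1 if any setup step fails.
 */
static int ro_shared_mmap_after_seal(unsigned int seal)
{
	int result = -1;
	void *p;
	int fd = memfd_create("seal-test", MFD_ALLOW_SEALING);

	if (fd == -1)
		return -1;

	if (write(fd, "test", 4) != 4)
		goto out;
	if (fcntl(fd, F_ADD_SEALS, seal) == -1)
		goto out;

	p = mmap(NULL, 4, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		/* Record why the mapping was refused. */
		result = errno;
	} else {
		result = 0;
		munmap(p, 4);
	}
out:
	close(fd);
	return result;
}
```

On a kernel that supports F_SEAL_FUTURE_WRITE, calling the helper with that
seal is expected to return 0, whereas calling it with F_SEAL_WRITE is
expected to return EPERM on a kernel without this series.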

The fcntl() man page states:

    Furthermore, trying to create new shared, writable memory-mappings via
    mmap(2) will also fail with EPERM.

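So the kernel does not behave as the documentation states: the man page
only rules out new shared, writable mappings, yet a read-only shared
mapping is rejected as well.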

I took the user-supplied repro and slightly modified it, enclosed
below. After this patch series, this code works correctly.

I think there's definitely a case for the VM_MAYWRITE part of this patch
series even if the memfd bits are not considered useful. We do seem to
implicitly assume that MAP_SHARED == writable even if !VM_MAYWRITE, which
seems odd.
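
The claim that a mapping without VM_MAYWRITE cannot later be made writable
can be observed from userland. The sketch below does not go through the
sealed-memfd path; it assumes the long-standing case of a file opened
O_RDONLY, for which the kernel does not leave VM_MAYWRITE set on the
mapping. The helper name upgrade_ro_shared_mapping() and the /tmp path are
made up for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a file opened O_RDONLY with MAP_SHARED and
 * PROT_READ, then ask mprotect() to add PROT_WRITE. Returns 0 if the
 * upgrade succeeds, the mprotect() errno if it is refused, or -1 if any
 * setup step fails.
 */
static int upgrade_ro_shared_mapping(void)
{
	char path[] = "/tmp/maywrite-XXXXXX"; /* illustrative location */
	int result = -1;
	void *p;
	int fd = mkstemp(path);

	if (fd == -1)
		return -1;
	if (write(fd, "test", 4) != 4) {
		close(fd);
		unlink(path);
		return -1;
	}
	close(fd);

	/* Reopen read-only so the mapping cannot be writable. */
	fd = open(path, O_RDONLY);
	unlink(path);
	if (fd == -1)
		return -1;

	p = mmap(NULL, 4, PROT_READ, MAP_SHARED, fd, 0);
	if (p != MAP_FAILED) {
		if (mprotect(p, 4, PROT_READ | PROT_WRITE) == 0)
			result = 0;
		else
			result = errno;
		munmap(p, 4);
	}
	close(fd);
	return result;
}
```

The mprotect(2) man page documents EACCES for exactly this situation, so
that is the value the helper is expected to return.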

Reproducer:-

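#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	int fd = memfd_create("test", MFD_ALLOW_SEALING);
	if (fd == -1) {
		perror("memfd_create");
		return EXIT_FAILURE;
	}

	/* Populate the memfd before sealing it against writes. */
	if (write(fd, "test", 4) != 4) {
		perror("write");
		return EXIT_FAILURE;
	}

	if (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) == -1) {
		perror("fcntl");
		return EXIT_FAILURE;
	}

	/* Read-only shared mapping: fails with EPERM without this series. */
	void *ret = mmap(NULL, 4, PROT_READ, MAP_SHARED, fd, 0);
	if (ret == MAP_FAILED) {
		perror("mmap");
		return EXIT_FAILURE;
	}

	return EXIT_SUCCESS;
}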

[1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@xxxxxxxxxxxxxxxxxxxx/
[2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238

> 								Honza
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR


