On Wed, Nov 21, 2018 at 6:27 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Tue, 20 Nov 2018 13:13:35 -0800 Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote: > > > > > I am Ok with whatever Andrew wants to do, if it is better to squash it with > > > > the original, then I can do that and send another patch. > > > > > > > > > > > > > > From experience, Andrew will food in fixups on request :) > > > > Andrew, could you squash this patch into the one titled ("mm: Add an > > F_SEAL_FUTURE_WRITE seal to memfd")? > > Sure. > > I could of course queue them separately but I rarely do so - I don't > think that the intermediate development states are useful in the > infinite-term, and I make them available via additional Link: tags in > the changelog footers anyway. > > I think that the magnitude of these patches is such that John Stultz's > Reviewed-by is invalidated, so this series is now in the "unreviewed" > state. > > So can we have a re-review please? For convenience, here's the > folded-together [1/1] patch, as it will go to Linus. > > > From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx> > Subject: mm: Add an F_SEAL_FUTURE_WRITE seal to memfd > > Android uses ashmem for sharing memory regions. We are looking forward to > migrating all usecases of ashmem to memfd so that we can possibly remove > the ashmem driver in the future from staging while also benefiting from > using memfd and contributing to it. Note staging drivers are also not ABI > and generally can be removed at anytime. > > One of the main usecases Android has is the ability to create a region and > mmap it as writeable, then add protection against making any "future" > writes while keeping the existing already mmap'ed writeable-region active. > This allows us to implement a usecase where receivers of the shared > memory buffer can get a read-only view, while the sender continues to > write to the buffer. See CursorWindow documentation in Android for more > details: > https://developer.android.com/reference/android/database/CursorWindow > > This usecase cannot be implemented with the existing F_SEAL_WRITE seal. > To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal > which prevents any future mmap and write syscalls from succeeding while > keeping the existing mmap active. The following program shows the seal > working in action: > > #include <stdio.h> > #include <errno.h> > #include <sys/mman.h> > #include <linux/memfd.h> > #include <linux/fcntl.h> > #include <asm/unistd.h> > #include <unistd.h> > #define F_SEAL_FUTURE_WRITE 0x0010 > #define REGION_SIZE (5 * 1024 * 1024) > > int memfd_create_region(const char *name, size_t size) > { > int ret; > int fd = syscall(__NR_memfd_create, name, MFD_ALLOW_SEALING); > if (fd < 0) return fd; > ret = ftruncate(fd, size); > if (ret < 0) { close(fd); return ret; } > return fd; > } > > int main() { > int ret, fd; > void *addr, *addr2, *addr3, *addr1; > ret = memfd_create_region("test_region", REGION_SIZE); > printf("ret=%d\n", ret); > fd = ret; > > // Create map > addr = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); > if (addr == MAP_FAILED) > printf("map 0 failed\n"); > else > printf("map 0 passed\n"); > > if ((ret = write(fd, "test", 4)) != 4) > printf("write failed even though no future-write seal " > "(ret=%d errno =%d)\n", ret, errno); > else > printf("write passed\n"); > > addr1 = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); > if (addr1 == MAP_FAILED) > perror("map 1 prot-write failed even though no seal\n"); > else > printf("map 1 prot-write passed as expected\n"); > > ret = fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE | > F_SEAL_GROW | > F_SEAL_SHRINK); > if (ret == -1) > printf("fcntl failed, errno: %d\n", errno); > else > printf("future-write seal now active\n"); > > if ((ret = write(fd, "test", 4)) != 4) > printf("write failed as expected due to future-write seal\n"); > else > printf("write passed (unexpected)\n"); > > addr2 = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); > if (addr2 == MAP_FAILED) > perror("map 2 prot-write failed as expected due to seal\n"); > else > printf("map 2 passed\n"); > > addr3 = mmap(0, REGION_SIZE, PROT_READ, MAP_SHARED, fd, 0); > if (addr3 == MAP_FAILED) > perror("map 3 failed\n"); > else > printf("map 3 prot-read passed as expected\n"); > } > > The output of running this program is as follows: > ret=3 > map 0 passed > write passed > map 1 prot-write passed as expected > future-write seal now active > write failed as expected due to future-write seal > map 2 prot-write failed as expected due to seal > : Permission denied > map 3 prot-read passed as expected > > [joel@xxxxxxxxxxxxxxxxx: make F_SEAL_FUTURE_WRITE seal more robust] > Link: http://lkml.kernel.org/r/20181120052137.74317-1-joel@xxxxxxxxxxxxxxxxx > Link: http://lkml.kernel.org/r/20181108041537.39694-1-joel@xxxxxxxxxxxxxxxxx > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> > Cc: John Stultz <john.stultz@xxxxxxxxxx> > Cc: John Reck <jreck@xxxxxxxxxx> > Cc: Todd Kjos <tkjos@xxxxxxxxxx> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> > Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx> > Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > Cc: Daniel Colascione <dancol@xxxxxxxxxx> > Cc: J. Bruce Fields <bfields@xxxxxxxxxxxx> > Cc: Jeff Layton <jlayton@xxxxxxxxxx> > Cc: Khalid Aziz <khalid.aziz@xxxxxxxxxx> > Cc: Lei Yang <Lei.Yang@xxxxxxxxxxxxx> > Cc: Marc-Andr Lureau <marcandre.lureau@xxxxxxxxxx> > Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx> > Cc: Minchan Kim <minchan@xxxxxxxxxx> > Cc: Shuah Khan <shuah@xxxxxxxxxx> > Cc: Valdis Kletnieks <valdis.kletnieks@xxxxxx> > Cc: Andy Lutomirski <luto@xxxxxxxxxx> > Cc: Jann Horn <jannh@xxxxxxxxxx> > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > --- > > > --- a/include/uapi/linux/fcntl.h~mm-add-an-f_seal_future_write-seal-to-memfd > +++ a/include/uapi/linux/fcntl.h > @@ -41,6 +41,7 @@ > #define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */ > #define F_SEAL_GROW 0x0004 /* prevent file from growing */ > #define F_SEAL_WRITE 0x0008 /* prevent writes */ > +#define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */ > /* (1U << 31) is reserved for signed error codes */ > > /* > --- a/mm/memfd.c~mm-add-an-f_seal_future_write-seal-to-memfd > +++ a/mm/memfd.c > @@ -131,7 +131,8 @@ static unsigned int *memfd_file_seals_pt > #define F_ALL_SEALS (F_SEAL_SEAL | \ > F_SEAL_SHRINK | \ > F_SEAL_GROW | \ > - F_SEAL_WRITE) > + F_SEAL_WRITE | \ > + F_SEAL_FUTURE_WRITE) > > static int memfd_add_seals(struct file *file, unsigned int seals) > { > --- a/fs/hugetlbfs/inode.c~mm-add-an-f_seal_future_write-seal-to-memfd > +++ a/fs/hugetlbfs/inode.c > @@ -530,7 +530,7 @@ static long hugetlbfs_punch_hole(struct > inode_lock(inode); > > /* protected by i_mutex */ > - if (info->seals & F_SEAL_WRITE) { > + if (info->seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) { > inode_unlock(inode); > return -EPERM; > } > --- a/mm/shmem.c~mm-add-an-f_seal_future_write-seal-to-memfd > +++ a/mm/shmem.c > @@ -2119,6 +2119,23 @@ out_nomem: > > static int shmem_mmap(struct file *file, struct vm_area_struct *vma) > { > + struct shmem_inode_info *info = SHMEM_I(file_inode(file)); > + > + /* > + * New PROT_READ and MAP_SHARED mmaps are not allowed when "future PROT_WRITE, perhaps? > + * write" seal active. > + */ > + if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE) && > + (info->seals & F_SEAL_FUTURE_WRITE)) > + return -EPERM; > + > + /* > + * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED read-only > + * mapping, take care to not allow mprotect to revert protections. > + */ > + if (info->seals & F_SEAL_FUTURE_WRITE) > + vma->vm_flags &= ~(VM_MAYWRITE); > + This might all be clearer as: if (info->seals & F_SEAL_FUTURE_WRITE) { if (vma->vm_flags ...) return -EPERM; vma->vm_flags &= ~VM_MAYWRITE; } with appropriate comments inserted.