Hi

On Fri, Jun 13, 2014 at 12:36 PM, David Herrmann <dh.herrmann@xxxxxxxxx> wrote:
> Hi
>
> This is v3 of the File-Sealing and memfd_create() patches. You can find v1 with
> a longer introduction at gmane:
>   http://thread.gmane.org/gmane.comp.video.dri.devel/102241
> An LWN article about memfd+sealing is available, too:
>   https://lwn.net/Articles/593918/
> v2 with some more discussions can be found here:
>   http://thread.gmane.org/gmane.linux.kernel.mm/115713
>
> This series introduces two new APIs:
>   memfd_create(): Think of this syscall as malloc() but it returns a
>                   file-descriptor instead of a pointer. That file-descriptor is
>                   backed by anon-memory and can be memory-mapped for access.
>   sealing: The sealing API can be used to prevent a specific set of operations
>            on a file-descriptor. You 'seal' the file and give thus the
>            guarantee, that it cannot be modified in the specific ways.
>
> A short high-level introduction is also available here:
>   http://dvdhrm.wordpress.com/2014/06/10/memfd_create2/
>
>
> Changed in v3:
>  - fcntl() now returns EINVAL if the FD does not support sealing. We used to
>    return EBADF like pipe_fcntl() does, but that is really weird and I don't
>    like repeating that.
>  - seals are now saved as "unsigned int" instead of "u32".
>  - i_mmap_writable is now an atomic so we can deny writable mappings just like
>    i_writecount does.
>  - SHMEM_ALLOW_SEALING is dropped. We initialize all objects with F_SEAL_SEAL
>    and only unset it for memfds that shall support sealing.
>  - memfd_create() no longer has a size argument. It was redundant, use
>    ftruncate() or fallocate().
>  - memfd_create() flags are "unsigned int" now, instead of "u64".
>  - NAME_MAX off-by-one fix
>  - several cosmetic changes
>  - Added AIO/Direct-IO page-pinning protection
>
> The last point is the most important change in this version: We now bail out if
> any page-refcount is elevated while setting SEAL_WRITE. This prevents parallel
> GUP users from writing to sealed files _after_ they were sealed. There is also a
> new FUSE-based test-case to trigger such situations.
>
> The last 2 patches try to improve the page-pinning handling. I included both in
> this series, but obviously only one of them is needed (or we could stack them):
>  - 6/7: This waits for up to 150ms for pages to be unpinned
>  - 7/7: This isolates pinned pages and replaces them with a fresh copy
>
> Hugh, patch 6 is basically your code. In case that gets merged, can I put your
> Signed-off-by on it?

Hugh, any comments on patch 5, 6 and 7? Those are the last outstanding
issues with memfd+sealing. Patch 7 (isolating pages) is still my
favorite and has been running just fine on my machine for the last
months. I think it'd be nice if we could give it a try in -next. We
can always fall back to Patch 5 or Patch 5+6. Those will detect any
racing AIO and just fail or wait for the IO to finish for a short
period.

Are there any other blockers for this?

Thanks
David