On Wed, Aug 31, 2022 at 05:24:39PM +0300, Kirill A . Shutemov wrote:
> On Sat, Aug 20, 2022 at 10:15:32PM -0700, Hugh Dickins wrote:
> > > I will try next week to rework it as shim to top of shmem. Does it work
> > > for you?
> >
> > Yes, please do, thanks. It's a compromise between us: the initial TDX
> > case has no justification to use shmem at all, but doing it that way
> > will help you with some of the infrastructure, and will probably be
> > easiest for KVM to extend to other more relaxed fd cases later.
>
> Okay, below is my take on the shim approach.
>
> I don't hate how it turned out. It is easier to understand without
> callback exchange thing.
>
> The only caveat is I had to introduce external lock to protect against
> race between lookup and truncate. Otherwise, looks pretty reasonable to me.
>
> I did very limited testing. And it lacks integration with KVM, but API
> changed not substantially, any it should be easy to adopt.

I have integrated this patch with the other KVM patches and verified that
it works well in a TDX environment, with the minor fix below.

>
> Any comments?
> ...
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 08f5f8304746..1853a90f49ff 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -261,7 +261,8 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
>  #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1)
>  #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN)
>
> -#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB)
> +#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | \
> +		       MFD_INACCESSIBLE)
>
>  SYSCALL_DEFINE2(memfd_create,
>  		const char __user *, uname,
> @@ -283,6 +284,14 @@ SYSCALL_DEFINE2(memfd_create,
>  			return -EINVAL;
>  	}
>
> +	/* Disallow sealing when MFD_INACCESSIBLE is set. */
> +	if ((flags & MFD_INACCESSIBLE) && (flags & MFD_ALLOW_SEALING))
> +		return -EINVAL;
> +
> +	/* TODO: add hugetlb support */
> +	if ((flags & MFD_INACCESSIBLE) && (flags & MFD_HUGETLB))
> +		return -EINVAL;
> +
>  	/* length includes terminating zero */
>  	len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1);
>  	if (len <= 0)
> @@ -331,10 +340,24 @@ SYSCALL_DEFINE2(memfd_create,
>  		*file_seals &= ~F_SEAL_SEAL;
>  	}
>
> +	if (flags & MFD_INACCESSIBLE) {
> +		struct file *inaccessible_file;
> +
> +		inaccessible_file = memfd_mkinaccessible(file);
> +		if (IS_ERR(inaccessible_file)) {
> +			error = PTR_ERR(inaccessible_file);
> +			goto err_file;
> +		}

The new file should also be marked as O_LARGEFILE; otherwise, setting an
initial size greater than 2^31 on the fd will be refused by ftruncate().

+		inaccessible_file->f_flags |= O_LARGEFILE;
+

> +
> +		file = inaccessible_file;
> +	}
> +
>  	fd_install(fd, file);
>  	kfree(name);
>  	return fd;
>
> +err_file:
> +	fput(file);
>  err_fd:
>  	put_unused_fd(fd);
>  err_name:
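
For anyone wondering why O_LARGEFILE matters here: as far as I understand
it, the plain ftruncate() path refuses any length above 2^31 - 1 unless the
file carries O_LARGEFILE. Below is a rough userspace model of that check,
not kernel code. The names model_ftruncate_check(), MODEL_O_LARGEFILE and
MODEL_MAX_NON_LFS are made up for illustration, and the constants are
copied by hand on the assumption that they match the asm-generic values.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Model constants, copied by hand for illustration (not part of the patch). */
#define MODEL_O_LARGEFILE 0100000            /* assumed asm-generic O_LARGEFILE */
#define MODEL_MAX_NON_LFS ((1LL << 31) - 1)  /* largest size allowed without LFS */

/*
 * Hypothetical userspace model of the size check on the ftruncate() path.
 * Returns 0 if the length would be accepted, -EINVAL otherwise.
 */
static int model_ftruncate_check(unsigned int f_flags, int64_t length)
{
	int small = 1;	/* the plain ftruncate() entry point starts out "small" */

	if (length < 0)
		return -EINVAL;

	/* A file flagged O_LARGEFILE is exempt from the 2^31 limit. */
	if (f_flags & MODEL_O_LARGEFILE)
		small = 0;

	if (small && length > MODEL_MAX_NON_LFS)
		return -EINVAL;

	return 0;
}

/* Usage: a 4 GiB initial size only passes once the flag is set. */
static void model_demo(void)
{
	int64_t four_gib = 4LL << 30;

	assert(model_ftruncate_check(0, four_gib) == -EINVAL);
	assert(model_ftruncate_check(MODEL_O_LARGEFILE, four_gib) == 0);
}
```

This matches what I saw before the fix: without the flag on the new file,
ftruncate() on the fd failed for sizes of 2 GiB and above.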