On Thu, Dec 01, 2022 at 06:16:46PM -0800, Vishal Annapurve wrote: > On Tue, Oct 25, 2022 at 8:18 AM Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx> wrote: > > ... > > +} > > + > > +SYSCALL_DEFINE1(memfd_restricted, unsigned int, flags) > > +{ > > Looking at the underlying shmem implementation, there seems to be no > way to enable transparent huge pages specifically for restricted memfd > files. > > Michael discussed earlier about tweaking > /sys/kernel/mm/transparent_hugepage/shmem_enabled setting to allow > hugepages to be used while backing restricted memfd. Such a change > will affect the rest of the shmem usecases as well. Even setting the > shmem_enabled policy to "advise" wouldn't help unless file based > advise for hugepage allocation is implemented. Had a look at fadvise() and looks it does not support HUGEPAGE for any filesystem yet. > > Does it make sense to provide a flag here to allow creating restricted > memfds backed possibly by huge pages to give a more granular control? We do have a unused 'flags' can be extended for such usage, but I would let Kirill have further look, perhaps need more discussions. Chao > > > + struct file *file, *restricted_file; > > + int fd, err; > > + > > + if (flags) > > + return -EINVAL; > > + > > + fd = get_unused_fd_flags(0); > > + if (fd < 0) > > + return fd; > > + > > + file = shmem_file_setup("memfd:restrictedmem", 0, VM_NORESERVE); > > + if (IS_ERR(file)) { > > + err = PTR_ERR(file); > > + goto err_fd; > > + } > > + file->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE; > > + file->f_flags |= O_LARGEFILE; > > + > > + restricted_file = restrictedmem_file_create(file); > > + if (IS_ERR(restricted_file)) { > > + err = PTR_ERR(restricted_file); > > + fput(file); > > + goto err_fd; > > + } > > + > > + fd_install(fd, restricted_file); > > + return fd; > > +err_fd: > > + put_unused_fd(fd); > > + return err; > > +} > > + > > +void restrictedmem_register_notifier(struct file *file, > > + struct restrictedmem_notifier *notifier) > > +{ > > + struct restrictedmem_data *data = file->f_mapping->private_data; > > + > > + mutex_lock(&data->lock); > > + list_add(¬ifier->list, &data->notifiers); > > + mutex_unlock(&data->lock); > > +} > > +EXPORT_SYMBOL_GPL(restrictedmem_register_notifier); > > + > > +void restrictedmem_unregister_notifier(struct file *file, > > + struct restrictedmem_notifier *notifier) > > +{ > > + struct restrictedmem_data *data = file->f_mapping->private_data; > > + > > + mutex_lock(&data->lock); > > + list_del(¬ifier->list); > > + mutex_unlock(&data->lock); > > +} > > +EXPORT_SYMBOL_GPL(restrictedmem_unregister_notifier); > > + > > +int restrictedmem_get_page(struct file *file, pgoff_t offset, > > + struct page **pagep, int *order) > > +{ > > + struct restrictedmem_data *data = file->f_mapping->private_data; > > + struct file *memfd = data->memfd; > > + struct page *page; > > + int ret; > > + > > + ret = shmem_getpage(file_inode(memfd), offset, &page, SGP_WRITE); > > + if (ret) > > + return ret; > > + > > + *pagep = page; > > + if (order) > > + *order = thp_order(compound_head(page)); > > + > > + SetPageUptodate(page); > > + unlock_page(page); > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(restrictedmem_get_page); > > -- > > 2.25.1 > >