Hi On Fri, Sep 30, 2022 at 5:23 PM Kirill A . Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> wrote: > > On Fri, Sep 30, 2022 at 05:14:00PM +0100, Fuad Tabba wrote: > > Hi, > > > > <...> > > > > > diff --git a/mm/memfd_inaccessible.c b/mm/memfd_inaccessible.c > > > new file mode 100644 > > > index 000000000000..2d33cbdd9282 > > > --- /dev/null > > > +++ b/mm/memfd_inaccessible.c > > > @@ -0,0 +1,219 @@ > > > +// SPDX-License-Identifier: GPL-2.0 > > > +#include "linux/sbitmap.h" > > > +#include <linux/memfd.h> > > > +#include <linux/pagemap.h> > > > +#include <linux/pseudo_fs.h> > > > +#include <linux/shmem_fs.h> > > > +#include <uapi/linux/falloc.h> > > > +#include <uapi/linux/magic.h> > > > + > > > +struct inaccessible_data { > > > + struct mutex lock; > > > + struct file *memfd; > > > + struct list_head notifiers; > > > +}; > > > + > > > +static void inaccessible_notifier_invalidate(struct inaccessible_data *data, > > > + pgoff_t start, pgoff_t end) > > > +{ > > > + struct inaccessible_notifier *notifier; > > > + > > > + mutex_lock(&data->lock); > > > + list_for_each_entry(notifier, &data->notifiers, list) { > > > + notifier->ops->invalidate(notifier, start, end); > > > + } > > > + mutex_unlock(&data->lock); > > > +} > > > + > > > +static int inaccessible_release(struct inode *inode, struct file *file) > > > +{ > > > + struct inaccessible_data *data = inode->i_mapping->private_data; > > > + > > > + fput(data->memfd); > > > + kfree(data); > > > + return 0; > > > +} > > > + > > > +static long inaccessible_fallocate(struct file *file, int mode, > > > + loff_t offset, loff_t len) > > > +{ > > > + struct inaccessible_data *data = file->f_mapping->private_data; > > > + struct file *memfd = data->memfd; > > > + int ret; > > > + > > > + if (mode & FALLOC_FL_PUNCH_HOLE) { > > > + if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len)) > > > + return -EINVAL; > > > + } > > > + > > > + ret = memfd->f_op->fallocate(memfd, mode, offset, len); > > > > I think that shmem_file_operations.fallocate is only set if > > CONFIG_TMPFS is enabled (shmem.c). Should there be a check at > > initialization that fallocate is set, or maybe a config dependency, or > > can we count on it always being enabled? > > It is already there: > > config MEMFD_CREATE > def_bool TMPFS || HUGETLBFS > > And we reject inaccessible memfd_create() for HUGETLBFS. > > But if we go with a separate syscall, yes, we need the dependency. I missed that, thanks. > > > > + inaccessible_notifier_invalidate(data, offset, offset + len); > > > + return ret; > > > +} > > > + > > > > <...> > > > > > +void inaccessible_register_notifier(struct file *file, > > > + struct inaccessible_notifier *notifier) > > > +{ > > > + struct inaccessible_data *data = file->f_mapping->private_data; > > > + > > > + mutex_lock(&data->lock); > > > + list_add(¬ifier->list, &data->notifiers); > > > + mutex_unlock(&data->lock); > > > +} > > > +EXPORT_SYMBOL_GPL(inaccessible_register_notifier); > > > > If the memfd wasn't marked as inaccessible, or more generally > > speaking, if the file isn't a memfd_inaccessible file, this ends up > > accessing an uninitialized pointer for the notifier list. Should there > > be a check for that here, and have this function return an error if > > that's not the case? > > I think it is "don't do that" category. inaccessible_register_notifier() > caller has to know what file it operates on, no? The thing is, you could oops the kernel from userspace. For that, all you have to do is a memfd_create without the MFD_INACCESSIBLE, followed by a KVM_SET_USER_MEMORY_REGION using that as the private_fd. I ran into this using my port of this patch series to arm64. Cheers, /fuad > -- > Kiryl Shutsemau / Kirill A. Shutemov