On Thu, Mar 10, 2022 at 10:09:01PM +0800, Chao Peng wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>
> It maintains a memfile_notifier list in shmem_inode_info structure and
> implements memfile_pfn_ops callbacks defined by memfile_notifier. It
> then exposes them to memfile_notifier via
> shmem_get_memfile_notifier_info.
>
> We use SGP_NOALLOC in shmem_get_lock_pfn since the pages should be
> allocated by userspace for private memory. If there is no pages
> allocated at the offset then error should be returned so KVM knows that
> the memory is not private memory.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Signed-off-by: Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx>
> ---
>  include/linux/shmem_fs.h |  4 +++
>  mm/shmem.c               | 76 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 80 insertions(+)
>
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 2dde843f28ef..7bb16f2d2825 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -9,6 +9,7 @@
>  #include <linux/percpu_counter.h>
>  #include <linux/xattr.h>
>  #include <linux/fs_parser.h>
> +#include <linux/memfile_notifier.h>
>
>  /* inode in-kernel data */
>
> @@ -28,6 +29,9 @@ struct shmem_inode_info {
>  	struct simple_xattrs	xattrs;		/* list of xattrs */
>  	atomic_t		stop_eviction;	/* hold when working on inode */
>  	unsigned int		xflags;		/* shmem extended flags */
> +#ifdef CONFIG_MEMFILE_NOTIFIER
> +	struct memfile_notifier_list memfile_notifiers;
> +#endif
>  	struct inode		vfs_inode;
>  };
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 9b31a7056009..7b43e274c9a2 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -903,6 +903,28 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index)
>  	return page ? page_folio(page) : NULL;
>  }
>
> +static void notify_fallocate(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +#ifdef CONFIG_MEMFILE_NOTIFIER
> +	struct shmem_inode_info *info = SHMEM_I(inode);
> +
> +	memfile_notifier_fallocate(&info->memfile_notifiers, start, end);
> +#endif
> +}

*notify_populate(), not fallocate. This is a notification that a
range has been populated, not that the fallocate() syscall was run
to populate the backing store of a file.

i.e. fallocate() is the name of a userspace filesystem API that can
be used to manipulate the backing store of a file in various ways.
It can both populate and punch away the backing store of a file, and
some operations that fallocate() can run will do both (e.g.
FALLOC_FL_ZERO_RANGE) and so could generate both notify_invalidate()
and notify_populate() events.

Hence "fallocate" as an internal mm namespace or operation does not
belong anywhere in core MM infrastructure - it should never get used
anywhere other than the VFS/filesystem layers that implement the
fallocate() syscall or use it directly.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
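
To make the naming argument above concrete, here is a rough userspace
sketch (not kernel code, and not taken from the patch) of how a single
fallocate() mode could map onto range events. The enum, its values and
events_for_fallocate_mode() are hypothetical names invented for this
illustration; only the FALLOC_FL_* flags come from <linux/falloc.h>.
The mapping itself is an assumption that follows the description in
the review: plain preallocation populates, hole punching invalidates,
and zeroing a range could do both.

```c
#include <assert.h>
#include <linux/falloc.h>	/* FALLOC_FL_* mode flags */

/* Hypothetical event bits, named for what happened to the range. */
enum range_event {
	RANGE_INVALIDATE = 1 << 0,	/* backing store was removed */
	RANGE_POPULATE   = 1 << 1,	/* backing store is now present */
};

/*
 * Illustrative mapping only: which events one fallocate() mode could
 * generate. Modes this sketch does not model return 0.
 */
unsigned int events_for_fallocate_mode(int mode)
{
	/* KEEP_SIZE only affects i_size, not which events are raised. */
	mode &= ~FALLOC_FL_KEEP_SIZE;

	switch (mode) {
	case 0:				/* plain preallocation */
		return RANGE_POPULATE;
	case FALLOC_FL_PUNCH_HOLE:	/* deallocate the range */
		return RANGE_INVALIDATE;
	case FALLOC_FL_ZERO_RANGE:	/* may drop and reallocate blocks */
		return RANGE_INVALIDATE | RANGE_POPULATE;
	default:
		return 0;
	}
}

/* Hypothetical counters standing in for registered notifier callbacks. */
struct event_counts {
	unsigned int invalidated;
	unsigned int populated;
};

/* Raise events named after the effect on [start, end), not the syscall. */
void notify_range(struct event_counts *c, int mode,
		  unsigned long start, unsigned long end)
{
	unsigned int ev = events_for_fallocate_mode(mode);

	assert(start <= end);
	if (ev & RANGE_INVALIDATE)
		c->invalidated++;
	if (ev & RANGE_POPULATE)
		c->populated++;
}
```

Since one syscall can fan out into zero, one or two events, naming
the callbacks after the effect on the range keeps the syscall name
out of the notifier interface.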