On Thu, Jan 30 2025, Luis Henriques wrote: > Userspace filesystems can push data for a specific inode without it being > explicitly requested. This can be accomplished by using NOTIFY_STORE. > However, this may race against another process performing different > operations on the same inode. > > If, for example, there is a process reading from it, it may happen that it > will block waiting for data to be available (locking the folio), while the > FUSE server will also block trying to lock the same folio to update it with > the inode data. > > The easiest solution, as suggested by Miklos, is to allow the userspace > filesystem to skip locked folios. > > Link: https://lore.kernel.org/CH2PR14MB41040692ABC50334F500789ED6C89@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx > Reported-by: Teng Qin <tqin@xxxxxxxxxxxxxxx> > Originally-by: Miklos Szeredi <miklos@xxxxxxxxxx> > Signed-off-by: Luis Henriques <luis@xxxxxxxxxx> > --- > Hi! > > Here's v2. Other than fixing the bug pointed out by Bernd (thanks!), I've > also added an explanation to the 'XXX' comment. As a matter of fact, I've > took another look at that code, and I felt compelled to remove that comment, > as using PAGE_SIZE seems to be the right thing. > > Anyway, I'm still thinking that probably NOTIFY_STORE should *always* have > this behaviour, without the need for userspace to explicitly setting a flag. Gentle ping. I was wondering if you have any thoughts on this patch. Specially regarding the behaviour change I'm suggesting above. (Also, as I've mentioned before, I'm using the 'Originally-by' tag; not sure this is the right thing to do. Obviously, I'm fine dropping my s-o-b, as I'm not the original author.) Cheers, -- Luís > Changes since v1: > - Only skip if __filemap_get_folio() returns -EAGAIN (Bernd) > > fs/fuse/dev.c | 30 +++++++++++++++++++++++------- > include/uapi/linux/fuse.h | 8 +++++++- > 2 files changed, 30 insertions(+), 8 deletions(-) > > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c > index 27ccae63495d..309651f82ca4 100644 > --- a/fs/fuse/dev.c > +++ b/fs/fuse/dev.c > @@ -1630,6 +1630,7 @@ static int fuse_notify_store(struct fuse_conn *fc, unsigned int size, > unsigned int num; > loff_t file_size; > loff_t end; > + int fgp_flags = FGP_LOCK | FGP_ACCESSED | FGP_CREAT; > > err = -EINVAL; > if (size < sizeof(outarg)) > @@ -1645,6 +1646,9 @@ static int fuse_notify_store(struct fuse_conn *fc, unsigned int size, > > nodeid = outarg.nodeid; > > + if (outarg.flags & FUSE_NOTIFY_STORE_NOWAIT) > + fgp_flags |= FGP_NOWAIT; > + > down_read(&fc->killsb); > > err = -ENOENT; > @@ -1668,14 +1672,26 @@ static int fuse_notify_store(struct fuse_conn *fc, unsigned int size, > struct page *page; > unsigned int this_num; > > - folio = filemap_grab_folio(mapping, index); > - err = PTR_ERR(folio); > - if (IS_ERR(folio)) > - goto out_iput; > + folio = __filemap_get_folio(mapping, index, fgp_flags, > + mapping_gfp_mask(mapping)); > + err = PTR_ERR_OR_ZERO(folio); > + if (err) { > + if (!(outarg.flags & FUSE_NOTIFY_STORE_NOWAIT) || > + (err != -EAGAIN)) > + goto out_iput; > + page = NULL; > + /* XXX is it OK to use PAGE_SIZE here? */ > + this_num = min_t(unsigned int, num, PAGE_SIZE - offset); > + } else { > + page = &folio->page; > + this_num = min_t(unsigned int, num, > + folio_size(folio) - offset); > + } > > - page = &folio->page; > - this_num = min_t(unsigned, num, folio_size(folio) - offset); > err = fuse_copy_page(cs, &page, offset, this_num, 0); > + if (!page) > + goto skip; > + > if (!folio_test_uptodate(folio) && !err && offset == 0 && > (this_num == folio_size(folio) || file_size == end)) { > folio_zero_segment(folio, this_num, folio_size(folio)); > @@ -1683,7 +1699,7 @@ static int fuse_notify_store(struct fuse_conn *fc, unsigned int size, > } > folio_unlock(folio); > folio_put(folio); > - > +skip: > if (err) > goto out_iput; > > diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h > index e9e78292d107..59725f89340e 100644 > --- a/include/uapi/linux/fuse.h > +++ b/include/uapi/linux/fuse.h > @@ -576,6 +576,12 @@ struct fuse_file_lock { > */ > #define FUSE_EXPIRE_ONLY (1 << 0) > > +/** > + * notify_store flags > + * FUSE_NOTIFY_STORE_NOWAIT: skip locked pages > + */ > +#define FUSE_NOTIFY_STORE_NOWAIT (1 << 0) > + > /** > * extension type > * FUSE_MAX_NR_SECCTX: maximum value of &fuse_secctx_header.nr_secctx > @@ -1075,7 +1081,7 @@ struct fuse_notify_store_out { > uint64_t nodeid; > uint64_t offset; > uint32_t size; > - uint32_t padding; > + uint32_t flags; > }; > > struct fuse_notify_retrieve_out {