On Mon, Feb 24 2025, Miklos Szeredi wrote:

> On Mon, 24 Feb 2025 at 15:30, Luis Henriques <luis@xxxxxxxxxx> wrote:
>>
>> On Mon, Feb 24 2025, Miklos Szeredi wrote:
>>
>> > On Thu, 30 Jan 2025 at 11:16, Luis Henriques <luis@xxxxxxxxxx> wrote:
>> >>
>> >> Userspace filesystems can push data for a specific inode without it being
>> >> explicitly requested.  This can be accomplished by using NOTIFY_STORE.
>> >> However, this may race against another process performing different
>> >> operations on the same inode.
>> >>
>> >> If, for example, there is a process reading from it, it may happen that it
>> >> will block waiting for data to be available (locking the folio), while the
>> >> FUSE server will also block trying to lock the same folio to update it with
>> >> the inode data.
>> >>
>> >> The easiest solution, as suggested by Miklos, is to allow the userspace
>> >> filesystem to skip locked folios.
>> >
>> > Not sure.
>> >
>> > The easiest solution is to make the server perform the two operations
>> > independently.  I.e. never trigger a notification from a request.
>> >
>> > This is true of other notifications, e.g. doing FUSE_NOTIFY_DELETE
>> > during e.g. FUSE_RMDIR will deadlock on i_mutex.
>>
>> Hmmm... OK, the NOTIFY_DELETE and NOTIFY_INVAL_ENTRY deadlocks are
>> documented (in libfuse, at least).  So, maybe this one could be added to
>> the list of notifications that could deadlock.  However, IMHO, it would be
>> great if this could be fixed instead.
>>
>> > Or am I misunderstanding the problem?
>>
>> I believe the initial report[1] actually adds a specific use-case where
>> the deadlock can happen when the server performs the two operations
>> independently.  For example:
>>
>>   - An application reads 4K of data at offset 0
>>   - The server gets a read request.  It performs the read, and gets more
>>     data than the data requested (say 4M)
>>   - It caches this data in userspace and replies to VFS with 4K of data
>>   - The server does a notify_store with the remaining data
>>   - In the meantime the userspace application reads 4K more at offset 4K
>>
>> The last 2 operations can race and the server may deadlock if the
>> application already has locked the page where data will be read into.
>
> I don't see the deadlock.  If the race was won by the read, then it
> will proceed with FUSE_READ and fetch the data from the server.  When
> this is finished, NOTIFY_STORE will overwrite the page with the same
> data.

OK, that makes sense.  It took me a bit to go through all this again, but
I agree that the only thing to do then is probably to add a warning to the
libfuse API documentation, in fuse_lowlevel_notify_store(), as shown
below.  (I'll prepare an MR for that.)

Thank you, Miklos.

Cheers,
-- 
Luís

diff --git a/include/fuse_lowlevel.h b/include/fuse_lowlevel.h
index 93bcba296c2d..d1f9717347da 100644
--- a/include/fuse_lowlevel.h
+++ b/include/fuse_lowlevel.h
@@ -1845,6 +1845,10 @@ int fuse_lowlevel_notify_delete(struct fuse_session *se,
  * If the stored data overflows the current file size, then the size
  * is extended, similarly to a write(2) on the filesystem.
  *
+ * To avoid a deadlock this function must not be called while executing
+ * a related filesystem operation (e.g. while replying to a FUSE_READ
+ * request).
+ *
  * If this function returns an error, then the store wasn't fully
  * completed, but it may have been partially completed.
  *