On Mon, Jul 31, 2023 at 11:33:05AM +1000, Dave Chinner wrote: > On Thu, Jul 27, 2023 at 04:27:30PM +0200, Christian Brauner wrote: > > On Thu, Jul 27, 2023 at 07:51:19PM +0800, Hao Xu wrote: > > > I actually saw this semaphore, and there is another xfs lock in > > > file_accessed > > > --> touch_atime > > > --> inode_update_time > > > --> inode->i_op->update_time == xfs_vn_update_time > > > > > > Forgot to point them out in the cover-letter..., I didn't modify them > > > since I'm not very sure about if we should do so, and I saw Stefan's > > > patchset didn't modify them too. > > > > > > My personnal thinking is we should apply trylock logic for this > > > inode->i_rwsem. For xfs lock in touch_atime, we should do that since it > > > doesn't make sense to rollback all the stuff while we are almost at the > > > end of getdents because of a lock. > > > > That manoeuvres around the problem. Which I'm slightly more sensitive > > too as this review is a rather expensive one. > > > > Plus, it seems fixable in at least two ways: > > > > For both we need to be able to tell the filesystem that a nowait atime > > update is requested. Simple thing seems to me to add a S_NOWAIT flag to > > file_time_flags and passing that via i_op->update_time() which already > > has a flag argument. That would likely also help kiocb_modified(). > > Wait - didn't we already fix this for mtime updates on IOCB_NOWAIT > modification operations? Yeah, we did: > > kiocb_modified(iocb) > file_modified_flags(iocb->ki_file, iocb->ki_flags) > .... > ret = inode_needs_update_time() > if (ret <= 0) > return ret; > if (flags & IOCB_NOWAIT) > return -EAGAIN; > <does timestamp update> > > > file_accessed() > > -> touch_atime() > > -> inode_update_time() > > -> i_op->update_time == xfs_vn_update_time() > > Yeah, so this needs the same treatment as file_modified_flags() - > touch_atime() needs a flag variant that passes IOCB_NOWAIT, and > after atime_needs_update() returns trues we should check IOCB_NOWAIT > and return EAGAIN if it is set. That will punt the operation that > needs to the update to a worker thread that can block.... As I tried to explain, I would prefer if we could inform the filesystem through i_op->update_time() itself that this is async and give the filesystem the ability to try and acquire the locks it needs and return EAGAIN from i_op->update_time() itself if it can't acquire them. > > > Then we have two options afaict: > > > > (1) best-effort atime update > > > > file_accessed() already has the builtin assumption that updating atime > > might fail for other reasons - see the comment in there. So it is > > somewhat best-effort already. > > > > (2) move atime update before calling into filesystem > > > > If we want to be sure that access time is updated when a readdir request > > is issued through io_uring then we need to have file_accessed() give a > > return value and expose a new helper for io_uring or modify > > vfs_getdents() to do something like: > > > > vfs_getdents() > > { > > if (nowait) > > down_read_trylock() > > > > if (!IS_DEADDIR(inode)) { > > ret = file_accessed(file); > > if (ret == -EAGAIN) > > goto out_unlock; > > > > f_op->iterate_shared() > > } > > } > > Yup, that's the sort of thing that needs to be done. > > But as I said in the "llseek for io-uring" thread, we need to stop > the game of whack-a-mole passing random nowait boolean flags to VFS > operations before it starts in earnest. We really need a common > context structure (like we have a kiocb for IO operations) that > holds per operation control state so we have consistency across all > the operations that we need different behaviours for. Yes, I tend to agree and thought about the same. But right now we don't have a lot of context. So I would lean towards a flag argument at most. But I also wouldn't consider it necessarily wrong to start with booleans or a flag first and in a couple of months if the need for more context arises we know what kind of struct we want or need.