On Thu, Oct 03, 2024 at 02:46:19PM +0200, Jan Kara wrote: > On Thu 03-10-24 05:18:30, Christoph Hellwig wrote: > > On Thu, Oct 03, 2024 at 01:45:55PM +0200, Jan Kara wrote: > > > /* Find next inode on the inode list eligible for processing */ > > > #define sb_inode_iter_next(sb, inode, old_inode, inode_eligible) \ > > > ({ \ > > > struct inode *ret = NULL; \ > > > > <snip> > > > > > ret; \ > > > }) > > > > How is this going to interact with calling into the file system > > to do the interaction, which is kinda the point of this series? > > Yeah, I was concentrated on the VFS bits and forgot why Dave wrote this > series in the first place. So this style of iterator isn't useful for what > Dave wants to achieve. Sorry for the noise. Still the possibility to have a > callback under inode->i_lock being able to do stuff and decide whether we I did that first, and turned into an utter mess the moment we step outside the existing iterator mechanism. I implemented a separate XFS icwalk function because having to hold the inode->i_lock between inode lookup and the callback function means we cannot do batched inode lookups from the radix tree. The existing icwalk code grabs 32 inodes at a time from the radix tree and validates them all, then runs the callback on them one at a time, then it drops them all. If the VFS inode callback requires the inode i_lock to be held and be atomic with the initial state checks, then we have to nest 32 spinlocks in what is effectively a random lock order. So I implemented a non-batched icwalk method, and it didn't get that much cleaner. It wasn't until I dropped the inode->i_lock from the callback API that everything cleaned up and the offload mechanism started to make sense. And it was this change that also makes it possible for XFs to use it's existing batched lookup mechanisms instead of the special case implementation that I wrote for this patch set. IOWs, we can't hold the inode->i_lock across lookup validation to callback if we want to provide freedom of implementation to the filesystem specific code. We aren't really concerned about performance of traversals, so I went with freedom of implementation over clunky locking semantics to optimise away a couple of atomic ops per inode for iget/iput calls. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx