On Wed, Mar 20, 2024 at 05:58:53PM +0100, Andre Noll wrote:
> On Wed, Mar 20, 07:53, Darrick J. Wong wrote
> > On Wed, Mar 20, 2024 at 09:39:57AM +0100, Andre Noll wrote:
> > > On Tue, Mar 19, 11:16, Dave Chinner wrote
> > > > +	/*
> > > > +	 * Well, that sucks. Put the inode back on the inactive queue.
> > > > +	 * Do this while still under the ILOCK so that we can set the
> > > > +	 * NEED_INACTIVE flag and clear the INACTIVATING flag an not
> > > > +	 * have another lookup race with us before we've finished
> > > > +	 * putting the inode back on the inodegc queue.
> > > > +	 */
> > > > +	spin_unlock(&ip->i_flags_lock);
> > > > +	ip->i_flags |= XFS_NEED_INACTIVE;
> > > > +	ip->i_flags &= ~XFS_INACTIVATING;
> > > > +	spin_unlock(&ip->i_flags_lock);
> > >
> > > This doesn't look right. Shouldn't the first spin_unlock() be spin_lock()?
> >
> > Yes. So much for my hand inspection of code. :(
>
> Given enough hand inspections, all bugs are shallow :)

Sparse should have found that, if I ran it. :/

Ah, but sparse gets confused by the fact that the return from the
function may or may not have unlocked stuff:

fs/xfs/xfs_icache.c:355:9: warning: context imbalance in 'xfs_iget_recycle' - unexpected unlock
fs/xfs/xfs_icache.c:414:28: warning: context imbalance in 'xfs_iget_reactivate' - unexpected unlock
fs/xfs/xfs_icache.c:656:28: warning: context imbalance in 'xfs_iget_cache_hit' - different lock contexts for basic block

So if I fix that (that'll be patch 5 for this series), I get:

  CC      fs/xfs/xfs_icache.o
  CHECK   fs/xfs/xfs_icache.c
fs/xfs/xfs_icache.c:459:28: warning: context imbalance in 'xfs_iget_reactivate' - unexpected unlock

Yup, sparse now catches the unbalanced locking. I just haven't thought
to run sparse on XFS recently - running sparse on a full kernel build
is just .... awful. I think I'll change my build script so that when I
do an '--xfs-only' build it also enables sparse, as it's only
rebuilding fs/xfs at that point....

> > (Doesn't simple lock debugging catch these sorts of things?)
>
> Maybe this error path doesn't get exercised because xfs_reinit_inode()
> never fails. AFAICT, it can only fail if security_inode_alloc()
> can't allocate the composite inode blob.

Which syzkaller triggers every so often. I also do all my testing with
selinux enabled, so security_inode_alloc() is actually being exercised
and definitely has the potential to fail on my small memory configs...

> > ((It sure would be nice if locking returned a droppable "object" to do
> > the unlock ala Rust and then spin_lock could be __must_check.))
>
> There's the *LOCK_GUARD* macros which employ gcc's cleanup attribute
> to automatically call e.g. spin_unlock() when a variable goes out of
> scope (see 54da6a0924311).

IMO, the LOCK_GUARD stuff is an awful anti-pattern. It means some error
paths -look broken- because they lack unlocks, and we have to
explicitly change code to return from functions with the guarded locks
held.

This is a diametrically opposed locking pattern to the existing
non-guarded locking patterns - correct behaviour in one pattern is
broken behaviour in the other, and vice versa. That's just -insane-
from a code maintenance point of view.

And they are completely useless for anything complex like these XFS
icache functions because the lock scope is not balanced across
functions. The lock can also be taken by functions called within the
guard scope, so using guarded lock scoping would result in deadlocks.
i.e. xfs_inodegc_queue() needs to take the i_flags_lock, so it must be
dropped before we call that (see the sketch below).
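To make that concrete, here's a rough sketch - based on the hunk quoted
above with Andre's spin_lock() fix applied, not a verbatim copy of
anything in the series - of why guard scoping can't cover this path.
The explicit pattern drops the lock before calling back into code that
takes it again:

	/* explicit locking: fix up the flags, drop the lock, then queue */
	spin_lock(&ip->i_flags_lock);
	ip->i_flags |= XFS_NEED_INACTIVE;
	ip->i_flags &= ~XFS_INACTIVATING;
	spin_unlock(&ip->i_flags_lock);

	xfs_inodegc_queue(ip);		/* takes i_flags_lock internally */

	/*
	 * guarded locking: the unlock is tied to the end of the scope, so
	 * calling xfs_inodegc_queue() from inside the guard self-deadlocks
	 * on i_flags_lock, and hoisting the call out of the scope just
	 * reinvents the explicit pattern above.
	 */
	scoped_guard(spinlock, &ip->i_flags_lock) {
		ip->i_flags |= XFS_NEED_INACTIVE;
		ip->i_flags &= ~XFS_INACTIVATING;
		xfs_inodegc_queue(ip);	/* deadlock: lock still held */
	}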
So, yeah, lock guards seem to me to be largely just a "look ma, no need
for rust because we can mightily abuse the C preprocessor!" anti-pattern
looking for a problem to solve.

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx