On Tue, Jul 02, 2024 at 03:55:10PM +1000, Dave Chinner wrote:
> On Sat, Jun 22, 2024 at 05:44:11PM +0800, Long Li wrote:
> > On Tue, Jan 16, 2024 at 09:59:45AM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > In the past we've had problems with lockdep false positives stemming
> > > from inode locking occurring in memory reclaim contexts (e.g. from
> > > superblock shrinkers). Lockdep doesn't know that inodes access from
> > > above memory reclaim cannot be accessed from below memory reclaim
> > > (and vice versa) but there has never been a good solution to solving
> > > this problem with lockdep annotations.
> > >
> > > This situation isn't unique to inode locks - buffers are also locked
> > > above and below memory reclaim, and we have to maintain lock
> > > ordering for them - and against inodes - appropriately. IOWs, the
> > > same code paths and locks are taken both above and below memory
> > > reclaim and so we always need to make sure the lock orders are
> > > consistent. We are spared the lockdep problems this might cause
> > > by the fact that semaphores and bit locks aren't covered by lockdep.
> > >
> > > In general, this sort of lockdep false positive detection is cause
> > > by code that runs GFP_KERNEL memory allocation with an actively
> > > referenced inode locked. When it is run from a transaction, memory
> > > allocation is automatically GFP_NOFS, so we don't have reclaim
> > > recursion issues. So in the places where we do memory allocation
> > > with inodes locked outside of a transaction, we have explicitly set
> > > them to use GFP_NOFS allocations to prevent lockdep false positives
> > > from being reported if the allocation dips into direct memory
> > > reclaim.
> > >
> > > More recently, __GFP_NOLOCKDEP was added to the memory allocation
> > > flags to tell lockdep not to track that particular allocation for
> > > the purposes of reclaim recursion detection. This is a much better
> > > way of preventing false positives - it allows us to use GFP_KERNEL
> > > context outside of transactions, and allows direct memory reclaim to
> > > proceed normally without throwing out false positive deadlock
> > > warnings.
> >
> > Hi Dave,
> >
> > I recently encountered the following AA deadlock lockdep warning
> > in Linux-6.9.0. This version of the kernel has currently merged
> > your patch set. I believe this is a lockdep false positive warning.
>
> Yes, it is.
>
> > The xfs_dir_lookup_args() function is in a non-transactional context
> > and allocates memory with the __GFP_NOLOCKDEP flag in xfs_buf_alloc_pages().
> > Even though __GFP_NOLOCKDEP can tell lockdep not to track that particular
> > allocation for the purposes of reclaim recursion detection, it cannot
> > completely replace __GFP_NOFS.
>
> We are not trying to replace GFP_NOFS with __GFP_NOLOCKDEP. What we
> are trying to do is annotate the allocation sites where lockdep
> false positives will occur. That way if we get a lockdep report from
> a location that uses __GFP_NOLOCKDEP, we know that it is either a
> false positive or there is some nested allocation that did not honor
> __GFP_NOLOCKDEP.
>
> We've already fixed a bunch of nested allocations (e.g. kasan,
> kmemleak, etc) to propagate the __GFP_NOLOCKDEP flag so they don't
> generate false positives, either. So the amount of noise has already
> been reduced.
>
> > Getting trapped in direct memory reclaim
> > maybe trigger the AA deadlock warning as shown below.
>
> No, it can't. xfs_dir_lookup() can only lock referenced inodes.
> xfs_reclaim_inodes_nr() can only lock unreferenced inodes. It is not
> possible for the same inode to be both referenced and unreferenced
> at the same time, therefore memory reclaim cannot self deadlock
> through this path.

Yes, I know. An AA deadlock couldn't happen in this situation because
it's not the same inode, so it's just a lockdep false positive warning.
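
To make the referenced/unreferenced argument concrete, here is a toy
userspace sketch of it. This is not XFS code: every name in it
(toy_inode, toy_lookup_lock(), toy_reclaim_lock(), and so on) is made up
for illustration, and a plain flag stands in for the inode lock. It only
models the rule that the lookup side locks referenced inodes and the
reclaim side locks unreferenced ones, then counts how often reclaim
would run into a lock the lookup already holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy userspace model, NOT XFS code. All names here are hypothetical
 * and exist only for this illustration.
 */
struct toy_inode {
	int  refcount;	/* > 0: referenced, 0: unreferenced */
	bool locked;	/* stands in for the inode lock */
};

enum reclaim_result {
	RECLAIM_SKIPPED,	/* referenced: reclaim never touches it */
	RECLAIM_LOCKED,		/* unreferenced and now locked by reclaim */
	RECLAIM_WOULD_BLOCK,	/* lock already held: the AA case */
};

struct toy_stats {
	int reclaimed;		/* inodes reclaim managed to lock */
	int would_block;	/* times reclaim hit an already-held lock */
};

/* Lookup side: may only lock an inode it holds a reference to. */
static bool toy_lookup_lock(struct toy_inode *ip)
{
	if (ip->refcount <= 0 || ip->locked)
		return false;
	ip->locked = true;
	return true;
}

/* Reclaim side: may only lock an inode with no references. */
static enum reclaim_result toy_reclaim_lock(struct toy_inode *ip)
{
	if (ip->refcount > 0)
		return RECLAIM_SKIPPED;
	if (ip->locked)
		return RECLAIM_WOULD_BLOCK;
	ip->locked = true;
	return RECLAIM_LOCKED;
}

/*
 * Lock inodes[dir] as a lookup would, then run "reclaim" over the whole
 * array while that lock is still held, as if an allocation made under
 * the lock had entered direct reclaim.
 */
static struct toy_stats toy_lookup_then_reclaim(struct toy_inode *inodes,
						size_t n, size_t dir)
{
	struct toy_stats st = { 0, 0 };

	if (!toy_lookup_lock(&inodes[dir]))
		return st;

	for (size_t i = 0; i < n; i++) {
		switch (toy_reclaim_lock(&inodes[i])) {
		case RECLAIM_LOCKED:
			st.reclaimed++;
			inodes[i].locked = false;	/* reclaim is done with it */
			break;
		case RECLAIM_WOULD_BLOCK:
			st.would_block++;
			break;
		case RECLAIM_SKIPPED:
			break;
		}
	}

	inodes[dir].locked = false;	/* lookup drops its lock */
	return st;
}
```

With one referenced inode (the directory being looked up) and two
unreferenced ones, reclaim locks only the two unreferenced inodes and
would_block stays at zero: the inode held by the lookup is filtered out
by the reference check before its lock is ever examined. Lockdep has no
view of that filter, it only sees two acquisitions in the same lock
class, which is why it reports the nesting.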
>
> I expected to see some situations like this when getting rid of
> GFP_NOFS (because now memory reclaim runs in places it never used
> to). Once I have an idea of the sorts of false positives that are
> still being tripped over, I can formulate a plan to eradicate them,
> too.

Ok, memory reclaim may run in those places where GFP_NOFS is removed.
Some new lockdep false positive warnings may appear. I hope this report
can help you eradicate them in the future.

Thanks for your reply. :)

>
> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx