On Thu, Aug 22, 2019 at 02:19:04PM +0200, Vlastimil Babka wrote: > On 8/22/19 2:07 PM, Dave Chinner wrote: > > On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote: > > > > No, the problem is this (using kmalloc as a general term for > > allocation, whether it be kmalloc, kmem_cache_alloc, alloc_page, etc) > > > > some random kernel code > > kmalloc(GFP_KERNEL) > > reclaim > > PF_MEMALLOC > > shrink_slab > > xfs_inode_shrink > > XFS_ILOCK > > xfs_buf_allocate_memory() > > kmalloc(GFP_KERNEL) > > > > And so locks on inodes in reclaim are seen below reclaim. Then > > somewhere else we have: > > > > some high level read-only xfs code like readdir > > XFS_ILOCK > > xfs_buf_allocate_memory() > > kmalloc(GFP_KERNEL) > > reclaim > > > > And this one throws false positive lockdep warnings because we > > called into reclaim with XFS_ILOCK held and GFP_KERNEL alloc > > OK, and what exactly makes this positive a false one? Why can't it continue like > the first example where reclaim leads to another XFS_ILOCK, thus deadlock? Because above reclaim we only have operations being done on referenced inodes, and below reclaim we only have unreferenced inodes. We never lock the same inode both above and below reclaim at the same time. IOWs, an operation above reclaim cannot see, access or lock unreferenced inodes, except in inode write clustering, and that uses trylocks so cannot deadlock with reclaim. An operation below reclaim cannot see, access or lock referenced inodes except during inode write clustering, and that uses trylocks so cannot deadlock with code above reclaim. FWIW, I'm trying to make the inode writeback clustering go away from reclaim at the moment, so even that possibility is going away soon. That will change everything to trylocks in reclaim context, so lockdep is going to stop tracking it entirely. Hmmm - maybe we're getting to the point where we actually don't need GFP_NOFS/PF_MEMALLOC_NOFS at all in XFS anymore..... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx