On Thu 22-08-19 16:26:42, Vlastimil Babka wrote: > On 8/22/19 3:17 PM, Dave Chinner wrote: > > On Thu, Aug 22, 2019 at 02:19:04PM +0200, Vlastimil Babka wrote: > >> On 8/22/19 2:07 PM, Dave Chinner wrote: > >> > On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote: > >> > > >> > No, the problem is this (using kmalloc as a general term for > >> > allocation, whether it be kmalloc, kmem_cache_alloc, alloc_page, etc) > >> > > >> > some random kernel code > >> > kmalloc(GFP_KERNEL) > >> > reclaim > >> > PF_MEMALLOC > >> > shrink_slab > >> > xfs_inode_shrink > >> > XFS_ILOCK > >> > xfs_buf_allocate_memory() > >> > kmalloc(GFP_KERNEL) > >> > > >> > And so locks on inodes in reclaim are seen below reclaim. Then > >> > somewhere else we have: > >> > > >> > some high level read-only xfs code like readdir > >> > XFS_ILOCK > >> > xfs_buf_allocate_memory() > >> > kmalloc(GFP_KERNEL) > >> > reclaim > >> > > >> > And this one throws false positive lockdep warnings because we > >> > called into reclaim with XFS_ILOCK held and GFP_KERNEL alloc > >> > >> OK, and what exactly makes this positive a false one? Why can't it continue like > >> the first example where reclaim leads to another XFS_ILOCK, thus deadlock? > > > > Because above reclaim we only have operations being done on > > referenced inodes, and below reclaim we only have unreferenced > > inodes. We never lock the same inode both above and below reclaim > > at the same time. > > > > IOWs, an operation above reclaim cannot see, access or lock > > unreferenced inodes, except in inode write clustering, and that uses > > trylocks so cannot deadlock with reclaim. > > > > An operation below reclaim cannot see, access or lock referenced > > inodes except during inode write clustering, and that uses trylocks > > so cannot deadlock with code above reclaim. > > Thanks for elaborating. Perhaps lockdep experts (not me) would know how to > express that. If not possible, then replacing GFP_NOFS with __GFP_NOLOCKDEP > should indeed suppress the warning, while allowing FS reclaim. This was certainly my hope to happen when introducing __GFP_NOLOCKDEP. I couldn't have done the second step because that requires a deep understanding of the code in question which is beyond my capacity. It seems we still haven't found a brave soul to start converting GFP_NOFS to __GFP_NOLOCKDEP. And it would be really appreciated. Thanks. -- Michal Hocko SUSE Labs