On 2021/02/15 21:45, Jan Kara wrote:
> On Sat 13-02-21 23:26:37, Tetsuo Handa wrote:
>> Excuse me, but it seems to me that nothing prevents
>> ext4_xattr_set_handle() from reaching ext4_xattr_inode_lookup_create()
>> without memalloc_nofs_save() when hitting ext4_get_nojournal() path.
>> Will you explain when ext4_get_nojournal() path is executed?
>
> That's a good question but sadly I don't think that's it.
> ext4_get_nojournal() is called when the filesystem is created without a
> journal. In that case we also don't acquire jbd2_handle lockdep map. In the
> syzbot report we can see:

Since syzbot can test filesystem images, syzbot might have tested filesystem
images created both with and without a journal within this boot.

>
> kswapd0/2246 is trying to acquire lock:
> ffff888041a988e0 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf81/0x1380 fs/jbd2/transaction.c:444
>
> but task is already holding lock:
> ffffffff8be892c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195
>
> So this filesystem has very clearly been created with a journal. Also the
> journal lockdep tracking machinery uses:

While the locks held by kswapd0/2246 are fs_reclaim, shrinker_rwsem,
&type->s_umount_key#38 and jbd2_handle, isn't the dependency that lockdep
considers problematic

  Chain exists of:
    jbd2_handle --> &ei->xattr_sem --> fs_reclaim

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(fs_reclaim);
                                 lock(&ei->xattr_sem);
                                 lock(fs_reclaim);
    lock(jbd2_handle);

where CPU0 is kswapd0/2246 and CPU1 is the case of ext4_get_nojournal() path?
If someone has taken jbd2_handle and &ei->xattr_sem in this order, isn't this
dependency true?

>
> rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_);
>
> so a lockdep key is per-filesystem. Thus it is not possible that lockdep
> would combine lock dependencies from two different filesystems.
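
To make the chain in question concrete, here is a minimal userspace sketch of
how a lock-order graph closes a cycle. It is only a toy model and not how
lockdep is implemented: the enum, struct and function names are all
hypothetical, each lock class is a single node, and per-filesystem keys are
deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock classes named after the ones in the report (illustration only). */
enum lock_class { FS_RECLAIM, XATTR_SEM, JBD2_HANDLE, NR_CLASSES };

/* dep[a][b] is true when some task acquired b while already holding a. */
struct lock_graph {
	bool dep[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record the ordering "held --> next". */
static void record_acquire(struct lock_graph *g, enum lock_class held,
			   enum lock_class next)
{
	assert(held < NR_CLASSES && next < NR_CLASSES);
	g->dep[held][next] = true;
}

/* Depth-first search: is 'to' reachable from 'from' via recorded orderings? */
static bool reaches(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < NR_CLASSES; n++)
		if (g->dep[from][n] && !seen[n] && reaches(g, n, to, seen))
			return true;
	return false;
}

/*
 * Acquiring 'next' while holding 'held' closes a cycle when 'held' is
 * already reachable from 'next' through previously recorded orderings.
 */
static bool would_deadlock(const struct lock_graph *g, enum lock_class held,
			   enum lock_class next)
{
	bool seen[NR_CLASSES] = { false };

	return reaches(g, next, held, seen);
}
```

After recording jbd2_handle --> xattr_sem and xattr_sem --> fs_reclaim, asking
would_deadlock(&g, FS_RECLAIM, JBD2_HANDLE) returns true in this model, which
corresponds to the CPU0 column of the scenario above.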
>
> But I guess we could narrow the search for this problem by adding WARN_ONs
> to ext4_xattr_set_handle() and ext4_xattr_inode_lookup_create() like:
>
> WARN_ON(ext4_handle_valid(handle) && !(current->flags & PF_MEMALLOC_NOFS));
>
> It would narrow down a place in which PF_MEMALLOC_NOFS flag isn't set
> properly... At least that seems like the most plausible way forward to me.

You can use CONFIG_DEBUG_AID_FOR_SYZBOT for adding such WARN_ONs on linux-next.
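
As a rough illustration of what the proposed WARN_ON condition tests, here is
a userspace mock of the predicate. Everything prefixed with mock_ or MOCK_ is
a hypothetical stand-in: the flag bit is a placeholder rather than the value
from the kernel headers, and the handle validity test is reduced to a boolean
instead of the real ext4_handle_valid() logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit; NOT the real PF_MEMALLOC_NOFS value from the kernel. */
#define MOCK_PF_MEMALLOC_NOFS (1u << 0)

/* Stand-in for the 'flags' field of the current task. */
struct mock_task {
	unsigned int flags;
};

/* Stand-in for a handle; 'journaled' models a journal-backed handle. */
struct mock_handle {
	bool journaled;
};

/* Simplified model of ext4_handle_valid(): false on the nojournal path. */
static bool mock_handle_valid(const struct mock_handle *h)
{
	return h != NULL && h->journaled;
}

/*
 * Returns true when the proposed condition would fire: a journal-backed
 * handle is in use but the task has not entered NOFS allocation scope.
 */
static bool nofs_check_would_warn(const struct mock_handle *h,
				  const struct mock_task *t)
{
	assert(t != NULL);
	return mock_handle_valid(h) && !(t->flags & MOCK_PF_MEMALLOC_NOFS);
}
```

In this model the check stays silent on the nojournal path regardless of the
task flags, so a warning would point only at journal-backed callers that
reached the xattr code without the NOFS scope set.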