The patch titled
     Subject: ocfs2: initialize ip_next_orphan
has been added to the -mm tree.  Its filename is
     ocfs2-initialize-ip_next_orphan.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/ocfs2-initialize-ip_next_orphan.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-initialize-ip_next_orphan.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wengang Wang <wen.gang.wang@xxxxxxxxxx>
Subject: ocfs2: initialize ip_next_orphan

Though the problem was found on an older 4.1.12 kernel, I think upstream
has the same issue.

On one node in the cluster, there is the following call trace:

# cat /proc/21473/stack
[<ffffffffc09a2f06>] __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2]
[<ffffffffc09a4481>] ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2]
[<ffffffffc09b2ce2>] ocfs2_evict_inode+0x152/0x820 [ocfs2]
[<ffffffff8122b36e>] evict+0xae/0x1a0
[<ffffffff8122bd26>] iput+0x1c6/0x230
[<ffffffffc09b60ed>] ocfs2_orphan_filldir+0x5d/0x100 [ocfs2]
[<ffffffffc0992ae0>] ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2]
[<ffffffffc099a1e9>] ocfs2_dir_foreach+0x29/0x30 [ocfs2]
[<ffffffffc09b7716>] ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2]
[<ffffffffc09b9b4e>] ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2]
[<ffffffff810a1399>] process_one_work+0x169/0x4a0
[<ffffffff810a1bcb>] worker_thread+0x5b/0x560
[<ffffffff810a7a2b>] kthread+0xcb/0xf0
[<ffffffff816f5d21>] ret_from_fork+0x61/0x90
[<ffffffffffffffff>] 0xffffffffffffffff

The above stack is not reasonable: the final iput() shouldn't happen in
ocfs2_orphan_filldir().

Looking at the code,

2067         /* Skip inodes which are already added to recover list, since dio may
2068          * happen concurrently with unlink/rename */
2069         if (OCFS2_I(iter)->ip_next_orphan) {
2070                 iput(iter);
2071                 return 0;
2072         }
2073

The logic assumes the inode is already in the recover list when it sees
that ip_next_orphan is non-NULL, so it skips the inode after dropping the
reference that was taken in ocfs2_iget().

However, if the inode were already in the recover list, it would hold
another reference, and the iput() at line 2070 would not be the final
iput (dropping the last reference).  So I don't think the inode is really
in the recover list (no vmcore to confirm).

Note that ocfs2_queue_orphans(), though it does not show up in the call
trace, holds the cluster lock on the orphan directory while looking up
unlinked inodes.  Evicting the on-disk inode can involve a lot of I/O,
which may take a long time to finish.  That means this node could hold
the cluster lock for a very long time, which can cause lock requests from
other nodes on the orphan directory to hang for a long time.

Looking further at ip_next_orphan, I found that it is not initialized
when a new ocfs2_inode_info structure is allocated.

Fix: initialize ip_next_orphan to NULL.

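
The reasoning above can be illustrated with a small user-space sketch.  It
is not the ocfs2 code: struct toy_inode_info, the toy_* helpers, the
refcount field and the 0x5a poison byte are all hypothetical stand-ins,
and the list handling is simplified.  It only models how a constructor
that leaves ip_next_orphan untouched can make the "already queued" check
fire on an inode that was never queued, so that the iput() in the skip
path drops the only reference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct ocfs2_inode_info; only the fields
 * needed for the sketch are modelled. */
struct toy_inode_info {
	unsigned long long ip_blkno;
	unsigned int ip_clusters;
	struct toy_inode_info *ip_next_orphan;
	int refcount;	/* assumption: models the VFS inode reference count */
};

/* Emulate an allocator that hands out memory with stale, non-zero content. */
static void toy_poison(struct toy_inode_info *oi)
{
	memset(oi, 0x5a, sizeof(*oi));
}

/* Constructor as it behaved before the patch: ip_next_orphan is left alone. */
static void toy_init_once_buggy(struct toy_inode_info *oi)
{
	oi->ip_blkno = 0ULL;
	oi->ip_clusters = 0;
}

/* Constructor with the one-line fix applied. */
static void toy_init_once_fixed(struct toy_inode_info *oi)
{
	oi->ip_blkno = 0ULL;
	oi->ip_clusters = 0;
	oi->ip_next_orphan = NULL;
}

/* Simplified model of the check quoted above.
 * Returns 1 if the inode was skipped, 0 if it was queued on *head. */
static int toy_orphan_filldir(struct toy_inode_info *oi,
			      struct toy_inode_info **head)
{
	assert(oi != NULL);
	oi->refcount = 1;	/* reference taken by the lookup */

	if (oi->ip_next_orphan) {
		oi->refcount--;	/* models iput(); here it is the final one */
		return 1;
	}

	oi->ip_next_orphan = *head;
	*head = oi;
	return 0;
}
```

With the buggy constructor, a poisoned object is skipped and its reference
count drops to zero even though the recover list is empty; with the fixed
constructor, the same object is queued and keeps its reference.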
Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@xxxxxxxxxx
Signed-off-by: Wengang Wang <wen.gang.wang@xxxxxxxxxx>
Reviewed-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx>
Cc: Mark Fasheh <mark@xxxxxxxxxx>
Cc: Joel Becker <jlbec@xxxxxxxxxxxx>
Cc: Junxiao Bi <junxiao.bi@xxxxxxxxxx>
Cc: Changwei Ge <gechangwei@xxxxxxx>
Cc: Gang He <ghe@xxxxxxxx>
Cc: Jun Piao <piaojun@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/ocfs2/super.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/ocfs2/super.c~ocfs2-initialize-ip_next_orphan
+++ a/fs/ocfs2/super.c
@@ -1713,6 +1713,7 @@ static void ocfs2_inode_init_once(void *
 
 	oi->ip_blkno = 0ULL;
 	oi->ip_clusters = 0;
+	oi->ip_next_orphan = NULL;
 
 	ocfs2_resv_init_once(&oi->ip_la_data_resv);
 
_

Patches currently in -mm which might be from wen.gang.wang@xxxxxxxxxx are

ocfs2-initialize-ip_next_orphan.patch