On 7/20/22 4:26 PM, Marion & Christophe JAILLET wrote: > > Le 20/07/2022 à 03:59, Joseph Qi a écrit : >> >> On 7/19/22 9:25 PM, Christophe JAILLET wrote: >>> Le 19/07/2022 à 12:24, David Laight a écrit : >>>> From: Christophe JAILLET >>>>> Sent: 19 July 2022 11:02 >>>>> >>>>> 'node_map_lock' is a spinlock only used to protect calls to set_bit(), >>>>> clear_bit() and test_bit(). >>>>> >>>>> {set|clear}_bit() are already atomic and don't need this extra spinlock. >>>>> test_bit() only reads the bitmap for a given bit. >>>>> >>>>> Remove this useless spinlock. >>>> It looks to me like the calling code is racy >>>> unless there is another lock in the callers. >>> The call chains are: >>> ocfs2_recover_orphans() >>> ocfs2_mark_recovering_orphan_dir() >>> spin_lock(&osb->osb_lock); <-- osb_lock spinlock >>> ocfs2_node_map_set_bit() <-- uses node_map_lock >>> ... >>> spin_unlock(&osb->osb_lock); >>> ... >>> ocfs2_clear_recovering_orphan_dir() >>> ocfs2_node_map_clear_bit() <-- uses node_map_lock >>> osb_lock is NOT taken >>> >>> >>> ocfs2_check_orphan_recovery_state() >>> spin_lock(&osb->osb_lock); <-- osb_lock spinlock >>> ... >>> ocfs2_node_map_test_bit() <-- uses node_map_lock >>> ... >>> spin_unlock(&osb->osb_lock); >>> >>> >>> So the code looks already protected by the 'osb_lock' spinlock, but I don't know this code and ocfs2_mark_recovering_orphan_dir() looks tricky to me. (so some other eyes are much welcome) >> osb_lock is to protect osb filed such as 'osb_orphan_wipes', while >> node_map_lock is to protect the node map 'osb_recovering_orphan_dirs' >> specifically. > > Thanks for this explanation. > > But does "node_map_lock" really protects anything? > It is just around some atomic function calls which shouldn't need any, right? > > test_bit() is not documented as atomic, but {clear|set}_bit() could be executed just before or just after it with the current locking mechanism, so I don't really see how it would make a difference. > > I don't understand the logic of this lock here. > > Can you elaborate? These code are introduced long time ago... Refer to commit b4df6ed8db0c "[PATCH] ocfs2: fix orphan recovery deadlock", I guess it plays a role 'barrier' and make sure test node map is executed prior than signal orphan recovery thread. In other words, to serialize evict inode and orphan recovery. Thanks, Joseph