Hi Andrew and Vitaly,

I agree that patch ee8f7fcbe638 ("ocfs2/dlm: continue to purge recovery
lockres when recovery master goes down", 2016-08-02) introduced an issue.
It prevents DLM recovery from picking a new master for an existing lock
resource whose owner died seconds earlier.

However, the patch does solve another issue, so I think we should not
simply revert it but should fix it instead.

Thanks,
Changwei

On 2017/10/11 3:38, Andrew Morton wrote:
> On Tue, 10 Oct 2017 14:06:41 -0400 Vitaly Mayatskih <v.mayatskih@xxxxxxxxx> wrote:
>
>> * ocfs2-dlm-continue-to-purge-recovery-lockres-when-recovery-master-goes-down.patch
>>
>> This one completely broke two node cluster use case: when one node dies,
>> the other one either eventually crashes (~4.14-rc4) or locks up (pre-4.14).
>
> Are you sure?
>
> Are you able to confirm that reverting this patch (ee8f7fcbe638b07e8)
> and only this patch fixes up current mainline kernels?
>
> Are you able to supply more info on the crashes and lockups so that the
> ocfs2 developers can understand the failures?
>
> Thanks.