On Sun, Nov 8, 2020 at 6:02 PM heming.zhao@xxxxxxxx
<heming.zhao@xxxxxxxx> wrote:
>
> Please note, I gave two solutions for this bug in the cover letter.
> This patch uses solution 2. For details, please check the cover letter.
>
> Thank you.
>
[...]
> >
> > How to fix:
> >
> > There are two sides to fix (or break the dead loop):
> > 1. On the sending side, modify lock_comm so that it returns
> >    success/failure.
> >    This makes the mdadm command return an error when lock_comm
> >    times out.
> > 2. On the receiving side, process_metadata_update needs to add error
> >    handling.
> >    Currently, the other msg types either won't trigger an error or
> >    don't need to return the error to the sender, so only
> >    process_metadata_update needs to be modified.
> >
> > Either 1 or 2 can fix the hanging issue, but I prefer to fix both sides.

Similar comments on how to make the commit log easy to understand.
Besides that, please split the change into two commits, for fix #1 and
fix #2 respectively.

Thanks,
Song