On Fri, Nov 24, 2023 at 3:18 PM <junxiao.bi@xxxxxxxxxx> wrote:
>
> On 11/24/23 9:29 AM, Song Liu wrote:
>
> > On Wed, Nov 8, 2023 at 10:22 AM Junxiao Bi <junxiao.bi@xxxxxxxxxx> wrote:
> >> This reverts commit 5e2cf333b7bd5d3e62595a44d598a254c697cd74.
> >>
> >> That commit introduced the following race and can cause system hung.
> >>
> >>  md_write_start:             raid5d:
> >>  // mddev->in_sync == 1
> >>  set "MD_SB_CHANGE_PENDING"
> >>                              // running before md_write_start wakeup it
> >>                              waiting "MD_SB_CHANGE_PENDING" cleared
> >>                              >>>>>>>>> hung
> >>  wakeup mddev->thread
> >>  ...
> >>  waiting "MD_SB_CHANGE_PENDING" cleared
> >>  >>>> hung, raid5d should clear this flag
> >>  but get hung by same flag.
> >>
> >> The issue reverted commit fixing is fixed by last patch in a new way.
> >>
> >> Fixes: 5e2cf333b7bd ("md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d")
> >> Signed-off-by: Junxiao Bi <junxiao.bi@xxxxxxxxxx>
> >
> > The set looks good to me. Thanks!
>
> Thanks for the review.
>
> > Quick question: from the earlier thread, the issue was observed in
> > production. Have you reproduced the issue and thus verified the fix
> > works as expected?
>
> I didn't try reproducing this since the system hung on the code where
> the bad commit added, after revert it, this issue will not reproduce any
> more.

Thanks for the information.

I applied the set to md-next.

Song