On Tue, Jun 23, 2020 at 6:17 AM Michal Soltys <msoltyspl@xxxxxxxxx> wrote:
>
> On 6/22/20 6:37 PM, Song Liu wrote:
> >>>
> >>> Thanks for the trace. Looks like we may have some issues with
> >>> MD_SB_CHANGE_PENDING.
> >>> Could you please try the attached patch?
> >>
> >> Should I run this along with pr_debugs from the previous patch enabled ?
> >
> > We don't need those pr_debug() here.
> >
> > Thanks,
> > Song
> >
>
> So with this patch attached, there is no extra output whatsoever - once it finished getting past this point:
>
> [ +0.371752] r5c_recovery_rewrite_data_only_stripes rewritten 20001 stripes to the journal, current ctx->pos 408461384 ctx->seq 866603361
> [ +0.395000] r5c_recovery_rewrite_data_only_stripes rewritten 21001 stripes to the journal, current ctx->pos 408479568 ctx->seq 866604361
> [ +0.371255] r5c_recovery_rewrite_data_only_stripes rewritten 22001 stripes to the journal, current ctx->pos 408496600 ctx->seq 866605361
> [ +0.401013] r5c_recovery_rewrite_data_only_stripes rewritten 23001 stripes to the journal, current ctx->pos 408515472 ctx->seq 866606361
> [ +0.370543] r5c_recovery_rewrite_data_only_stripes rewritten 24001 stripes to the journal, current ctx->pos 408532112 ctx->seq 866607361
> [ +0.319253] r5c_recovery_rewrite_data_only_stripes done
> [ +0.061560] r5c_recovery_flush_data_only_stripes enter
> [ +0.075697] r5c_recovery_flush_data_only_stripes before wait_event
>
> That is, besides 'task <....> blocked for' traces or unless pr_debug()s were enabled.
>
> There were a few 'md_write_start set MD_SB_CHANGE_PENDING' *before* that (all of them likely related to another raid that is active at the moment, as these were happening during that lengthy r5c_recovery_flush_log() process).

Hmm.. this is weird, as I think I marked every instance of set_bit MD_SB_CHANGE_PENDING.
Would you mind confirming that those are for the other array, with something like:

diff --git i/drivers/md/md.c w/drivers/md/md.c
index dbbc8a50e2ed2..e91acfdcec032 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -8480,7 +8480,7 @@ bool md_write_start(struct mddev *mddev, struct bio *bi)
 		mddev->in_sync = 0;
 		set_bit(MD_SB_CHANGE_CLEAN, &mddev->sb_flags);
 		set_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags);
-		pr_info("%s set MD_SB_CHANGE_PENDING\n", __func__);
+		pr_info("%s: md: %s set MD_SB_CHANGE_PENDING\n", __func__, mdname(mddev));
 		md_wakeup_thread(mddev->thread);
 		did_change = 1;
 	}

Thanks,
Song