On 6/22/20 6:37 PM, Song Liu wrote:
>>>
>>> Thanks for the trace. Looks like we may have some issues with
>>> MD_SB_CHANGE_PENDING.
>>> Could you please try the attached patch?
>>
>> Should I run this along with pr_debugs from the previous patch enabled ?
>
> We don't need those pr_debug() here.
>
> Thanks,
> Song

So with this patch applied, there is no extra output whatsoever once it gets past this point:

[ +0.371752] r5c_recovery_rewrite_data_only_stripes rewritten 20001 stripes to the journal, current ctx->pos 408461384 ctx->seq 866603361
[ +0.395000] r5c_recovery_rewrite_data_only_stripes rewritten 21001 stripes to the journal, current ctx->pos 408479568 ctx->seq 866604361
[ +0.371255] r5c_recovery_rewrite_data_only_stripes rewritten 22001 stripes to the journal, current ctx->pos 408496600 ctx->seq 866605361
[ +0.401013] r5c_recovery_rewrite_data_only_stripes rewritten 23001 stripes to the journal, current ctx->pos 408515472 ctx->seq 866606361
[ +0.370543] r5c_recovery_rewrite_data_only_stripes rewritten 24001 stripes to the journal, current ctx->pos 408532112 ctx->seq 866607361
[ +0.319253] r5c_recovery_rewrite_data_only_stripes done
[ +0.061560] r5c_recovery_flush_data_only_stripes enter
[ +0.075697] r5c_recovery_flush_data_only_stripes before wait_event

That is, no output apart from the 'task <....> blocked for' traces, unless the pr_debug()s are enabled. There were a few 'md_write_start set MD_SB_CHANGE_PENDING' messages *before* that point (all of them likely related to another RAID array that is currently active, since they appeared during the lengthy r5c_recovery_flush_log() run).
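
For reference, my (possibly simplified) reading of the flush step in drivers/md/raid5-cache.c is roughly the following sketch, not the exact upstream function; the 'before wait_event' line above is printed just ahead of the wait, so that wait appears to be what never completes:

static void r5c_recovery_flush_data_only_stripes(struct r5l_log *log,
						 struct r5l_recovery_ctx *ctx)
{
	struct mddev *mddev = log->rdev->mddev;
	struct r5conf *conf = mddev->private;
	struct stripe_head *sh, *next;

	/* push every stripe cached during recovery out to the member disks */
	list_for_each_entry_safe(sh, next, &ctx->cached_list, lru) {
		r5c_make_stripe_write_out(sh);
		set_bit(STRIPE_HANDLE, &sh->state);
		list_del_init(&sh->lru);
		raid5_release_stripe(sh);
	}

	/*
	 * Block until all active stripes have been written out. If stripe
	 * handling is itself stalled on a superblock update (i.e.
	 * MD_SB_CHANGE_PENDING never clears), this wait would never return,
	 * which would match the hang seen here.
	 */
	wait_event(conf->wait_for_quiescent,
		   atomic_read(&conf->active_stripes) == 0);
}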