On 6/22/20 6:37 PM, Song Liu wrote:
>>>
>>> Thanks for the trace. Looks like we may have some issues with
>>> MD_SB_CHANGE_PENDING.
>>> Could you please try the attached patch?
>>
>> Should I run this along with pr_debugs from the previous patch enabled ?
>
> We don't need those pr_debug() here.
>
> Thanks,
> Song

So with this patch applied, there is no extra output whatsoever once it gets past this point:

[ +0.371752] r5c_recovery_rewrite_data_only_stripes rewritten 20001 stripes to the journal, current ctx->pos 408461384 ctx->seq 866603361
[ +0.395000] r5c_recovery_rewrite_data_only_stripes rewritten 21001 stripes to the journal, current ctx->pos 408479568 ctx->seq 866604361
[ +0.371255] r5c_recovery_rewrite_data_only_stripes rewritten 22001 stripes to the journal, current ctx->pos 408496600 ctx->seq 866605361
[ +0.401013] r5c_recovery_rewrite_data_only_stripes rewritten 23001 stripes to the journal, current ctx->pos 408515472 ctx->seq 866606361
[ +0.370543] r5c_recovery_rewrite_data_only_stripes rewritten 24001 stripes to the journal, current ctx->pos 408532112 ctx->seq 866607361
[ +0.319253] r5c_recovery_rewrite_data_only_stripes done
[ +0.061560] r5c_recovery_flush_data_only_stripes enter
[ +0.075697] r5c_recovery_flush_data_only_stripes before wait_event

That is, no output apart from the 'task <....> blocked for' traces, unless the pr_debug()s are enabled. There were a few 'md_write_start set MD_SB_CHANGE_PENDING' messages *before* that point (all of them likely related to another RAID array that is currently active, since they appeared during the lengthy r5c_recovery_flush_log() run).
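
For reference, my (possibly simplified) reading of the flush step in drivers/md/raid5-cache.c is roughly the following sketch, not the exact upstream function; the 'before wait_event' line above is printed just ahead of the wait, so that wait appears to be what never completes:

static void r5c_recovery_flush_data_only_stripes(struct r5l_log *log,
						 struct r5l_recovery_ctx *ctx)
{
	struct mddev *mddev = log->rdev->mddev;
	struct r5conf *conf = mddev->private;
	struct stripe_head *sh, *next;

	/* push every stripe cached during recovery out to the member disks */
	list_for_each_entry_safe(sh, next, &ctx->cached_list, lru) {
		r5c_make_stripe_write_out(sh);
		set_bit(STRIPE_HANDLE, &sh->state);
		list_del_init(&sh->lru);
		raid5_release_stripe(sh);
	}

	/*
	 * Block until all active stripes have been written out. If stripe
	 * handling is itself stalled on a superblock update (i.e.
	 * MD_SB_CHANGE_PENDING never clears), this wait would never return,
	 * which would match the hang seen here.
	 */
	wait_event(conf->wait_for_quiescent,
		   atomic_read(&conf->active_stripes) == 0);
}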