Re: [PATCH 0/2] md: change approach to parallel flush requests.

Xiao Ni <xni@xxxxxxxxxx> · Wed, 27 Mar 2019 03:41:23 -0400 (EDT)

----- Original Message -----
> From: "NeilBrown" <neilb@xxxxxxxx>
> To: "Xiao Ni" <xni@xxxxxxxxxx>, "Ming Lei" <ming.lei@xxxxxxxxxx>
> Cc: linux-raid@xxxxxxxxxxxxxxx
> Sent: Wednesday, March 27, 2019 8:19:28 AM
> Subject: [PATCH 0/2] md: change approach to parallel flush requests.
> 
> Hi Xiao,
>  I was looking at the issue of parallel flush requests being slow as a
>  customer reported it against a kernel of ours that doesn't have your
>  patch.
> 
>  I reviewed you patch more carefully and noticed a problem.  It calls
>  submit_bio() multiple times from within a make_request_fn function.
>  Calling submit_bio once is perfectly safe.  Calling it twice requires
>  care and usually involves one of the call submitting part of the
>  original, and the other resubmitting the original with that part
>  removed.
>  Calling it more often is rarely safe.  The problem is that all these
>  bios get queued on current->biolist and don't even start being
>  processed until the top level make_request_fn completes.  This can
>  result in deadlocks.

Hi Neil

Thanks for pointing out this. 
A: The flush bio from file system is waiting at ret = q->make_request_fn(q, bio);  
B: Function md_flush_request submits flush bios to raid member disks. These bios 
will be added into bio_list_on_stack[0] and return. 
A: make_request_fn returns and handle those bios on bio_list_on_stack[0] by 
q->make_request_fn(q, bio); After handling those bios it returns. 

There is no deadlock, right?
> 
>  Also, with your patch there can be lots of flush requests all being
>  sent at much the same time, which all do essentially the same
>  thing.  This is fairly wasteful.
> 
>  I came up with a different approach.  I restored the old because of
>  flush requests being serialized, but now if a caller will not bother
>  sending a flush if some other caller submitted a flush request while
>  it waited.  This should result in many fewer flush requests, and must
>  less delay.

It can avoid sending flush bios which arrive at the same time. The one which
get the lock can do the flush job. It's really a good idea. But if most flush
bios are sent to md device in sequential, the mddev->last_flush is usually before
start. In this case there is still lock contention. 
Is it better to use this patch with my method together? 
> 
>  Would you be able to run you same test on this version to confirm
>  that it fixes the performance issue, and if so review the code to see
>  if you are happy with it.

I'll do a test with this patch. But it's very a simple environment. Can you ask
customer do the test too? 

Regards
Xiao