Re: [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request

Song Liu <song@xxxxxxxxxx> · Thu, 4 Feb 2021 00:12:15 -0800

On Wed, Feb 3, 2021 at 11:42 PM Xiao Ni <xni@xxxxxxxxxx> wrote:
>
> Hi Song
>
> Please ignore the v2 version. There is a place that needs to be fixed.
> I'll re-send the v2 version.

What did you change in the new v2? Removing "extern" in the header?
For small changes like this, I can just update it while applying the patches.
If we do need a resend (for bigger changes), it's better to call it v3.

I have been testing the first v2 for the past hour or so, and it looks good
so far. I will take a closer look tomorrow. On the other hand, we are getting
close to the 5.12 merge window, so it is a little too late for bigger changes
like this. Therefore, I would prefer to wait until 5.13. If you need it in
5.12 for some reason, please let me know.

Thanks,
Song

>
> Regards
> Xiao
>
> On 02/04/2021 01:57 PM, Xiao Ni wrote:
> > Hi all
> >
> > Currently, mkfs on a raid10 array built from SSD/NVMe disks takes a long
> > time. This patch set tries to resolve this problem.
> >
> > This patch set was reverted because of a data corruption problem. This
> > version fixes that problem. The root cause of the data corruption was a
> > wrong calculation of the start address on the near-copy disks.
> >
> > We now handle discard requests for raid10 in a way similar to raid0.
> > Because the discard region is very big, we can calculate the start/end
> > address for each disk and then submit the discard request to each disk.
> > But raid10 has copies. For the near layout, if the discard request is
> > not aligned to the chunk size, we calculate a start_disk_offset. The
> > reverted version used start_disk_offset only for the first disk, but it
> > must be used for the near-copy disks too.
> >
> > [  789.709501] discard bio start : 70968, size : 191176
> > [  789.709507] first stripe index 69, start disk index 0, start disk offset 70968
> > [  789.709509] last stripe index 256, end disk index 0, end disk offset 262144
> > [  789.709511] disk 0, dev start : 70968, dev end : 262144
> > [  789.709515] disk 1, dev start : 70656, dev end : 262144
> >
> > For example, this test case has 2 near copies. The start_disk_offset
> > for the first disk is 70968, and the second disk should use the same offset.
> > Instead it uses the start address of the chunk (70656), so it discards a
> > larger region than requested. This version simply splits off the unaligned
> > part by stripe size.
> >
> > It also fixes another problem: the calculation of stripe_size was wrong in
> > the reverted version.
> >
> > V2: Fix problems pointed out by Christoph Hellwig.
> >
> > Xiao Ni (5):
> >    md: add md_submit_discard_bio() for submitting discard bio
> >    md/raid10: extend r10bio devs to raid disks
> >    md/raid10: pull the code that wait for blocked dev into one function
> >    md/raid10: improve raid10 discard request
> >    md/raid10: improve discard request for far layout
> >
> >   drivers/md/md.c     |  20 +++
> >   drivers/md/md.h     |   2 +
> >   drivers/md/raid0.c  |  14 +-
> >   drivers/md/raid10.c | 434 +++++++++++++++++++++++++++++++++++++++++++++-------
> >   drivers/md/raid10.h |   1 +
> >   5 files changed, 402 insertions(+), 69 deletions(-)
> >
>
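
For readers following the quoted log, the start-offset arithmetic can be
reproduced with a small standalone sketch. It is not the code from the patch
set: discard_dev_range() and its "fixed" flag are hypothetical names, the
chunk size of 1024 sectors is inferred from the log (70656 / 69), and the
layout is simplified to two disks with two near copies, so each stripe holds
one chunk that is mirrored on both disks.

```c
#include <assert.h>

typedef unsigned long long sector_t;

/* Per-disk discard range, in sectors. */
struct dev_range {
	sector_t start;
	sector_t end;
};

/*
 * Hypothetical helper, simplified to raid_disks == near_copies, so one
 * stripe is one chunk and the device offset equals the array offset.
 *
 * fixed == 0 models the reverted behaviour described in the cover letter:
 * only disk 0 gets the unaligned start offset, the other near copy starts
 * at the beginning of the chunk.
 * fixed != 0 applies the same start offset to every near copy.
 */
static struct dev_range discard_dev_range(sector_t bio_start,
					  sector_t bio_sectors,
					  sector_t chunk_sectors,
					  int disk, int fixed)
{
	struct dev_range r;
	sector_t first_stripe, in_chunk;

	assert(chunk_sectors > 0);

	first_stripe = bio_start / chunk_sectors;	/* 69 in the log */
	in_chunk = bio_start % chunk_sectors;		/* unaligned remainder */

	r.start = first_stripe * chunk_sectors;		/* chunk boundary */
	if (disk == 0 || fixed)
		r.start += in_chunk;			/* keep the real offset */

	r.end = bio_start + bio_sectors;
	return r;
}
```

With the values from the log (start 70968, size 191176), both disks get the
range 70968..262144 when "fixed" is non-zero. With "fixed" set to zero, disk 1
starts at 70656, which matches the over-discard shown in the log.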