Re: [PATCH V2 1/1] Add mddev->io_acct_cnt for raid0_quiesce

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 18, 2022 at 10:37 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Thu, Nov 17, 2022 at 5:39 PM Xiao Ni <xni@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 18, 2022 at 3:57 AM Song Liu <song@xxxxxxxxxx> wrote:
> > >
> > > Hi Xiao,
> > >
> > > Thanks for the results.
> > >
> > > On Wed, Nov 16, 2022 at 6:03 PM Xiao Ni <xni@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Song
> > > >
> > > > The performance is good.  Please check the result below.
> > > >
> > > > And for the patch itself, do you think we should add a smp_mb
> > > > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > > > index 4d0139cae8b5..3696e3825e27 100644
> > > > --- a/drivers/md/md.c
> > > > +++ b/drivers/md/md.c
> > > > @@ -8650,9 +8650,11 @@ static void md_end_io_acct(struct bio *bio)
> > > >         bio_put(bio);
> > > >         bio_endio(orig_bio);
> > > >
> > > > -       if (atomic_dec_and_test(&mddev->io_acct_cnt))
> > > > +       if (atomic_dec_and_test(&mddev->io_acct_cnt)) {
> > > > +               smp_mb();
> > > >                 if (unlikely(test_bit(MD_QUIESCE, &mddev->flags)))
> > > >                         wake_up(&mddev->wait_io_acct);
> > > > +       }
> > > >  }
> > > >
> > > >  /*
> > > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> > > > index 9d4831ca802c..1818f79bfdf7 100644
> > > > --- a/drivers/md/raid0.c
> > > > +++ b/drivers/md/raid0.c
> > > > @@ -757,6 +757,7 @@ static void raid0_quiesce(struct mddev *mddev, int quiesce)
> > > >          * to member disks to avoid memory alloc and performance decrease
> > > >          */
> > > >         set_bit(MD_QUIESCE, &mddev->flags);
> > > > +       smp_mb();
> > > >         wait_event(mddev->wait_io_acct, !atomic_read(&mddev->io_acct_cnt));
> > > >         clear_bit(MD_QUIESCE, &mddev->flags);
> > > >  }
> > > >
> > > > Test result:
> > >
> > > I think there is some noise in the result?
> > >
> > > >
> > > >                           without patch    with patch
> > > > psync read          100MB/s           101MB/s         job:1 bs:4k
> > >
> > > For example, this is a small improvement, but
> > >
> > > >                            1015MB/s         1016MB/s       job:1 bs:128k
> > > >                            1359MB/s         1358MB/s       job:1 bs:256k
> > > >                            1394MB/s         1393MB/s       job:40 bs:4k
> > > >                            4959MB/s         4873MB/s       job:40 bs:128k
> > > >                            6166MB/s         6157MB/s       job:40 bs:256k
> > > >
> > > >                           without patch      with patch
> > > > psync write          286MB/s           275MB/s        job:1 bs:4k
> > >
> > > this is a big regression (~4%).
> > >
> > > >                             1810MB/s         1808MB/s      job:1 bs:128k
> > > >                             1814MB/s         1814MB/s      job:1 bs:256k
> > > >                             1802MB/s         1801MB/s      job:40 bs:4k
> > > >                             1814MB/s         1814MB/s      job:40 bs:128k
> > > >                             1814MB/s         1814MB/s      job:40 bs:256k
> > > >
> > > >                           without patch
> > > > psync randread    39.3MB/s           39.7MB/s      job:1 bs:4k
> > > >                              791MB/s            783MB/s       job:1 bs:128k
> > > >                             1183MiB/s          1217MB/s     job:1 bs:256k
> > > >                             1183MiB/s          1235MB/s     job:40 bs:4k
> > > >                             3768MB/s          3705MB/s     job:40 bs:128k
> > >
> > > And some regression for 128kB but improvement for 4kB.
> > >
> > > >                             4410MB/s           4418MB/s     job:40 bs:256k
> > >
> > > So I am not quite convinced by these results.
> >
> > Thanks for pointing out the problem. Maybe I need to do a precondition before
> > the testing.  I'll give the result again.
> > >
> > > Also, do we really need an extra counter here? Can we use
> > > mddev->active_io instead?
> >
> > At first, I thought of this way too. But active_io is decreased only
> > after pers->make_request.
> > So it can't be used to wait for all bios to return back.
>
> Ah, that's right.
>
> >
> > Can we decrease mddev->active_io in the bi_end_io of each personality?
> > Now only mddev_supsend
> > uses mddev->active_io. It needs to wait all active io to finish. From
> > this point, it should be better
> > to decrease acitve_io in bi_end_io. What's your opinion?
>
> I think we can give it a try. It may break some cases though. If mdadm tests
> behave the same with the change, we can give it a try.

What are the cases?

>
> OTOH, depend on the perf results, we may consider using percpu_ref for
> both active_io and io_acct_cnt.

Thanks. I'll have a try with it.

Regards
Xiao




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux