Re: [PATCHv3] md/raid1: Avoid lock contention from wake_up()


 



On Thu, Jul 6, 2023 at 8:14 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> 在 2023/07/05 19:32, Jack Wang 写道:
> > wake_up() is called unconditionally in a few paths such as make_request(),
> > which causes lock contention under high-concurrency workloads, for example:
> >      raid1_end_write_request
> >       wake_up
> >        __wake_up_common_lock
> >         spin_lock_irqsave
> >
> > Improve performance by only calling wake_up() if the waitqueue is not empty.
> >
> LGTM
>
> Reviewed-by: Yu Kuai <yukuai3@xxxxxxxxxx>
Kuai,
Thank you very much for the review and suggestions!

>
> > Fio test script:
> >
> > [global]
> > name=random reads and writes
> > ioengine=libaio
> > direct=1
> > readwrite=randrw
> > rwmixread=70
> > iodepth=64
> > buffered=0
> > filename=/dev/md0
> > size=1G
> > runtime=30
> > time_based
> > randrepeat=0
> > norandommap
> > refill_buffers
> > ramp_time=10
> > bs=4k
> > numjobs=400
> > group_reporting=1
> > [job1]
> >
> > Test result with 2 ramdisks in raid1 on an Intel Broadwell 56-core server.
> >
> >       Before this patch       With this patch
> >       READ    BW=4621MB/s     BW=7337MB/s
> >       WRITE   BW=1980MB/s     BW=3144MB/s
> >
> > The patch is inspired by Yu Kuai's change for raid10:
> > https://lore.kernel.org/r/20230621105728.1268542-1-yukuai1@xxxxxxxxxxxxxxx
> >
> > Cc: Yu Kuai <yukuai3@xxxxxxxxxx>
> > Signed-off-by: Jack Wang <jinpu.wang@xxxxxxxxx>
> > ---
> > v3: rephrase the commit message, no code change.
> >
> > v2: addressed comments from Kuai
> > * Removed a newline
> > * Changed the missing case in raid1_write_request
> > * I still kept the change for _wait_barrier and wait_read_barrier, as
> >   performance tests without them still show lock contention from
> >   __wake_up_common_lock
> >
> >   drivers/md/raid1.c | 18 ++++++++++++------
> >   1 file changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index f834d99a36f6..0c76c36d8cb1 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -789,11 +789,17 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
> >       return best_disk;
> >   }
> >
> > +static void wake_up_barrier(struct r1conf *conf)
> > +{
> > +     if (wq_has_sleeper(&conf->wait_barrier))
> > +             wake_up(&conf->wait_barrier);
> > +}
> > +
> >   static void flush_bio_list(struct r1conf *conf, struct bio *bio)
> >   {
> >       /* flush any pending bitmap writes to disk before proceeding w/ I/O */
> >       raid1_prepare_flush_writes(conf->mddev->bitmap);
> > -     wake_up(&conf->wait_barrier);
> > +     wake_up_barrier(conf);
> >
> >       while (bio) { /* submit pending writes */
> >               struct bio *next = bio->bi_next;
> > @@ -970,7 +976,7 @@ static bool _wait_barrier(struct r1conf *conf, int idx, bool nowait)
> >        * In case freeze_array() is waiting for
> >        * get_unqueued_pending() == extra
> >        */
> > -     wake_up(&conf->wait_barrier);
> > +     wake_up_barrier(conf);
> >       /* Wait for the barrier in same barrier unit bucket to drop. */
> >
> >       /* Return false when nowait flag is set */
> > @@ -1013,7 +1019,7 @@ static bool wait_read_barrier(struct r1conf *conf, sector_t sector_nr, bool nowa
> >        * In case freeze_array() is waiting for
> >        * get_unqueued_pending() == extra
> >        */
> > -     wake_up(&conf->wait_barrier);
> > +     wake_up_barrier(conf);
> >       /* Wait for array to be unfrozen */
> >
> >       /* Return false when nowait flag is set */
> > @@ -1042,7 +1048,7 @@ static bool wait_barrier(struct r1conf *conf, sector_t sector_nr, bool nowait)
> >   static void _allow_barrier(struct r1conf *conf, int idx)
> >   {
> >       atomic_dec(&conf->nr_pending[idx]);
> > -     wake_up(&conf->wait_barrier);
> > +     wake_up_barrier(conf);
> >   }
> >
> >   static void allow_barrier(struct r1conf *conf, sector_t sector_nr)
> > @@ -1171,7 +1177,7 @@ static void raid1_unplug(struct blk_plug_cb *cb, bool from_schedule)
> >               spin_lock_irq(&conf->device_lock);
> >               bio_list_merge(&conf->pending_bio_list, &plug->pending);
> >               spin_unlock_irq(&conf->device_lock);
> > -             wake_up(&conf->wait_barrier);
> > +             wake_up_barrier(conf);
> >               md_wakeup_thread(mddev->thread);
> >               kfree(plug);
> >               return;
> > @@ -1574,7 +1580,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> >       r1_bio_write_done(r1_bio);
> >
> >       /* In case raid1d snuck in to freeze_array */
> > -     wake_up(&conf->wait_barrier);
> > +     wake_up_barrier(conf);
> >   }
> >
> >   static bool raid1_make_request(struct mddev *mddev, struct bio *bio)
> >
>



