RE: [PATCH RFC v7 10/12] megaraid_sas: switch fusion adapters to MQ

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Thu, Aug 06, 2020 at 08:07:38PM +0530, Kashyap Desai wrote:
> > > > Hi Ming -
> > > >
> > > > There is still some race which is not handled.  Take a case of IO
> > > > is not able to get budget and it has already marked <restarts>
flag.
> > > > <restarts> flag will be seen non-zero in completion path and
> > > > completion path will attempt h/w queue run. (But this particular
> > > > IO is still not in s/w queue.).
> > > > Attempt of running h/w queue from completion path will not flush
> > > > any IO since there is no IO in s/w queue.
> > >
> > > Then where is the IO to be submitted in case of running out of
budget?
> >
> > Typical race in your latest patch is - (Lets consider command A,B and
> > C) Command A did not receive budget. Command B completed  (which was
> > already
>
> Command A doesn't get budget, and A is still in sw/scheduler queue
because
> we try to acquire budget before dequeuing request from sw/scheduler
queue,
> see __blk_mq_do_dispatch_sched() and blk_mq_do_dispatch_ctx().
>
> Not consider direct issue, because the hw queue will be run explicitly
when
> not getting budget, see __blk_mq_try_issue_directly.
>
> Not consider command A being added to hctx->dispatch too, because blk-mq
> will re-run the queue, see blk_mq_dispatch_rq_list().

Ming -

After going through your comment (I noted your comment and thanks for
correcting my understanding.) and block layer code, I realize that it is a
different race condition. My previous explanation was not accurate.
I debug further and figure out what is actually happening - Consider below
scenario/sequence -

Thread -1 - Detected budget contention. Set restarts = 1.
Thread -2 - old restarts = 1. start hw queue.
Thread -3 - old restarts = 1. start hw queue.
Thread -2 - move restarts = 0.
In my testing, I noticed that both thread-2 and thread-3 started h/w queue
but there was no work for them to do. It is possible because some other
context of h/w queue run might have done that job.
It means, IO of thread-1 is already posted.
Thread -4 - Detected budget contention. Set restart = 1 (because thread-2
has move restarts=0).
Thread -3 - move restarts = 0 (because this thread see old value = 1 but
that is actually updated one more time by thread-4 and theread-4 actually
wanted to run h/w queues). IO of Thread-4 will not be scheduled.

We have to make sure that completion IO path do atomic_cmpxchng of
restarts flag before running the h/w queue.  Below code change - (main fix
is sequence of atomic_cmpxchg and blk_mq_run_hw_queues) fix the issue.

--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -594,8 +594,27 @@ static bool scsi_end_request(struct request *req,
blk_status_t error,
        if (scsi_target(sdev)->single_lun ||
            !list_empty(&sdev->host->starved_list))
                kblockd_schedule_work(&sdev->requeue_work);
-       else
-               blk_mq_run_hw_queues(q, true);
+       else {
+               /*
+                * smp_mb() implied in either rq->end_io or
blk_mq_free_request
+                * is for ordering writing .device_busy in
scsi_device_unbusy()
+                * and reading sdev->restarts.
+                */
+               int old = atomic_read(&sdev->restarts);
+
+               if (old) {
+                       /*
+                        * ->restarts has to be kept as non-zero if there
is
+                        *  new budget contention comes.
+                        */
+                       atomic_cmpxchg(&sdev->restarts, old, 0);
+
+                       /* run the queue after restarts flag is updated
+                        * to avoid race condition with .get_budget
+                        */
+                       blk_mq_run_hw_queues(sdev->request_queue, true);
+               }
+       }

        percpu_ref_put(&q->q_usage_counter);
        return false;

Kashyap



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux