Re: [PATCH 1/8] nvme-rdma: quiesce/unquiesce admin_q instead of start/stop its hw queues

Sagi Grimberg <sagi@xxxxxxxxxxx> · Tue, 4 Jul 2017 12:07:38 +0300

@@ -791,7 +791,8 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
       * queues are not a live anymore, so restart the queues to fail fast
       * new IO
       */
-    blk_mq_start_stopped_hw_queues(ctrl->ctrl.admin_q, true);
+    blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
+    blk_mq_kick_requeue_list(ctrl->ctrl.admin_q);

Now the queue won't be stopped via blk_mq_quiesce_queue(), so why do
you add blk_mq_kick_requeue_list() here?

I think you're right.

We now quiesce the queue and fast-fail inflight I/O. In
nvme_complete_rq we call blk_mq_requeue_request with
!blk_mq_queue_stopped(req->q), which is now true because the
queue is quiesced rather than stopped.

So the requeue_work is triggered and requeues the request,
and when we unquiesce we simply run the hw queues again.

If we were to call it with !blk_queue_quiesced(req->q) instead,
I think the kick would be needed though...

If you look at nvme_start_queues, it also kicks the requeue
work. I think that the proper fix for this is to _keep_ the
requeue kick and in nvme_complete_rq call:

blk_mq_requeue_request(req, !blk_queue_quiesced(req->q));

Thoughts?
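
To make the two cases above concrete, here is a toy userspace model of the
kick decision. It is only a sketch, not kernel code: every toy_* name is
hypothetical, and the queue is reduced to two flags and two counters on the
assumption that "kicking" just moves parked requests off the requeue list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. None of these names exist in the kernel. */
struct toy_queue {
	bool stopped;      /* models "hw queues stopped" */
	bool quiesced;     /* models "queue quiesced" */
	int requeue_list;  /* requests parked on the requeue list */
	int dispatch;      /* requests moved off the requeue list */
};

/* Models a requeue-list kick: move every parked request to dispatch. */
static void toy_kick_requeue_list(struct toy_queue *q)
{
	assert(q != NULL);
	q->dispatch += q->requeue_list;
	q->requeue_list = 0;
}

/* Models requeueing one request, kicking right away if asked to. */
static void toy_requeue_request(struct toy_queue *q, bool kick)
{
	assert(q != NULL);
	q->requeue_list++;
	if (kick)
		toy_kick_requeue_list(q);
}

/* Models unquiesce: clears the flag, does not touch the requeue list. */
static void toy_unquiesce(struct toy_queue *q)
{
	assert(q != NULL);
	q->quiesced = false;
}

/*
 * Fail one request on a quiesced (not stopped) queue, then unquiesce
 * WITHOUT an explicit kick. Returns how many requests are still parked.
 */
static int toy_parked_after_unquiesce(bool kick_on_not_quiesced)
{
	struct toy_queue q = { .stopped = false, .quiesced = true };
	bool kick = kick_on_not_quiesced ? !q.quiesced : !q.stopped;

	toy_requeue_request(&q, kick);
	toy_unquiesce(&q);
	return q.requeue_list;
}
```

In this model, keying the kick on "not stopped" leaves nothing parked after
unquiesce, while keying it on "not quiesced" leaves the request parked until
someone kicks the requeue list explicitly, which is the case for keeping the
kick next to the unquiesce.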