On Mon, Apr 01, 2019 at 05:19:01PM +0800, jianchao.wang wrote: > Hi Ming > > On 4/1/19 11:28 AM, Ming Lei wrote: > > On Mon, Apr 01, 2019 at 11:25:50AM +0800, jianchao.wang wrote: > >> Hi Ming > >> > >> On 4/1/19 10:52 AM, Ming Lei wrote: > >>>> percpu_ref_tryget_live() fails if a per-cpu counter is in the "dead" state. > >>>> percpu_ref_kill() changes the state of a per-cpu counter to the "dead" > >>>> state. blk_freeze_queue_start() calls percpu_ref_kill(). blk_cleanup_queue() > >>>> already calls blk_set_queue_dying() and that last function calls > >>>> blk_freeze_queue_start(). So I think that what you wrote is not correct and > >>>> that inserting a percpu_ref_tryget_live()/percpu_ref_put() pair in > >>>> blk_mq_run_hw_queues() or blk_mq_run_hw_queue() would make a difference and > >>>> also that moving the percpu_ref_exit() call into blk_release_queue() makes > >>>> sense. > >>> If percpu_ref_exit() is moved to blk_release_queue(), we still need to > >>> move freeing of hw queue's resource into blk_release_queue() like what > >>> the patchset is doing. > >>> > >>> Then we don't need to get/put q_usage_counter in blk_mq_run_hw_queues() any more, > >>> do we? > >> > >> IMO, if we could get a way to prevent any attempt to run queue, it would be > >> better and clearer. > > > > It is hard to do that way, and not necessary. > > > > I will post V2 soon for review. > > > > Put percpu_ref_tryget/put pair into blk_mq_run_hw_queues could stop run queue after > requet_queue is frozen and drained (run queue is also unnecessary because there is no > entered requests). And also percpu_ref_tryget could avoid the io hung issue you mentioned. > We have similar one in blk_mq_timeout_work. If percpu_ref_tryget() is used, percpu_ref_exit() has to be moved into queue's release handler. Then we still have to move freeing hctx's resource into hctx or queue's release handler, that is exactly what this patch is doing. Then percpu_ref_tryget() becomes unnecessary again, right? > > freeze and drain queue to stop new attempt to run queue, blk_sync_queue syncs and stops > the started ones, then hctx->run_queue is cleaned totally. > > IMO, it would be better to have a checkpoint after which there will be no any in-flight > asynchronous activities of the request_queue (hctx->run_work, q->requeue_work, q-> timeout_work) > and any attempt to start them will fail. All are canceled in blk_cleanup_queue(), but not enough, given queue can be run in sync mode(such as via plug, direct issue, ...), or driver's requeue, such as SCSI's requeue. SCSI's requeue may run other LUN's queue just by holding queue's kobject refcount. > > Perhaps, this will be a good change to do this ;) However, I don't see it is necessary if we simply move freeing hctx's resource into its release handler, just like V2. Thanks, Ming