On Sun, Mar 31, 2019 at 07:39:17PM -0700, Bart Van Assche wrote:
> On 3/31/19 7:00 PM, Ming Lei wrote:
> > On Sun, Mar 31, 2019 at 08:27:35AM -0700, Bart Van Assche wrote:
> > > I'm not sure the approach of this patch series is really the direction we
> > > should pursue. There are many block driver that free resources immediately
> >
> > Please see scsi_run_queue(), and the queue refcount is always held
> > before run queue.
>
> That's not correct. There is no guarantee that q->q_usage_counter > 0 when
> scsi_run_queue() is called from inside scsi_requeue_run_queue().

We don't need the guarantee of 'q->q_usage_counter > 0', I mean the
queue's kobj reference counter.

What we need is to allow run queue to work correctly after queue is frozen
or cleaned up.

>
> > > I'd like to avoid having to modify all block drivers that free resources
> > > immediately after blk_cleanup_queue() has returned. Have you considered to
> > > modify blk_mq_run_hw_queues() such that it becomes safe to call that
> > > function while blk_cleanup_queue() is in progress, e.g. by inserting a
> > > percpu_ref_tryget_live(&q->q_usage_counter) /
> > > percpu_ref_put(&q->q_usage_counter) pair?
> >
> > It can't work because blk_mq_run_hw_queues may happen after
> > percpu_ref_exit() is done.
> >
> > However, if we move percpu_ref_exit() into queue's release handler, we
> > don't need to grab q->q_usage_counter any more in blk_mq_run_hw_queues(),
> > and we still have to free hw queue resources in queue's release handler,
> > that is exactly what this patchset is doing.
> >
> > In short, getting q->q_usage_counter doesn't make a difference on this
> > issue.
>
> percpu_ref_tryget_live() fails if a per-cpu counter is in the "dead" state.
> percpu_ref_kill() changes the state of a per-cpu counter to the "dead"
> state. blk_freeze_queue_start() calls percpu_ref_kill(). blk_cleanup_queue()
> already calls blk_set_queue_dying() and that last function calls
> blk_freeze_queue_start(). So I think that what you wrote is not correct and
> that inserting a percpu_ref_tryget_live()/percpu_ref_put() pair in
> blk_mq_run_hw_queues() or blk_mq_run_hw_queue() would make a difference and
> also that moving the percpu_ref_exit() call into blk_release_queue() makes
> sense.

If percpu_ref_exit() is moved to blk_release_queue(), we still need to
move freeing of hw queue's resource into blk_release_queue() like what
the patchset is doing.

Then we don't need to get/put q_usage_counter in blk_mq_run_hw_queues()
any more, do we?

Thanks,
Ming
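
For readers following the thread, the ordering being argued about can be
sketched with a small userspace toy model. This is not kernel code and not
the percpu_ref implementation: every toy_* name is hypothetical, the counter
is a plain long rather than a per-cpu counter, and there is no concurrency.
It only assumes what the thread states: a tryget_live fails once the ref has
been killed, and hw queue resources are freed in the release handler instead
of at cleanup time, so a late run-queue call never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for q->q_usage_counter (hypothetical, single-threaded). */
struct toy_ref {
    long count; /* outstanding references */
    bool dead;  /* set once the ref has been killed */
};

/* Toy stand-in for a request queue (hypothetical). */
struct toy_queue {
    struct toy_ref usage; /* models q->q_usage_counter */
    int *hw_res;          /* models hw queue resources */
    int runs;             /* how many times the queue really ran */
};

/* Models the described behaviour: fails once the ref is "dead". */
static bool toy_ref_tryget_live(struct toy_ref *ref)
{
    if (ref->dead)
        return false;
    ref->count++;
    return true;
}

static void toy_ref_put(struct toy_ref *ref)
{
    ref->count--;
}

/* Models killing the ref when the queue starts to freeze. */
static void toy_ref_kill(struct toy_ref *ref)
{
    ref->dead = true;
}

/* Run the queue only while a live reference can be taken. */
static bool toy_run_hw_queues(struct toy_queue *q)
{
    if (!toy_ref_tryget_live(&q->usage))
        return false;   /* queue is dying: skip, touch nothing */
    q->hw_res[0]++;     /* safe: resources still allocated */
    q->runs++;
    toy_ref_put(&q->usage);
    return true;
}

/* Cleanup only marks the queue dead; it frees nothing. */
static void toy_cleanup_queue(struct toy_queue *q)
{
    toy_ref_kill(&q->usage);
}

/* Release handler: the only place resources are freed. */
static void toy_release_queue(struct toy_queue *q)
{
    assert(q->usage.count == 0); /* no references may remain */
    free(q->hw_res);
    q->hw_res = NULL;
}

/* Returns the number of successful runs, or -1 on allocation failure. */
static int toy_demo(void)
{
    struct toy_queue q = { .usage = { 0, false }, .hw_res = NULL, .runs = 0 };
    int runs;

    q.hw_res = calloc(1, sizeof(*q.hw_res));
    if (!q.hw_res)
        return -1;

    toy_run_hw_queues(&q);  /* live queue: runs */
    toy_cleanup_queue(&q);
    toy_run_hw_queues(&q);  /* after cleanup: skipped, no freed memory used */
    runs = q.runs;
    toy_release_queue(&q);
    return runs;
}
```

In this model the late toy_run_hw_queues() call is harmless for two separate
reasons: the ref state is still valid to inspect (it is torn down only at
release), and the resources it would touch have not been freed yet. Whether
the get/put pair is still needed once both of those hold is exactly the open
question in the thread; the sketch does not settle it.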