Re: [PATCH] nvme: fix hang in remove path

Hi Ming,

On Fri, Jun 02, 2017 at 04:32:08PM +0800, Ming Lei wrote:
> We need to start admin queues too in nvme_kill_queues()
> for avoiding hang in remove path[1].
> 
> This patch is very similar with 806f026f9b901eaf(nvme: use
> blk_mq_start_hw_queues() in nvme_kill_queues()).

It would make sense to still add:

if (ctrl->state == NVME_CTRL_DELETING || ctrl->state == NVME_CTRL_DEAD)
	return;

at the top of nvme_configure_apst() irrespective of this change.

That would ensure nvme_configure_apst() does no further work (and in
particular never sends any commands) once we already know the controller
is dead or in the process of being deleted, since there is no point
sending a latency tolerance update to it at that stage.
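For illustration, a minimal sketch of the placement I have in mind (the
rest of the function body is elided; only the early return is new):

static void nvme_configure_apst(struct nvme_ctrl *ctrl)
{
	/*
	 * Sketch only: bail out before doing any work if the controller
	 * is already dead or being torn down, so we never attempt the
	 * Set Features (latency tolerance) command on it.
	 */
	if (ctrl->state == NVME_CTRL_DELETING ||
	    ctrl->state == NVME_CTRL_DEAD)
		return;

	/* ... existing APST table setup and nvme_set_features() call ... */
}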

> 
> [1] hang stack trace
> [<ffffffff813c9716>] blk_execute_rq+0x56/0x80
> [<ffffffff815cb6e9>] __nvme_submit_sync_cmd+0x89/0xf0
> [<ffffffff815ce7be>] nvme_set_features+0x5e/0x90
> [<ffffffff815ce9f6>] nvme_configure_apst+0x166/0x200
> [<ffffffff815cef45>] nvme_set_latency_tolerance+0x35/0x50
> [<ffffffff8157bd11>] apply_constraint+0xb1/0xc0
> [<ffffffff8157cbb4>] dev_pm_qos_constraints_destroy+0xf4/0x1f0
> [<ffffffff8157b44a>] dpm_sysfs_remove+0x2a/0x60
> [<ffffffff8156d951>] device_del+0x101/0x320
> [<ffffffff8156db8a>] device_unregister+0x1a/0x60
> [<ffffffff8156dc4c>] device_destroy+0x3c/0x50
> [<ffffffff815cd295>] nvme_uninit_ctrl+0x45/0xa0
> [<ffffffff815d4858>] nvme_remove+0x78/0x110
> [<ffffffff81452b69>] pci_device_remove+0x39/0xb0
> [<ffffffff81572935>] device_release_driver_internal+0x155/0x210
> [<ffffffff81572a02>] device_release_driver+0x12/0x20
> [<ffffffff815d36fb>] nvme_remove_dead_ctrl_work+0x6b/0x70
> [<ffffffff810bf3bc>] process_one_work+0x18c/0x3a0
> [<ffffffff810bf61e>] worker_thread+0x4e/0x3b0
> [<ffffffff810c5ac9>] kthread+0x109/0x140
> [<ffffffff8185800c>] ret_from_fork+0x2c/0x40
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Fixes: c5552fde102fc("nvme: Enable autonomous power state transitions")
> Reported-by: Rakesh Pandit <rakesh@xxxxxxxxxx>
> Tested-by: Rakesh Pandit <rakesh@xxxxxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> ---
>  drivers/nvme/host/core.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index a60926410438..0f9cc0c55e15 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2438,6 +2438,10 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
>  	struct nvme_ns *ns;
>  
>  	mutex_lock(&ctrl->namespaces_mutex);
> +
> +	/* Forcibly start all queues to avoid having stuck requests */
> +	blk_mq_start_hw_queues(ctrl->admin_q);
> +
>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
>  		/*
>  		 * Revalidating a dead namespace sets capacity to 0. This will
> -- 
> 2.9.4
> 


