On Wed, 2017-04-12 at 16:41 +0200, Benjamin Block wrote:
> On Mon, Apr 10, 2017 at 10:54:01AM -0700, Bart Van Assche wrote:
> > [ ... ]
> OK, so I take it the problem is when the queue is stopped, then the
> completion in blk_execute_rq() will never be triggered and then we wait
> for a timeout there, or potentially forever?

Hello Benjamin,

Thanks for the review. If a request is queued after a queue has been
stopped then that request will never be started and hence even the timeout
timer won't be started. blk_cleanup_queue() hangs if invoked for a stopped
queue and one or more requests have not yet been started.

> But then what is the point in trying to do it async here anyway? Won't
> that just be doomed in the same way, just that we don't see the effect?

Have you noticed that patch 4/4 in this series restarts the queue just
before calling blk_cleanup_queue()?

Anyway, can you have a look at the patch below and see whether this new
version addresses all the concerns you had reported in your previous
e-mail?

Thanks,

Bart.


Subject: [PATCH] sd: Make synchronize cache upon shutdown asynchronous

This patch avoids that sd_shutdown() hangs on the SYNCHRONIZE CACHE
command if the block layer queue has been stopped by
scsi_target_block().
---
 drivers/scsi/sd.c | 45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index fe0f7997074e..deff564fe649 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1489,6 +1489,33 @@ static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
 	return retval;
 }
 
+static void sd_sync_cache_done(struct request *rq, int e)
+{
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+
+	sd_printk(KERN_DEBUG, sdkp, "%s\n", __func__);
+
+	blk_put_request(rq);
+}
+
+/*
+ * Issue a SYNCHRONIZE CACHE command asynchronously. Since blk_cleanup_queue()
+ * waits for all commands to finish, __scsi_remove_device() will wait for the
+ * SYNCHRONIZE CACHE command to finish.
+ */
+static int sd_sync_cache_async(struct scsi_disk *sdkp)
+{
+	const struct scsi_device *sdp = sdkp->device;
+	const int timeout = sdp->request_queue->rq_timeout *
+		SD_FLUSH_TIMEOUT_MULTIPLIER;
+	const unsigned char cmd[10] = { SYNCHRONIZE_CACHE };
+
+	sd_printk(KERN_DEBUG, sdkp, "%s\n", __func__);
+	return scsi_execute_async(sdp, sdkp->disk, cmd, DMA_NONE, NULL, 0,
+				  timeout, SD_MAX_RETRIES, 0, 0,
+				  sd_sync_cache_done);
+}
+
 static int sd_sync_cache(struct scsi_disk *sdkp)
 {
 	int retries, res;
@@ -3349,13 +3376,15 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
 }
 
 /*
- * Send a SYNCHRONIZE CACHE instruction down to the device through
- * the normal SCSI command structure.  Wait for the command to
- * complete.
+ * Send a SYNCHRONIZE CACHE instruction down to the device through the normal
+ * SCSI command structure. When stopping the disk, wait for the command to
+ * complete. When not stopping the disk, the blk_cleanup_queue() call in
+ * __scsi_remove_device() will wait for this command to complete.
  */
 static void sd_shutdown(struct device *dev)
 {
 	struct scsi_disk *sdkp = dev_get_drvdata(dev);
+	bool stop_disk;
 
 	if (!sdkp)
 		return;         /* this can happen */
@@ -3363,12 +3392,18 @@ static void sd_shutdown(struct device *dev)
 	if (pm_runtime_suspended(dev))
 		return;
 
+	stop_disk = system_state != SYSTEM_RESTART &&
+		sdkp->device->manage_start_stop;
+
 	if (sdkp->WCE && sdkp->media_present) {
 		sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
-		sd_sync_cache(sdkp);
+		if (stop_disk)
+			sd_sync_cache(sdkp);
+		else
+			sd_sync_cache_async(sdkp);
 	}
 
-	if (system_state != SYSTEM_RESTART && sdkp->device->manage_start_stop) {
+	if (stop_disk) {
 		sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
 		sd_start_stop_device(sdkp, 0);
 	}
-- 
2.12.2
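
For readers following the thread, the stopped-queue behaviour described in the reply can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_queue, struct toy_request and the toy_* helpers are hypothetical names invented for illustration and are not block layer or SCSI APIs. The model also treats a started request as finished immediately, which real hardware does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_QUEUE_DEPTH 8

struct toy_request {
	bool started;     /* request has been handed to the "driver" */
	bool timer_armed; /* timeout timer is armed only once started */
};

struct toy_queue {
	bool stopped;
	struct toy_request *pending[TOY_QUEUE_DEPTH];
	size_t npending;
};

/* Queue a request. Queuing alone never starts it. */
static bool toy_queue_insert(struct toy_queue *q, struct toy_request *rq)
{
	assert(q && rq);
	if (q->npending >= TOY_QUEUE_DEPTH)
		return false;
	rq->started = false;
	rq->timer_armed = false;
	q->pending[q->npending++] = rq;
	return true;
}

/*
 * Dispatch all pending requests unless the queue is stopped.
 * Returns the number of requests that were started.
 */
static size_t toy_queue_run(struct toy_queue *q)
{
	size_t i, n;

	assert(q);
	if (q->stopped)
		return 0; /* nothing starts, so no timer is armed either */

	n = q->npending;
	for (i = 0; i < n; i++) {
		q->pending[i]->started = true;
		q->pending[i]->timer_armed = true;
	}
	q->npending = 0;
	return n;
}

/*
 * A cleanup that waits for all requests can only finish once no
 * unstarted request is left on the queue.
 */
static bool toy_queue_can_drain(const struct toy_queue *q)
{
	assert(q);
	return q->npending == 0;
}
```

In this model, a request inserted while `stopped` is true stays pending with `timer_armed` false, so `toy_queue_can_drain()` keeps returning false until the queue is restarted and run again. That mirrors the ordering argument in the reply: restart the queue first, then wait for it to drain.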