Re: requeue failure with blk-mq-sched

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/19/2017 03:09 PM, Jens Axboe wrote:
> On 01/19/2017 04:27 AM, Hannes Reinecke wrote:
>> Hi Jens,
>>
>> upon further testing with your blk-mq-sched branch I hit a queue stall
>> during requeing:
>>
>> [  202.340959] sd 3:0:4:1: tag#473 Send: scmd 0xffff880422e7a440
>> [  202.340962] sd 3:0:4:1: tag#473 CDB: Test Unit Ready 00 00 00 00 00 00
>> [  202.341161] sd 3:0:4:1: tag#473 Done: ADD_TO_MLQUEUE Result:
>> hostbyte=DID_OK driverbyte=DRIVER_OK
>> [  202.341164] sd 3:0:4:1: tag#473 CDB: Test Unit Ready 00 00 00 00 00 00
>> [  202.341167] sd 3:0:4:1: tag#473 Sense Key : Unit Attention [current]
>> [  202.341171] sd 3:0:4:1: tag#473 Add. Sense: Power on, reset, or bus
>> device reset occurred
>> [  202.341173] sd 3:0:4:1: tag#473 scsi host busy 1 failed 0
>> [  202.341176] sd 3:0:4:1: tag#473 Inserting command ffff880422e7a440
>> into mlqueue
>>
>> ... and that is the last ever heard of that device.
>> The 'device_busy' count remains at '1' and no further commands will be
>> sent to the device.
> 
> If device_busy is at 1, then it should have a command pending. Where did
> you log this - it would be bandy if you attached whatever debug patch
> you put in, so we can see where the printks are coming from. If we get a
> BUSY with nothing pending, the driver should be ensuring that the queue
> gets run again later through blk_mq_delay_queue(), for instance.
> 
> When the device is stuck, does it restart if you send it IO?
> 
Meanwhile I've found it.

Problem is that scsi_queue_rq() will not stop the queue when hitting a
busy condition before sending commands down to the driver, but still
calls blk_mq_delay_queue():

	switch (ret) {
	case BLK_MQ_RQ_QUEUE_BUSY:
		if (atomic_read(&sdev->device_busy) == 0 &&
		    !scsi_device_blocked(sdev))
			blk_mq_delay_queue(hctx, SCSI_QUEUE_DELAY);
		break;

As the queue isn't stopped blk_mq_delay_queue() won't do anything,
so queue_rq() will never be called.
I've send a patch to linux-scsi.

BTW: Is it a hard requirement that the queue has to be stopped when
returning BLK_MQ_RQ_QUEUE_BUSY?
The comments indicate as such, but none of the drivers do so...
Also, blk_mq_delay_queue() is a bit odd, in that it'll only start
stopped hardware queues. I would at least document that the queue has to
be stopped when calling that.
Better still, can't we have blk_mq_delay_queue start the queues
unconditionally?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@xxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux