On 09/19/2014 08:18 AM, Ming Lei wrote:
> On Fri, Sep 19, 2014 at 9:07 PM, Ming Lei <ming.lei@xxxxxxxxxxxxx> wrote:
>> On Fri, Sep 19, 2014 at 1:03 AM, Jens Axboe <axboe@xxxxxx> wrote:
>>> On 2014-09-18 10:35, Christoph Hellwig wrote:
>>>>
>>>> On Thu, Sep 18, 2014 at 11:59:10PM +0800, Ming Lei wrote:
>>>>>
>>>>> If there are two requests or more timed out, the dispatch queue
>>>>> is put into stopped state and is never recovered, and there
>>>>> is no such problem in non-mq mode.
>>>>>
>>>>> This patch tries to recover the stopped queue when the queue
>>>>> becomes unbusy, then the following retries can move on.
>>>>>
>>>>> Basically this patch maintains same behavior for this situation
>>>>> with non-mq mode.
>>>>
>>>>
>>>> This looks somewhat similar to the issues that Doug reported, and I
>>>> remember when he was last running into boot problems it was timeout
>>>> related, too.
>>>>
>>>> As far as the implementation is concerned I think the correct fix is
>>>> to clear the BLK_MQ_S_STOPPED queue flags in blk_mq_kick_requeue_list.
>>>
>>>
>>> Since that's the kick part of the requeue, auto-starting the queue for that
>>> makes a lot of sense. I say that's the way we go.
>>
>> Yeah, that looks better.
>>
>> But it doesn't work after the simple change, and I need to
>> investigate further.
>
> It is because of the timer miss, now it starts to work.

Excellent. I think most new issues should be fixed in for-linus for
inclusion in this round. It's much bigger than I hoped for this late in
the cycle, but lots of us have run a lot of testing, so that's not a
huge worry.

-- 
Jens Axboe
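
For anyone following along, the idea discussed in this thread (clearing the stopped state when the requeue list is kicked, so a queue stopped by timeouts can make progress again) can be sketched roughly as below. This is a user-space toy model, not kernel code and not the actual patch: struct toy_hw_queue, TOY_S_STOPPED, toy_run_hw_queue(), toy_kick_requeue_list() and toy_demo() are all hypothetical names that only loosely mirror BLK_MQ_S_STOPPED and blk_mq_kick_requeue_list.

```c
#include <assert.h>

/* Hypothetical stand-in for a "stopped" state bit; NOT the kernel definition. */
enum { TOY_S_STOPPED = 1u << 0 };

/* Hypothetical, heavily simplified hardware dispatch queue. */
struct toy_hw_queue {
    unsigned int state;   /* bit flags */
    int pending;          /* requests waiting to be dispatched */
    int dispatched;       /* requests handed to the "driver" so far */
};

/* Dispatch everything pending, unless the queue is stopped. */
void toy_run_hw_queue(struct toy_hw_queue *hq)
{
    if (hq->state & TOY_S_STOPPED)
        return;           /* a stopped queue makes no progress */
    hq->dispatched += hq->pending;
    hq->pending = 0;
}

/*
 * Kick the requeue path: put the requeued requests back on the queue,
 * clear the stopped bit (the "auto-start" suggested in the thread),
 * then run the queue.
 */
void toy_kick_requeue_list(struct toy_hw_queue *hq, int requeued)
{
    hq->pending += requeued;
    hq->state &= ~TOY_S_STOPPED;
    toy_run_hw_queue(hq);
}

/* Hypothetical demo: a stopped queue recovers once the requeue list is kicked. */
int toy_demo(void)
{
    struct toy_hw_queue hq = { .state = TOY_S_STOPPED, .pending = 2, .dispatched = 0 };

    toy_run_hw_queue(&hq);           /* no progress while stopped */
    assert(hq.dispatched == 0);

    toy_kick_requeue_list(&hq, 1);   /* kick clears the flag and runs */
    assert(hq.dispatched == 3 && hq.pending == 0);
    return hq.dispatched;
}
```

The point of the sketch is only that the kick both restarts and runs the queue; a plain run on a stopped queue would leave the retried requests stuck, which matches the hang described in the original report.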