On 01/24/2017 08:54 AM, Hannes Reinecke wrote:
> Hi Jens,
>
> I'm trying to debug a queue stall with your blk-mq-sched branch; with my
> latest mpt3sas patches fio stops basically directly after starting a
> sequential read :-(
>
> I've debugged things and came up with the attached patch; we need to
> restart waiters with blk_mq_tag_idle() after completing a tag.
> We're already calling blk_mq_tag_busy() when fetching a tag, so I think
> calling blk_mq_tag_idle() is required when retiring a tag.

The patch isn't correct, the whole point of the un-idling is that it
ISN'T happening for every request completion. Otherwise you throw away
scalability. So a queue will go into active mode on the first request,
and idle when it's been idle for a bit. The active count is used to
divide up the tags.

So I'm assuming we're missing a queue run somewhere when we fail getting
a driver tag. The latter should only happen if the target has IO in
flight already, and the restart marking should take care of it.
Obviously there's a case where that is not true, since you are seeing
stalls.

> However, even with the attached patch I'm seeing some queue stalls;
> looks like they're related to the 'stonewall' statement in fio.

I think you are heading down the wrong path. Your patch will cause the
symptoms to be a bit different, but you'll still run into cases where we
fail giving out the tag and then stall.

-- 
Jens Axboe
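
For readers following the thread, the point about the active count dividing up the tags can be sketched as a small userspace model. This is not the kernel code: the struct names, the helper names, and the MIN_SHARE floor are all hypothetical and exist only to show why marking a queue idle on every completion would defeat the accounting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical floor so an active queue always gets a few tags. */
#define MIN_SHARE 4u

/* Hypothetical, simplified model of a tag set shared by several queues. */
struct shared_tags {
    unsigned int depth;          /* total driver tags in the set */
    unsigned int active_queues;  /* queues currently marked active */
};

struct queue_state {
    bool active;             /* set on first request, cleared after an idle period */
    unsigned int in_flight;  /* tags this queue currently holds */
};

/* Mark a queue active the first time it asks for a tag; repeat calls are no-ops. */
static void queue_mark_busy(struct shared_tags *tags, struct queue_state *q)
{
    if (!q->active) {
        q->active = true;
        tags->active_queues++;
    }
}

/* Meant to run only after the queue has been idle for a while,
 * not on every request completion. */
static void queue_mark_idle(struct shared_tags *tags, struct queue_state *q)
{
    if (q->active) {
        assert(tags->active_queues > 0);
        q->active = false;
        tags->active_queues--;
    }
}

/* Each active queue gets an equal slice of the tag space, rounded up. */
static unsigned int fair_share(const struct shared_tags *tags)
{
    unsigned int users = tags->active_queues;
    unsigned int share;

    if (users == 0)
        return tags->depth;
    share = (tags->depth + users - 1) / users;
    return share < MIN_SHARE ? MIN_SHARE : share;
}

/* A queue may take another tag only while it is below its share. */
static bool queue_may_get_tag(const struct shared_tags *tags,
                              const struct queue_state *q)
{
    if (!q->active)
        return true;
    return q->in_flight < fair_share(tags);
}
```

In this model, calling queue_mark_idle() on every completion would make active_queues bounce with each request, so the per-queue share would stop reflecting how many queues are really competing for tags.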