On 01/26/2017 04:14 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 14:51 -0700, Jens Axboe wrote:
>> That is exactly what it means, looks like that one path doesn't handle
>> that. You'd have to exhaust the pool with atomic allocs for this to
>> trigger, we don't do that at all in the normal IO path. So good catch,
>> must be the dm part that enables this since it does NOWAIT allocations.
>>
>>
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index 3136696f4991..c27613de80c5 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -134,7 +134,8 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
>>  			rq = __blk_mq_alloc_request(data, op);
>>  		} else {
>>  			rq = __blk_mq_alloc_request(data, op);
>> -			data->hctx->tags->rqs[rq->tag] = rq;
>> +			if (rq)
>> +				data->hctx->tags->rqs[rq->tag] = rq;
>>  		}
>>
>>  		if (rq) {
>
> Hello Jens,
>
> With these two patches applied the scheduling-while-atomic complaint and
> the oops are gone. However, some tasks get stuck. Is the console output
> below enough to figure out what is going on or do you want me to bisect
> this? I don't think that any requests got stuck since no pending requests
> are shown in /sys/block/*/mq/*/{pending,*/rq_list}.

What device is stuck? Is it running with an mq scheduler attached, or
with "none"?

Would also be great to see the output of /sys/block/*/mq/*/tags and
sched_tags so we can see if they have anything pending.

From a quick look at the below, it looks like a request leak. Bisection
would most likely be very helpful.

--
Jens Axboe
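
For readers following the quoted patch: below is a minimal userspace sketch of the pattern it guards against, not the kernel code itself. A non-blocking allocator can return NULL once the pool is exhausted, so the tag-indexed table must only be written when the allocation succeeded. All names here (toy_request, toy_tags, toy_alloc_nowait, toy_get_request, NR_TAGS) are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

#define NR_TAGS 2	/* hypothetical pool size, kept tiny to force exhaustion */

struct toy_request {
	int tag;
	int in_use;
};

struct toy_tags {
	struct toy_request pool[NR_TAGS];	/* backing storage */
	struct toy_request *rqs[NR_TAGS];	/* tag -> request lookup table */
};

/* Non-blocking allocation: returns NULL instead of waiting when no tag is free. */
static struct toy_request *toy_alloc_nowait(struct toy_tags *tags)
{
	for (int i = 0; i < NR_TAGS; i++) {
		if (!tags->pool[i].in_use) {
			tags->pool[i].in_use = 1;
			tags->pool[i].tag = i;
			return &tags->pool[i];
		}
	}
	return NULL;
}

/*
 * Same shape as the fix in the quoted diff: only dereference rq to
 * index the table if the allocation actually returned a request.
 */
static struct toy_request *toy_get_request(struct toy_tags *tags)
{
	struct toy_request *rq = toy_alloc_nowait(tags);

	if (rq)
		tags->rqs[rq->tag] = rq;
	return rq;
}
```

With a zero-initialized struct toy_tags, the first NR_TAGS calls to toy_get_request() return requests and fill in rqs[]. The next call returns NULL and leaves the table untouched. Without the check, that call would dereference a NULL pointer.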