> On 09 Feb 2018, at 20:18, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 2/9/18 12:14 PM, Bart Van Assche wrote:
>> On 02/09/18 10:58, Jens Axboe wrote:
>>> On 2/9/18 11:54 AM, Bart Van Assche wrote:
>>>> Hello Paolo,
>>>>
>>>> If I enable the BFQ scheduler for a dm-mpath device then a kernel oops
>>>> appears (see also below). This happens systematically with Linus' tree from
>>>> this morning (commit 54ce685cae30) merged with Jens' for-linus branch (commit
>>>> a78773906147 ("block, bfq: add requeue-request hook")) and for-next branch
>>>> (commit 88455ad7f928). Is this a known issue?
>>>
>>> Does it happen on Linus' -git as well, or just with my for-linus merged in?
>>> What I'm getting at is whether a78773906147 caused this or not.
>>
>> Hello Jens,
>>
>> Thanks for chiming in. After reverting commit a78773906147, rebuilding the
>> BFQ scheduler, rebooting, and repeating the test, I see the same kernel
>> oops being reported. I think that means this regression is not caused by
>> commit a78773906147. In case it is useful, here is how gdb translates the
>> crash address:
>>
>> $ gdb block/bfq*ko
>> (gdb) list *(bfq_remove_request+0x8d)
>> 0x280d is in bfq_remove_request (block/bfq-iosched.c:1760).
>> 1755		list_del_init(&rq->queuelist);
>> 1756		bfqq->queued[sync]--;
>> 1757		bfqd->queued--;
>> 1758		elv_rb_del(&bfqq->sort_list, rq);
>> 1759
>> 1760		elv_rqhash_del(q, rq);
>> 1761		if (q->last_merge == rq)
>> 1762			q->last_merge = NULL;
>> 1763
>> 1764		if (RB_EMPTY_ROOT(&bfqq->sort_list)) {
>
> Looks very odd. So clearly RQF_HASHED is set, but we're blowing up on
> the hash list pointers. I'll let Paolo take a look at this one. Thanks
> for testing without that commit, I want to push out my pending fixes
> today and this would have thrown a wrench in the works.
>

Also, this smells a little bit like some spurious elevator call.
Unfortunately I have no clue about the cause. To make progress, I need
at least to reproduce it.
In this respect: Bart, could you please tell me how to set up the
offending configuration, and how to trigger the failure? Ideally with
just one, or at most two, PCs; I don't have fancier hardware at the
moment.

Thanks,
Paolo

> --
> Jens Axboe