On 9/8/20 2:46 PM, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@xxxxxx>
>
> Yang Yang reported the following crash caused by requeueing a flush
> request in Kyber:
>
> [ 2.517297] Unable to handle kernel paging request at virtual address ffffffd8071c0b00
> ...
> [ 2.517468] pc : clear_bit+0x18/0x2c
> [ 2.517502] lr : sbitmap_queue_clear+0x40/0x228
> [ 2.517503] sp : ffffff800832bc60 pstate : 00c00145
> ...
> [ 2.517599] Process ksoftirqd/5 (pid: 51, stack limit = 0xffffff8008328000)
> [ 2.517602] Call trace:
> [ 2.517606] clear_bit+0x18/0x2c
> [ 2.517619] kyber_finish_request+0x74/0x80
> [ 2.517627] blk_mq_requeue_request+0x3c/0xc0
> [ 2.517637] __scsi_queue_insert+0x11c/0x148
> [ 2.517640] scsi_softirq_done+0x114/0x130
> [ 2.517643] blk_done_softirq+0x7c/0xb0
> [ 2.517651] __do_softirq+0x208/0x3bc
> [ 2.517657] run_ksoftirqd+0x34/0x60
> [ 2.517663] smpboot_thread_fn+0x1c4/0x2c0
> [ 2.517667] kthread+0x110/0x120
> [ 2.517669] ret_from_fork+0x10/0x18
>
> This happens because Kyber doesn't track flush requests, so
> kyber_finish_request() reads a garbage domain token. Only call the
> scheduler's requeue_request() hook if RQF_ELVPRIV is set (like we do for
> the finish_request() hook in blk_mq_free_request()). Now that we're
> handling it in blk-mq, also remove the check from BFQ.

Thanks, applied.

-- 
Jens Axboe
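
For anyone reading this thread later, the guard described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch, not the applied patch: the struct layout, the RQF_ELVPRIV bit value, the hook_calls counter, and the helper names sched_requeue_hook() and requeue_request() are all assumptions made up for illustration. The idea is that the scheduler hook may only touch per-request scheduler data when the flag says that data was initialized.

```c
#include <assert.h>

/* Userspace model only; none of these are the kernel's definitions. */
#define RQF_ELVPRIV (1u << 0)   /* arbitrary bit value chosen for this sketch */

struct request {
    unsigned int rq_flags;
    int domain_token;           /* only meaningful when RQF_ELVPRIV is set */
};

static int hook_calls;          /* counts how often the scheduler hook ran */

/* Hypothetical stand-in for a scheduler's requeue hook. It assumes the
 * scheduler initialized the request, so it releases the token it finds. */
static void sched_requeue_hook(struct request *rq)
{
    assert(rq->rq_flags & RQF_ELVPRIV);   /* precondition of the hook */
    hook_calls++;
    rq->domain_token = -1;                /* "release" the token */
}

/* Hypothetical stand-in for the core requeue path: the hook is skipped for
 * requests the scheduler never tracked (for example, flush requests). */
static void requeue_request(struct request *rq)
{
    if (rq->rq_flags & RQF_ELVPRIV)
        sched_requeue_hook(rq);
}
```

In this model, calling requeue_request() on a request whose rq_flags is 0 leaves its (uninitialized, from the scheduler's point of view) domain_token untouched, while a request with RQF_ELVPRIV set goes through the hook as before.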