On Jun 23, 2022 / 06:05, Jens Axboe wrote: > On 6/23/22 2:56 AM, Shinichiro Kawasaki wrote: > > On Jun 21, 2022 / 10:07, Jens Axboe wrote: > >> If rq_qos_throttle() ends up blocking, then we will have invalidated and > >> flushed our current plug. Since blk_mq_get_cached_request() hasn't > >> popped the cached request off the plug list just yet, we end holding a > >> pointer to a request that is no longer valid. This insta-crashes with > >> rq->mq_hctx being NULL in the validity checks just after. > >> > >> Pop the request off the cached list before doing rq_qos_throttle() to > >> avoid using a potentially stale request. > >> > >> Fixes: 0a5aa8d161d1 ("block: fix blk_mq_attempt_bio_merge and rq_qos_throttle protection") > >> Reported-by: Dylan Yudaken <dylany@xxxxxx> > >> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> > > > > Thank you for the fix and sorry for the trouble. The patch passed my test set: > > > > Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> > > > > > > BTW, may I know the workload or script which triggers the failure? I'm > > interested in if we can add a blktests test case to exercises qos throttle > > and prevent similar bugs in the future. > > Not sure if there are others, but specifically blk-iocost will do an > explicit schedule() for some conditions. See the bottom of > ioc_rqos_throttle(). Thanks. Will try to create a script which triggers the schedule() in ioc_rqos_throttle(). -- Shin'ichiro Kawasaki