On Tue, 2018-12-04 at 22:17 +0530, Kashyap Desai wrote:
> + Linux-scsi
>
> > > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > > index 9497b47..57432be 100644
> > > --- a/block/blk-mq.h
> > > +++ b/block/blk-mq.h
> > > @@ -175,6 +175,7 @@ static inline bool
> > > blk_mq_get_dispatch_budget(struct blk_mq_hw_ctx *hctx)
> > >  static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
> > >                                             struct request *rq)
> > >  {
> > > +       hctx->tags->rqs[rq->tag] = NULL;
> > >         blk_mq_put_tag(hctx, hctx->tags, rq->mq_ctx, rq->tag);
> > >         rq->tag = -1;
> >
> > No SCSI driver should call scsi_host_find_tag() after a request has
> > finished. The above patch introduces yet another race and hence can't be
> > a proper fix.
>
> Bart, many scsi drivers use scsi_host_find_tag() to traverse max tag_id to
> find out pending IO in firmware.
> One of the use case is - HBA firmware recovery. In case of firmware
> recovery, driver may require to traverse the list and return back pending
> scsi command to SML for retry.
> I quickly grep the scsi code and found that snic_scsi, qla4xxx, fnic,
> mpt3sas are using API scsi_host_find_tag for the same purpose.
>
> Without this patch, we hit very basic kernel panic due to page fault. This
> is not an issue in non-mq code path. Non-mq path use
> blk_map_queue_find_tag() and that particular API does not provide stale
> requests.

As I wrote before, your patch doesn't fix the race you described but only
makes the race window smaller.
If you want an example of how to use scsi_host_find_tag() properly, have a
look at the SRP initiator driver (drivers/infiniband/ulp/srp). That driver
uses scsi_host_find_tag() without triggering any NULL pointer dereferences.
The approach used in that driver also works when having to support HBA
firmware recovery.

Bart.
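
For readers following the thread, here is a small userspace model of one way
a driver can tolerate stale tag lookups: it keeps its own per-tag ownership
record and atomically "claims" a command before touching it. This is only a
sketch under stated assumptions. It is not kernel code and is not copied from
the SRP driver; find_tag() merely stands in for a lookup such as
scsi_host_find_tag(), and every name in it (struct cmd, tag_table, req_ring,
drv_start, drv_claim, drv_complete, drv_recover) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_TAGS 8

struct cmd { int id; };                 /* hypothetical stand-in for a SCSI command */

/* Models a tag -> request table that may still hold a stale pointer
 * after the request has completed. */
static struct cmd *tag_table[MAX_TAGS];

/* Driver-private record of which command the driver still owns per tag. */
struct drv_req { _Atomic(struct cmd *) owner; };
static struct drv_req req_ring[MAX_TAGS];

/* Called when the driver hands a command to the (imaginary) firmware. */
static void drv_start(unsigned tag, struct cmd *c)
{
    assert(tag < MAX_TAGS);
    tag_table[tag] = c;
    atomic_store(&req_ring[tag].owner, c);
}

/* Stand-in for a tag lookup: returns whatever the table holds, stale or not. */
static struct cmd *find_tag(unsigned tag)
{
    return tag < MAX_TAGS ? tag_table[tag] : NULL;
}

/* Claim the command for this tag. With expected != NULL the claim succeeds
 * only if the driver still owns exactly that command; with expected == NULL
 * it takes whatever is owned. Ownership is cleared atomically, so at most
 * one caller (completion or recovery) ever gets the command. The pointer is
 * only compared, never dereferenced, until the claim has succeeded. */
static struct cmd *drv_claim(unsigned tag, struct cmd *expected)
{
    if (tag >= MAX_TAGS)
        return NULL;
    if (!expected)
        return atomic_exchange(&req_ring[tag].owner, (struct cmd *)NULL);
    if (atomic_compare_exchange_strong(&req_ring[tag].owner, &expected,
                                       (struct cmd *)NULL))
        return expected;
    return NULL;
}

/* Completion path: look up by tag, then claim before using the result. */
static struct cmd *drv_complete(unsigned tag)
{
    struct cmd *c = find_tag(tag);
    return c ? drv_claim(tag, c) : NULL;
}

/* Firmware recovery: walk the driver's own ring instead of the tag table
 * and collect every command still owned, e.g. to hand back for retry.
 * Returns the number of commands stored in out[]. */
static int drv_recover(struct cmd **out, int max)
{
    int n = 0;
    for (unsigned tag = 0; tag < MAX_TAGS && n < max; tag++) {
        struct cmd *c = drv_claim(tag, NULL);
        if (c)
            out[n++] = c;
    }
    return n;
}
```

In this model a second completion for the same tag, or a recovery pass that
races with a completion, gets NULL back from drv_claim() even though
find_tag() still returns the old pointer. The model ignores address reuse
(a new command allocated at the same address as a finished one), which a
real driver would have to consider separately.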