Re: REQ_HIPRI and SCSI mid-level

Ming Lei <ming.lei@xxxxxxxxxx> · Wed, 26 May 2021 08:34:12 +0800

On Fri, May 21, 2021 at 05:56:19PM -0400, Douglas Gilbert wrote:
> The REQ_HIPRI flag on requests is associated with blk_poll() (aka iopoll)
> and assumes the user space (or some higher level) will be calling
> blk_poll() on requests marked with REQ_HIPRI and that will lead to their
> completion.
> 
> In lk 5.13-rc1 the megaraid and scsi_debug LLDs support blk_poll() [seen
> by searching for 'mq_poll'] with more to follow, I assume. I have tested
> blk_poll() on the scsi_debug driver using both fio and the new sg driver.
> It works well with one caveat: as long as there isn't an error.
> After fighting with that error processing from the ULD side (i.e. the
> new sg driver) and the LLD side I am concluding that the glue that
> holds them together, that is, the mid-level is not as REQ_HIPRI aware
> as it should be.
> 
> Yes REQ_HIPRI is there in scsi_lib.c but it is missing from scsi_error.c
> How can scsi_error.c re-issue requests _without_ taking into account
> that the original was issued with REQ_HIPRI ? Well I don't know but I'm
> pretty sure that is close to the area that I see causing problems
> (mainly lockups).
> 
> As an example the scsi_debug driver has an in-use bitmap that when a new
> request arrives the code looks for an empty slot. Due to (incorrect)
> parameter setup that may fail. If the driver returns:
>     device_qfull_result = (DID_OK << 16) | SAM_STAT_TASK_SET_FULL;
> then I see lock-ups if the request in question has REQ_HIPRI set.
> 
> If that is changed to:
>     device_qfull_result = (DID_ABORT << 16) | SAM_STAT_TASK_SET_FULL;
> then my user space test program sees that error and aborts showing the
> TASK SET FULL SCSI status. That is much better than a lockup ...
> 
> Having played around with variants of the above for a few weeks, I'd
> like to throw this problem into the open :-)
> 
> 
> Suggestion: perhaps the eh could give up immediately on any request
> with REQ_HIPRI set (i.e. make it a higher level layer's problem).

One invariant is that the polling will be kept as running until the
associated iocb/bio is completed. So I understand it should be fine
for timeout handler /EH to ignore REQ_HIPRI. That said the associated
iocb/bio will be reaped by upper layer if EH/timeout handler makes
progress. Or can you explain the scsi-debug REQ_HIPRI lockup in a bit
detail?

Thanks,
Ming