On 11/7/22 19:12, Hannes Reinecke wrote: > On 11/2/22 12:25, Damien Le Moal wrote: >> On 11/2/22 20:12, Hannes Reinecke wrote: >>> On 11/2/22 11:07, Damien Le Moal wrote: >>>> On 11/2/22 18:52, John Garry wrote: >>>>> Hi Damien, >>>>> >>> [ .. ] >> So we only need to find a way of 're-using' that tag, then we won't have >>> to set aside a reserved tag and everything would be dandy... >> >> I tried that. It is very ugly... Problem is that integration with EH in >> case a real NCQ error happens when all that read-log-complete dance is >> happening is hard. And don't get me started with the need to save/restore >> the scsi command context of the command we are reusing the tag from. >> >> And given that the code is changing to use regular submission path for >> internal commands, right now, we need a reserved tag. Or a way to "borrow" >> the tag from a request that we need to check. Which means we need some >> additional api to not always try to allocate a tag. >> >>> >>> Maybe we can stop processing when we receive an error (should be doing >>> that anyway as otherwise the log might be overwritten), then we should >>> be having a pretty good chance of getting that tag. >> >> Hmmm.... that would be no better than using EH which does stop processing >> until the internal house keeping is done. >> >>> Or, precisely, getting _any_ tag as at least one tag is free at that point. >>> Hmm? >> >> See above. Not free, but usable as far as the device is concerned since we >> have at least on command we need to check completed at the device level >> (but not yet completed from scsi/block layer point of view). >> > So, having had an entire weekend pondering this issue why don't we > allocate an _additional_ set of requests? > After all, we had been very generous with allocating queues and requests > (what with us doing a full provisioning of the requests for all queues > already for the non-shared tag case). > > Idea would be to keep the single tag bitmap, but add eg a new rq state > MQ_RQ_ERROR. Once that flag is set we'll fetch the error request instead > of the normal one: > > @@ -761,6 +763,8 @@ static inline struct request > *blk_mq_tag_to_rq(struct blk_mq_tags *tags, > { > if (tag < tags->nr_tags) { > prefetch(tags->rqs[tag]); > + if (unlikely(blk_mq_request_error(tags->rqs[tag]))) > + return tags->error_rqs[tag]; > return tags->rqs[tag]; > } > > and, of course, we would need to provision the error request first. > > Rationale here is that this will be primarily for devices with a low > number of tags, so doubling the number of request isn't much of an > overhead (as we'll be doing it essentially anyway in the error case as > we'll have to save the original request _somewhere_), and that it would > remove quite some cruft from the subsystem; look at SCSI EH trying to > store the original request contents and then after EH restoring them again. Interesting idea. I like it. It is essentially a set of reserved requests without reserved tags: the tag to use for these requests would be provided "manually" by the user. Right ? That should allow simplifying any processing that needs to reuse a tag, and currently its request. That is, CDL, but also usb-scsi, scsi EH and the few scsi LLDs using scsi_eh_prep_cmnd()+scsi_eh_restore_cmnd(). Ideally, these 2 functions could go away too. > > Hmm? > > Cheers, > > Hannes -- Damien Le Moal Western Digital Research