Hi Ming,

> On Wed, Oct 27, 2021 at 09:16:32AM -0700, Keith Busch wrote:
> > On Wed, Oct 27, 2021 at 11:58:23PM +0800, Ming Lei wrote:
> > > On Wed, Oct 27, 2021 at 11:44:04AM -0400, Martin K. Petersen wrote:
> > > >
> > > > Ming,
> > > >
> > > > > request with scsi_cmnd may be allocated by the ufshpb driver, even it
> > > > > should be fine to call ufshcd_queuecommand() directly for this driver
> > > > > private IO, if the tag can be reused. One example is scsi_ioctl_reset().
> > > >
> > > > scsi_ioctl_reset() allocates a new request, though, so that doesn't
> > > > solve the forward progress guarantee. Whereas eh puts the saved request
> > > > on the stack.
> > >
> > > What I meant is to use one totally ufshpb private command allocated from
> > > private slab to replace the spawned request, which is sent to ufshcd_queuecommand()
> > > directly, so forward progress is guaranteed if the blk-mq request's tag can be
> > > reused for issuing this private command. This approach takes a bit effort,
> > > but avoids tags reservation.
> > >
> > > Yeah, it is cleaner to use reserved tag for the spawned request, but we
> > > need to know:
> > >
> > > 1) how many queue depth for the hba? If it is small, even 1 reservation
> > > can affect performance.
> > >
> > > 2) how many inflight write buffer commands are to be supported? Or how many
> > > is enough for obtaining expected performance? If the number is big, reserved
> > > tags can't work.
> >
> > The original and clone are not dispatched to hardware concurrently, so I
> > don't think the reserved_tags need to subtract from the generic ones.
> > The original request already accounts for the hardware resource, so the
> > clone doesn't need to consume another one.
>
> Yeah, that is why I thought the tag could be reused for the spawned(cloned)
> request, but it needs ufshpb developer to confirm, or at least
> ufshcd_queuecommand() can handle this situation.
> If that is true, it isn't
> necessary to use reserve tags, since the current blk-mq implementation
> requires to reserve real hardware tags space, which has to take normal
> tags.

It is true that the pre-request can reuse the tag of the READ request,
but then the READ request has to wait for the pre-request command to
complete. If the pre-request and the READ request are dispatched
concurrently instead, the time spent waiting for the pre-request to
complete is saved. So I implemented it by allocating a new request, with
a time limit on getting the pre-request, so it does not cause a deadlock.

Thanks,
Daejun