Re: [PATCH 2/2] ufs: don't use the fair tag sharings

Bart Van Assche <bvanassche@xxxxxxx> · Fri, 12 May 2023 11:12:40 -0700

On 5/12/23 07:02, Christoph Hellwig wrote:
> On Thu, May 11, 2023 at 08:38:04AM -0700, Bart Van Assche wrote:
>> For which devices is the fair sharing algorithm useful? As far as I know the
>> legacy block layer did not have an equivalent of the fair sharing algorithm
>> and I'm not aware of any complaints about the legacy block layer regarding
>> to fairness. This is why I proposed in January to remove the fair sharing
>> code entirely. See also https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@xxxxxxx/.
>
> Because the old code did not do tag allocation itself?  Either way I
> don't think a "I'll opt out for a random driver" is the proper approach
> when you think it's not needed.  Especially not without any data
> explaining why just that driver is a special snowflake.

Hi Christoph,

I'm still wondering whether there are any drivers that benefit from the 
fair tag sharing algorithm. If the number of tags is large enough 
(NVMe), the number of tags exceeds the number of requests in flight and 
hence the fair tag sharing algorithm is not necessary.

The fair tag sharing algorithm has a negative impact on all SCSI devices 
with multiple logical units. This is because logical units are 
considered active until (request timeout) seconds have elapsed after the 
logical unit stopped being used (see also the blk_mq_tag_idle() call in 
blk_mq_timeout_work()). UFS users are hit by this because UFS 3.0 
devices have a limited queue depth (32) and because power management 
commands are submitted to a logical unit (WLUN). Hence, it happens often 
that the block layer "active queue" counter is equal to 2 while only one 
logical unit is being used actively (a logical unit backed by NAND 
flash). The performance difference between queue depths 16 and 32 for 
UFS devices is significant.
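
To make the arithmetic above concrete, here is a minimal sketch of how a fair-share limit halves the usable queue depth once a second logical unit counts as active. It is loosely modeled on the rounding done in blk-mq's hctx_may_queue() but is not the kernel code: the names fair_share_depth() and may_queue() are hypothetical, and the minimum share of 4 tags is an assumption of this sketch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: number of tags one queue may hold when
 * total_tags are shared fairly among active_queues queues.
 * Rounds up, and never returns less than 4 (assumed floor).
 */
static unsigned int fair_share_depth(unsigned int total_tags,
                                     unsigned int active_queues)
{
    unsigned int depth;

    /* No active queues recorded: nothing to share with. */
    if (active_queues == 0)
        return total_tags;

    depth = (total_tags + active_queues - 1) / active_queues;
    return depth < 4 ? 4 : depth;
}

/*
 * Hypothetical helper: may a queue that already has in_flight
 * requests allocate one more tag?
 */
static bool may_queue(unsigned int total_tags,
                      unsigned int active_queues,
                      unsigned int in_flight)
{
    return in_flight < fair_share_depth(total_tags, active_queues);
}
```

With 32 tags and the "active queue" counter at 2, fair_share_depth(32, 2) yields 16, so the NAND-backed logical unit is refused a 17th tag even though the WLUN is idle. With the counter at 1, the same logical unit may use all 32 tags.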

Is my understanding correct that in the legacy block layer 
implementation blk_queue_start_tag() had to be called to assign a tag to 
a request? I haven't found any code in the Linux kernel v4.20 
implementation of blk_queue_start_tag() that implements fairness in case 
a request tag map (struct blk_queue_tag) is shared across request queues 
(one request queue per logical unit in case of SCSI). Do you agree with 
my conclusion that from the point of view of the SCSI core in general 
and the UFS driver in particular the fair tag sharing algorithm in the 
blk-mq code introduced a performance regression?

Thanks,

Bart.