On Wed, Oct 25, 2023 at 12:01:33PM -0700, Bart Van Assche wrote:
> On 10/24/23 18:33, Ming Lei wrote:
> > Yeah, performance does drop when queue depth is cut to half if queue
> > depth is low enough.
> >
> > However, it isn't enough to just test perf over one LUN, what is the
> > perf effect when running IOs over the 2 or 5 data LUNs
> > concurrently?
>
> I think that the results I shared are sufficient because these show the
> worst possible performance impact of fair tag sharing (two active
> logical units and much more activity on one logical unit than on the
> other).

You are talking about the multi-lun case, and your change does affect the
multi-lun code path, but your test results don't cover multi-lun
workloads. Is that sufficient? At the very least, your patch shouldn't
cause a performance regression on multi-lun IO workloads, right?

> > SATA should have similar issue too, and I think the improvement may be
> > more generic to bypass fair tag sharing in case of low queue depth
> > (such as < 32) if turns out the fair tag sharing doesn't work well in
> > case low queue depth.
> >
> > Also the 'fairness' could be enhanced dynamically by scsi LUN's
> > queue depth, which can be adjusted dynamically.
>
> Most SATA devices are hard disks. Hard disk IOPS are constrained by the
> speed with which the head of a hard disk can move. That speed hasn't
> changed much during the past 40 years. I'm not sure that hard disks are
> impacted as much as SSD devices by fair tag sharing.

What I meant is that SATA's queue depth is often 32 or 31, and SATA also
has multi-lun cases. From what you shared, fair tag sharing performs
poorly simply because the queue depth is low; nothing about the problem
is actually specific to UFS. That is why I am wondering why we don't
disable fair tag sharing whenever the queue depth is low.

> Any algorithm that is more complicated than what I posted probably would
> have a negative performance impact on storage devices that use NAND
> technology, e.g. UFS devices.
> So I prefer to proceed with this patch
> series and solve any issues with ATA devices separately. Once this patch
> series has been merged, it could be used as a basis for a solution for
> ATA devices. A solution for ATA devices does not have to be implemented
> in the block layer core - it could e.g. be implemented in the ATA
> subsystem.

I don't object to taking the disabling of fair sharing first; I meant
that the fairness could be brought back later by adjusting the
scsi_device's queue depth, which can be changed dynamically.

Thanks,
Ming