On Fri, Apr 05, 2024 at 08:21:14AM +0200, Hannes Reinecke wrote:
> On 4/4/24 23:14, Keith Busch wrote:
> > On Wed, Apr 03, 2024 at 04:17:54PM +0200, Hannes Reinecke wrote:
> > > Hi all,
> > >
> > > there have been several attempts to implement a latency-based I/O
> > > scheduler for native nvme multipath, all of which had their issues.
> > >
> > > So time to start afresh, this time using the QoS framework
> > > already present in the block layer.
> > > It consists of two parts:
> > > - a new 'blk-nlatency' QoS module, which is just a simple per-node
> > >   latency tracker
> > > - a 'latency' nvme I/O policy
> >
> > Whatever happened with the io-depth based path selector? That should
> > naturally align with the lower latency path, and that metric is cheaper
> > to track.
>
> Turns out that tracking queue depth (on the NVMe level) always requires
> an atomic, and with that comes a performance impact.
> The qos/blk-stat framework is already present, and as the numbers show
> it actually leads to a performance improvement.
>
> So I'm not quite sure what the argument 'cheaper to track' buys us here...

I was considering the blk_stat framework compared to those atomic
operations. I usually don't enable stats because all the extra
ktime_get_ns() and indirect calls are relatively costly. If you're
enabling stats anyway though, then yeah, I guess I don't really have a
point and your idea here seems pretty reasonable.
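
For anyone following along, here is a rough sketch (not taken from any
posted series; all struct and function names are made up) of what a
queue-depth based path selector would have to do, which is the per-I/O
atomic cost Hannes is referring to:

/*
 * Illustrative sketch only. A depth-based selector needs an atomic
 * inc/dec on every submission and completion, plus an atomic_read per
 * candidate path at selection time. The blk-stat based approach instead
 * samples completion latency (ktime_get_ns() at issue/complete) through
 * the already-existing stats callbacks.
 */
#include <linux/atomic.h>

struct demo_path_stats {
	atomic_t	qdepth;		/* outstanding requests on this path */
};

struct demo_path {
	struct demo_path_stats	stats;
};

/* called on every request submitted to the chosen path */
static inline void demo_path_start_io(struct demo_path *p)
{
	atomic_inc(&p->stats.qdepth);	/* first atomic RMW per I/O */
}

/* called on every request completion */
static inline void demo_path_end_io(struct demo_path *p)
{
	atomic_dec(&p->stats.qdepth);	/* second atomic RMW per I/O */
}

/* pick the path with the fewest outstanding requests */
static struct demo_path *demo_select_path(struct demo_path *paths, int nr)
{
	struct demo_path *best = &paths[0];
	int i;

	for (i = 1; i < nr; i++)
		if (atomic_read(&paths[i].stats.qdepth) <
		    atomic_read(&best->stats.qdepth))
			best = &paths[i];
	return best;
}

The trade-off in the thread, then, is two atomic RMWs per I/O for depth
tracking versus the ktime_get_ns() samples and indirect calls that come
with enabling blk-stat, which is the overhead I was worried about above.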