3. Do you have some performance numbers (we're touching the fast path
here)?
This is pretty lightweight: accounting is per-cpu and only wrapped by
preemption disable. This is a very small price to pay for what we gain.
It does add up, though, and some environments disable stats to skip the
overhead. At a minimum, you need to add a check for blk_queue_io_stat() before
assuming you need to account for stats.
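
To illustrate the check being asked for, here is a minimal userspace sketch,
not the patch itself. The model_* names are hypothetical, and the io_stat
field only stands in for what blk_queue_io_stat() would report for the queue;
the real per-cpu machinery and preemption handling are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU counters; the kernel keeps the real ones per-cpu. */
struct model_cpu_stats {
	uint64_t ios;
	uint64_t sectors;
};

/* Hypothetical queue model: io_stat stands in for blk_queue_io_stat(q). */
struct model_queue {
	bool io_stat;
	struct model_cpu_stats stats[MODEL_NR_CPUS];
};

/* Account one I/O on the given CPU, but only if stats are enabled. */
static void model_account_io(struct model_queue *q, unsigned int cpu,
			     uint64_t sectors)
{
	assert(cpu < MODEL_NR_CPUS);
	if (!q->io_stat)
		return;	/* stats disabled: skip the accounting cost entirely */
	q->stats[cpu].ios++;
	q->stats[cpu].sectors += sectors;
}

/* Sum the per-CPU counters when a reader asks for the totals. */
static uint64_t model_total_ios(const struct model_queue *q)
{
	uint64_t total = 0;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += q->stats[cpu].ios;
	return total;
}
```

With io_stat cleared, model_account_io() returns before touching any counter,
which is the behavior environments that disable stats are relying on.
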
Instead of duplicating the accounting, could you just have the stats file report
the sum of its hidden devices?
Interesting...
How do you suggest we do that? .collect_stats() callout in fops?
Maybe, yeah. I think we'd need something to enumerate the HIDDEN disks that
make up the multipath device. Only the low-level driver can do that right now,
so perhaps either call into the driver to get all the block_device parts, or
the gendisk needs to maintain a list of those parts itself.
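
A rough userspace sketch of the second option, where the disk itself keeps a
list of the hidden path devices and the stats file reports their sum. All
model_* names and the fixed-size path array are assumptions made for
illustration; nothing here is an existing block-layer interface.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PATHS 8

/* Hypothetical cumulative counters kept by each hidden path device. */
struct model_stats {
	uint64_t ios;
	uint64_t sectors;
	uint64_t ticks_ns;
};

struct model_path {
	struct model_stats stats;
};

/* Hypothetical multipath disk that tracks its own bottom devices. */
struct model_mpath_disk {
	const struct model_path *paths[MODEL_MAX_PATHS];
	size_t nr_paths;
};

/* Report the multipath stats as the sum over all hidden paths. */
static struct model_stats
model_collect_stats(const struct model_mpath_disk *disk)
{
	struct model_stats sum = { 0 };

	for (size_t i = 0; i < disk->nr_paths; i++) {
		const struct model_stats *s = &disk->paths[i]->stats;

		sum.ios += s->ios;
		sum.sectors += s->sectors;
		sum.ticks_ns += s->ticks_ns;
	}
	return sum;
}
```

In this model the top-level device does no accounting of its own in the I/O
path; the cost moves to the reader of the stats file.
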
I definitely don't think we want to propagate the device relationship to
blk-mq. But a callback to the driver also seems very niche to nvme
multipath, and it is kinda messy to combine calculations like
iops/bw/latency accurately, since that depends on the submission
distribution to the bottom devices, which we would now need to track.
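
One way to read the latency concern, as a hedged userspace sketch with
hypothetical model_* names: averaging the per-path average latencies gives
the wrong answer unless each path is weighted by the number of I/Os it
actually received.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path totals: completed I/Os and summed latency. */
struct model_path_lat {
	uint64_t ios;
	uint64_t total_ns;
};

/* Unweighted mean of the per-path averages; ignores the distribution. */
static double model_naive_avg_ns(const struct model_path_lat *p, size_t n)
{
	double sum = 0.0;
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		if (p[i].ios == 0)
			continue;	/* idle path: no average to include */
		sum += (double)p[i].total_ns / (double)p[i].ios;
		used++;
	}
	return used ? sum / (double)used : 0.0;
}

/* Total latency over total I/Os; weights each path by its share. */
static double model_weighted_avg_ns(const struct model_path_lat *p, size_t n)
{
	uint64_t ios = 0, total_ns = 0;

	for (size_t i = 0; i < n; i++) {
		ios += p[i].ios;
		total_ns += p[i].total_ns;
	}
	return ios ? (double)total_ns / (double)ios : 0.0;
}
```

For example, with 900 I/Os averaging 100 ns on one path and 100 I/Os
averaging 1000 ns on another, the unweighted mean is 550 ns while the
weighted result is 190 ns.
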
I'm leaning towards just moving forward with this, taking the relatively
small hit, and if people absolutely care about the extra latency, then
they can disable it altogether (upper and/or bottom devices).