On Tue, Oct 25, 2022 at 06:58:19PM +0300, Sagi Grimberg wrote:
>> and even more so the special start/end calls in all
>> the transport drivers.
>
> The end is centralized and the start part is not sprinkled into
> the drivers. I don't think it's bad.

Well.  We need a new magic helper instead of blk_mq_start_request, and
a new call to nvme_mpath_end_request in the lower driver, to support
functionality in the multipath driver that sits above them.  This is
because of the hack of storing the start_time in the nvme_request,
which is really owned by the lower driver, and that is quite a
layering violation.

If the multipath driver simply did the start and end itself, things
would be a lot better.  The upside would be that it also accounts for
the (tiny) overhead of the mpath driver.  The big downside would be
that we'd have to allocate memory just for the start_time, as
nvme-multipath has no per-I/O data structure of its own.

In a way it would be nice to just have a start_time in the bio, which
would clean up the interface a lot and also massively simplify the
I/O accounting in md.  But Jens might not be willing to grow the bio
for this special case, even if some things in the bio seem even more
obscure.

>> the stats sysfs attributes already have entirely separate
>> blk-mq vs bio-based code paths.  So I think having a block_device
>> operation that replaces part_stat_read_all, which allows nvme to
>> iterate over all paths and collect the numbers, would seem
>> a lot nicer.  There might be some caveats like having to stash
>> away the numbers for disappearing paths, though.
>
> You think this is better? Really? I don't agree with you; I think
> it's better to pay a small cost than to do this very specialized
> thing that will only ever be used for nvme-mpath.

Yes, I think a callout at least conceptually would be much better.
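Something like this completely untested sketch is what I have in mind
for the callout; the ->get_io_stat() method name and signature are
made up here, and part_stat_read_all (currently static in
block/genhd.c) would need to be exported for nvme to build on it:

/*
 * Hypothetical ->get_io_stat() method in block_device_operations:
 *
 *	void (*get_io_stat)(struct gendisk *disk,
 *			    struct disk_stats *stat);
 *
 * The diskstats/sysfs code would call it instead of reading the
 * counters directly when the driver provides it, and nvme-multipath
 * would implement it by summing up the paths:
 */
static void nvme_ns_head_get_io_stat(struct gendisk *disk,
		struct disk_stats *stat)
{
	struct nvme_ns_head *head = disk->private_data;
	struct nvme_ns *ns;
	int srcu_idx, i;

	memset(stat, 0, sizeof(*stat));
	srcu_idx = srcu_read_lock(&head->srcu);
	list_for_each_entry_rcu(ns, &head->list, siblings) {
		struct disk_stats path;

		part_stat_read_all(ns->disk->part0, &path);
		for (i = 0; i < NR_STAT_GROUPS; i++) {
			stat->nsecs[i] += path.nsecs[i];
			stat->sectors[i] += path.sectors[i];
			stat->ios[i] += path.ios[i];
			stat->merges[i] += path.merges[i];
		}
		stat->io_ticks += path.io_ticks;
	}
	srcu_read_unlock(&head->srcu, srcu_idx);
}

The disappearing-paths caveat means the head would have to accumulate
a namespace's counters when it goes away, but that is cold-path
bookkeeping, not a per-I/O cost.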
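For reference, the start/end pattern I'm objecting to further up
looks roughly like this (a sketch, not the actual series:
nvme_mpath_start_request() is a made-up name for the "magic helper"
that replaces blk_mq_start_request, and the start_time field plus the
NVME_MPATH_IO_STATS flag in nvme_request are the additions it
assumes):

void nvme_mpath_start_request(struct request *rq)
{
	struct nvme_ns *ns = rq->q->queuedata;
	struct gendisk *disk = ns->head->disk;

	if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq))
		return;

	/* stash the start time in the lower driver's nvme_request */
	nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;
	nvme_req(rq)->start_time = bdev_start_io_acct(disk->part0,
			blk_rq_bytes(rq) >> SECTOR_SHIFT,
			req_op(rq), jiffies);
}

void nvme_mpath_end_request(struct request *rq)
{
	struct nvme_ns *ns = rq->q->queuedata;

	if (!(nvme_req(rq)->flags & NVME_MPATH_IO_STATS))
		return;
	bdev_end_io_acct(ns->head->disk->part0, req_op(rq),
			nvme_req(rq)->start_time);
}

Every transport driver then calls the wrapper instead of (or next to)
blk_mq_start_request and adds the nvme_mpath_end_request call on
completion, which is exactly the sprinkling complained about.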
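And to spell out the bio idea: with a (purely hypothetical)
bi_start_time field in struct bio, the existing bio_start_io_acct()
and bio_end_io_acct() helpers, which currently make every stacking
driver carry the start time around, could collapse into something
like:

/* hypothetical, bi_start_time does not exist in struct bio today */
static inline void bio_start_io_acct(struct bio *bio)
{
	bio->bi_start_time = bdev_start_io_acct(bio->bi_bdev,
			bio_sectors(bio), bio_op(bio), jiffies);
}

static inline void bio_end_io_acct(struct bio *bio)
{
	bdev_end_io_acct(bio->bi_bdev, bio_op(bio), bio->bi_start_time);
}

As far as I can tell the per-I/O state md allocates for its
accounting exists largely to carry that value (and the original bio)
around, which is why this would simplify it so much.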