On Tue, Nov 06 2018 at 4:34pm -0500,
Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:

> Hi
>
> These are the device mapper percpu patches.
>
> Note that I didn't test request-based device mapper because I don't have
> hardware for it (the patches don't convert request-base targets to percpu
> values, but there are a few inevitable changes in dm-rq.c).

Patches 1 - 3 make sense.  But the use of percpu inflight counters isn't
something I can get upstream.  Any more scalable counter still needs to
be wired up to the block stats interfaces (the one you did in patch 5 is
only for the "inflight" sysfs file; there is also the generic diskstats
callout to part_in_flight(), etc).

Wiring up both part_in_flight() and part_in_flight_rw() to optionally
call out to a new callback isn't going to fly, especially if that
callback is looping over percpu counters to sum them.

I checked with Jens and now that in 4.21 all of the old request-based IO
path is gone (and given that blk-mq bypasses use of ->in_flight[]), the
only consumer of the existing ->in_flight[] is the bio-based IO path.

Given that now only bio-based is consuming it, and your work was focused
on making bio-based DM's "pending" IO accounting more scalable, it is
best to just change block core's ->in_flight[] directly.

But Jens is against switching to percpu counters because they are
really slow when summing the counts, and diskstats does that frequently.
Jens said at least 2 other attempts to switch over to percpu counters
were made and rejected.

Jens' suggestion is to implement a new generic rolling per-node counter.

Would you be open to trying that?

Mike

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
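
The trade-off discussed in the message above (cheap updates but expensive
sums for percpu counters, versus fewer shards for a per-node counter) can
be illustrated with a rough userspace sketch.  This is not code from the
patches or from block core: every name below (NR_CPUS, NR_NODES, the
*_sketch types and helpers, the CPU-to-node mapping) is hypothetical, and
it makes no attempt to model the "rolling" part of the suggestion, which
the message does not spell out.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS  64   /* hypothetical machine size */
#define NR_NODES 4    /* hypothetical NUMA node count */

/* Per-CPU style: one plain slot per CPU, written only by its owner. */
struct percpu_counter_sketch {
    long slot[NR_CPUS];
};

/* Per-node style: one atomic slot per node, shared by that node's CPUs. */
struct pernode_counter_sketch {
    atomic_long slot[NR_NODES];
};

/* Assumed mapping: CPUs are split evenly and contiguously across nodes. */
static inline int cpu_to_node_sketch(int cpu)
{
    return cpu / (NR_CPUS / NR_NODES);
}

/* Update touches only the local CPU's slot: no sharing, no atomics. */
static inline void percpu_add(struct percpu_counter_sketch *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    c->slot[cpu] += delta;
}

/* Reading the total walks every CPU's slot: O(NR_CPUS) per read. */
static inline long percpu_sum(const struct percpu_counter_sketch *c)
{
    long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += c->slot[cpu];
    return sum;
}

/* Update is an atomic add on the slot of the CPU's node. */
static inline void pernode_add(struct pernode_counter_sketch *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    atomic_fetch_add_explicit(&c->slot[cpu_to_node_sketch(cpu)], delta,
                              memory_order_relaxed);
}

/* Reading the total walks only the node slots: O(NR_NODES) per read. */
static inline long pernode_sum(struct pernode_counter_sketch *c)
{
    long sum = 0;

    for (int node = 0; node < NR_NODES; node++)
        sum += atomic_load_explicit(&c->slot[node], memory_order_relaxed);
    return sum;
}
```

In both variants an individual slot can go negative when an IO is started
on one CPU and completed on another; only the sum is meaningful.  The
difference is how many slots a reader such as diskstats has to visit for
each sum, at the price of some cache-line sharing within a node on the
update side.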