On 09/08/2017 01:58 PM, Mike Snitzer wrote:
> On Fri, Sep 08 2017 at 1:07pm -0400,
> Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
>
>> On Fri, Sep 08 2017 at 12:48pm -0400,
>> Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>>>> Please see the following untested patch. All
>>>> testing/review/comments/acks appreciated.
>>>>
>>>> I elected to use elevator_change() rather than fiddle with adding a new
>>>> blk-mq elevator hook (e.g. ->request_prepared) to verify that each
>>>> blk-mq elevator enabled request did in fact get prepared.
>>>>
>>>> Bart, please test this patch and reply with your review/feedback.
>>>>
>>>> Jens, if you're OK with this solution please reply with your Ack and
>>>> I'll send it to Linus along with the rest of the handful of DM changes I
>>>> have for 4.14.
>>>
>>> I am not - we used to have this elevator change functionality from
>>> inside the kernel, and finally got rid of it when certain drivers killed
>>> it. I don't want to be bringing it back.
>>
>> Fine.
>
> BTW, while I conceded "Fine": I think your justification for not
> reintroducing elevator_change() lacks substance. What is inherently
> problematic about elevator_change()?

Because no in-kernel users should be mucking with the IO scheduler.
Adding this back is just an excuse for drivers to start doing it again,
which generally happens because whatever vendor's driver team tests some
synthetic benchmark and decides that X is better than the default of Y.
So we're not going back to that.

> Having an elevator attached to a DM multipath device's underlying path's
> request_queue just asks for trouble (especially given the blk-mq
> elevator interface).
>
> Please own this issue as a regression and help me arrive at a timely way
> forward.

I'm trying, I made suggestions on how we can proceed - we can have a way
to insert to hctx->dispatch without bothering the IO scheduler. I'm open
to other suggestions as well, just not open to exporting an interface to
change IO schedulers from inside the kernel.

-- 
Jens Axboe