On 02/14/2017 01:14 AM, Paolo Valente wrote: > >> Il giorno 14 feb 2017, alle ore 00:10, Jens Axboe <axboe@xxxxxxxxx> ha scritto: >> >> On 02/13/2017 03:28 PM, Jens Axboe wrote: >>> On 02/13/2017 03:09 PM, Omar Sandoval wrote: >>>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote: >>>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq, >>>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then >>>>> that scheduler is set and initialized without any check, driving the >>>>> system into an inconsistent state. This commit addresses this issue by >>>>> letting elevator_get fail for these wrong cross choices. >>>>> >>>>> Signed-off-by: Paolo Valente <paolo.valente@xxxxxxxxxx> >>>>> --- >>>>> block/elevator.c | 26 ++++++++++++++++++-------- >>>>> 1 file changed, 18 insertions(+), 8 deletions(-) >>>> >>>> Hey, Paolo, >>>> >>>> How exactly are you triggering this? In __elevator_change(), we do check >>>> for mq or not mq: >>>> >>>> if (!e->uses_mq && q->mq_ops) { >>>> elevator_put(e); >>>> return -EINVAL; >>>> } >>>> if (e->uses_mq && !q->mq_ops) { >>>> elevator_put(e); >>>> return -EINVAL; >>>> } >>>> >>>> We don't ever appear to call elevator_init() with a specific scheduler >>>> name, and for the default we switch off of q->mq_ops and use the >>>> defaults from Kconfig: >>>> >>>> if (q->mq_ops && q->nr_hw_queues == 1) >>>> e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false); >>>> else if (q->mq_ops) >>>> e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false); >>>> else >>>> e = elevator_get(CONFIG_DEFAULT_IOSCHED, false); >>>> >>>> if (!e) { >>>> printk(KERN_ERR >>>> "Default I/O scheduler not found. " \ >>>> "Using noop/none.\n"); >>>> e = elevator_get("noop", false); >>>> } >>>> >>>> So I guess this could happen if someone manually changed those Kconfig >>>> options, but I don't see what other case would make this happen, could >>>> you please explain? >>> >>> Was wondering the same - is it using the 'elevator=' boot parameter? >>> Didn't look at that path just now, but that's the only one I could >>> think of. If it is, I'd much prefer only using 'chosen_elevator' for >>> the non-mq stuff, and the fix should be just that instead. >>> >>> So instead of: >>> >>> if (!e && *chosen_elevator) { >>> >>> do >>> >>> if (!e && !q->mq_ops && && *chosen_elevator) { >> >> Confirmed, that's what it seems to be, and here's a real diff of the >> above example that works for me: >> >> diff --git a/block/elevator.c b/block/elevator.c >> index 27ff1ed5a6fa..699d10f71a2c 100644 >> --- a/block/elevator.c >> +++ b/block/elevator.c >> @@ -207,11 +207,12 @@ int elevator_init(struct request_queue *q, char *name) >> } >> >> /* >> - * Use the default elevator specified by config boot param or >> - * config option. Don't try to load modules as we could be running >> - * off async and request_module() isn't allowed from async. >> + * Use the default elevator specified by config boot param for >> + * non-mq devices, or by config option. > > I don't fully get this choice: being able to change the default I/O > scheduler through the command line has been rather useful for me, > saving me a lot of recompilations, and such a feature seems widespread > among (at least power) users. However, mine is of course just an > opinion, and I may be missing the main point also in this case. The problem with the elevator= boot parameter is that it applies across everything, which makes very little sense, since it's a per device setting. In retrospect, it was a mistake to add this parameter, and I don't want to continue down that path with blk-mq. Why aren't you just using online switching through sysfs? For normal users, typically this would be done through udev rules. -- Jens Axboe