Re: [PATCH] block: elevator: avoid to load iosched module from this disk

Damien Le Moal <dlemoal@xxxxxxxxxx> · Sun, 8 Sep 2024 09:02:51 +0900

On 9/7/24 20:14, Richard W.M. Jones wrote:
> On Sat, Sep 07, 2024 at 07:02:30PM +0800, Ming Lei wrote:
>> BTW, the issue can be reproduced 100% by:
>>
>>     echo "deadlock" > /sys/block/$ROOT_DISK/queue/scheduler

This probably should be:

echo "mq-deadline" > /sys/block/$ROOT_DISK/queue/scheduler

and make sure that:
1) mq-deadline is compiled as a module
2) mq-deadline is not already used by a device (so not loaded already)
3) The mq-deadline module file is stored on the device whose scheduler is
being changed
4) The mq-deadline module file is not already cached in the page cache.

For (4), you may want to do an "echo 3 > /proc/sys/vm/drop_caches" before trying
to switch the scheduler.
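
As a rough sketch (not something I have run as part of this thread), condition
(2) can be checked with a couple of shell helpers before attempting the switch.
The function names, and the SYSFS_ROOT and PROC_MODULES overrides, are
hypothetical and only there so the helpers can be tried against a fake tree.
Conditions (1), (3) and (4) are not covered here and still need to be checked
by hand.

```shell
#!/bin/sh
# Sketch only: helper checks for condition (2) above.
# SYSFS_ROOT and PROC_MODULES are hypothetical overrides (assumptions, not
# kernel interfaces); they default to the real /sys and /proc/modules.

# Succeeds if scheduler $1 is the active one (shown as "[name]") on any disk.
sched_in_use() {
    for f in "${SYSFS_ROOT:-/sys}"/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        grep -qF "[$1]" "$f" && return 0
    done
    return 1
}

# Succeeds if module $1 is currently loaded. /proc/modules spells module
# names with underscores, so mq-deadline appears there as mq_deadline.
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    grep -q "^${name} " "${PROC_MODULES:-/proc/modules}"
}

# Prints "in-use", "loaded", or "ready" depending on which check trips first.
repro_precheck() {
    if sched_in_use "$1"; then
        echo "in-use"
    elif module_loaded "$1"; then
        echo "loaded"
    else
        echo "ready"
    fi
}
```

With these defined, "repro_precheck mq-deadline" should print "ready" before
you drop the caches and write to the scheduler attribute. Note that a built-in
mq-deadline also never shows up in /proc/modules, so "ready" says nothing
about (1).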

> 
> That doesn't reproduce it for me (reliably).  Although I'm not
> surprised as this bug has been _very_ tricky to reproduce!  Sometimes
> I think I have a definite reproducer, only for it to go away when some
> tiny detail changes.
> 
>>> This seems like the neatest (or shortest) fix so far, but doesn't it
>>> "mix up layers" by checking elv_iosched_store?
>>
>> It is just one exception for 'scheduler' sysfs attribute wrt. freezing
>> queue for storing, and the check can be done via the attribute
>> name("scheduler") too.
> 
> Fair enough.
> 
> Rich.
> 

-- 
Damien Le Moal
Western Digital Research