On 2024/09/17 14:33, Ming Lei wrote: > On Tue, Sep 17, 2024 at 1:53 PM Christoph Hellwig <hch@xxxxxx> wrote: >> >> On Tue, Sep 17, 2024 at 02:32:58PM +0900, Damien Le Moal wrote: >>> Commit 734e1a860312 ("block: Prevent deadlocks when switching >>> elevators") introduced the function elv_iosched_load_module() to allow >>> loading an elevator module outside of elv_iosched_store() with the >>> target device queue not frozen, to avoid deadlocks. However, the "none" >>> scheduler does not have a module and as a result, >>> elv_iosched_load_module() always returns an error when trying to switch >>> to this valid scheduler. >>> >>> Fix this by checking that the requested scheduler is "none" and doing >>> nothing in that case. >> >> The old code before this commit simply ignored the request_module, >> just as most callers of it do. I think that's the right approach >> here as well. > > freeze queue is actually easy to cause deadlock, and old code is to not > do it everywhere. > > Probably it may be more reliable to replace 'load_module' with 'no_freeze', > and not to freeze queue in case that 'no_freeze' is set in queue_attr_store(). load_module or whatever the name you prefer, should NOT imply that freezing the queue is not necessary. Switching the IO scheduler is really one such case. Switching the scheduler really needs to be done with the queue frozen, but the scheduler module loading needs to be done with the queue live. > queue_wb_lat_store() need no_freeze too since there is GFP_KERNEL > allocation involved. No, because again the attribute may need to have the queue frozen to correctly be changed. To avoid hangs, what is needed is to force a GFP_NOIO context before calling the attribute ->store() operation. Doing so, any memory allocation that the attribute change may need will not cause re-entry into a frozen queue (which would result in a hang). This is easy to do with memalloc_noio_save()/memalloc_noio_restore(). > > Thanks, > -- Damien Le Moal Western Digital Research