On Tue, Sep 17, 2024 at 02:48:06PM +0200, Damien Le Moal wrote: > On 2024/09/17 14:33, Ming Lei wrote: > > On Tue, Sep 17, 2024 at 1:53 PM Christoph Hellwig <hch@xxxxxx> wrote: > >> > >> On Tue, Sep 17, 2024 at 02:32:58PM +0900, Damien Le Moal wrote: > >>> Commit 734e1a860312 ("block: Prevent deadlocks when switching > >>> elevators") introduced the function elv_iosched_load_module() to allow > >>> loading an elevator module outside of elv_iosched_store() with the > >>> target device queue not frozen, to avoid deadlocks. However, the "none" > >>> scheduler does not have a module and as a result, > >>> elv_iosched_load_module() always returns an error when trying to switch > >>> to this valid scheduler. > >>> > >>> Fix this by checking that the requested scheduler is "none" and doing > >>> nothing in that case. > >> > >> The old code before this commit simply ignored the request_module, > >> just as most callers of it do. I think that's the right approach > >> here as well. > > > > freeze queue is actually easy to cause deadlock, and old code is to not > > do it everywhere. > > > > Probably it may be more reliable to replace 'load_module' with 'no_freeze', > > and not to freeze queue in case that 'no_freeze' is set in queue_attr_store(). > > load_module or whatever the name you prefer, should NOT imply that freezing the > queue is not necessary. Switching the IO scheduler is really one such case. > Switching the scheduler really needs to be done with the queue frozen, but the > scheduler module loading needs to be done with the queue live. Here 'no_freeze' means that automatic 'freeze queue' isn't needed, or it can be named as 'no_auto_freeze'. Again, 'load_module' is one bad name from interface viewpoint, which is just needed by 'scheduler' only. > > > queue_wb_lat_store() need no_freeze too since there is GFP_KERNEL > > allocation involved. > > No, because again the attribute may need to have the queue frozen to correctly > be changed. To avoid hangs, what is needed is to force a GFP_NOIO context before > calling the attribute ->store() operation. Doing so, any memory allocation that > the attribute change may need will not cause re-entry into a frozen queue (which > would result in a hang). > > This is easy to do with memalloc_noio_save()/memalloc_noio_restore(). But why do we need that? Just for paper over the problem caused by the unnecessary freeze queue? Thanks, Ming