Re: [GIT PULL] Block fix for 6.11-final

Jens Axboe <axboe@xxxxxxxxx> · Thu, 12 Sep 2024 18:39:27 -0600

On 9/12/24 5:21 PM, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:44, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> Just a single fix for a deadlock issue that can happen if someone
>> attempts to change the root disk IO scheduler with a module that
>> requires loading from disk. Changing the scheduler freezes the queue
>> while that operation is happening, hence causing a deadlock.
> 
> Side note: I do think that doing the blk_mq_freeze_queue() outside the
> sysfs_lock mutex is also a mistake, and will deadlock if anybody then
> needs to do any IO (like a user space access) inside the sysfs_lock
> mutex somewhere else.
> 
> It wasn't what caused Jesper's problems, and maybe nothing actually
> does that, but it still looks rather questionable in
> queue_attr_store().
> 
> I mean, imagine holding q->sysfs_lock, and doing something as simple
> as just a memory allocation that wants to do swapping, but somebody
> else did that queue_attr_store(), which freezed the queues and is now
> waiting for the lock and won't unfreeze them until it gets it...
> 
> Yeah, yeah, very very unlikely to hit in real life, but still. Seems
> very wrong.

Yep I agree, it does feel like the wrong thing and could deadlock under
reclaim. I didn't get around to replying to the other one, but I didn't
want to risk changing the ordering right before release. Will do a
separate patch for testing.

-- 
Jens Axboe