Re: [PATCH 1/2] block: avoid to hold q->limits_lock across APIs for atomic update queue limits

Nilay Shroff <nilay@xxxxxxxxxxxxx> · Thu, 19 Dec 2024 12:46:29 +0530

On 12/19/24 11:50, Christoph Hellwig wrote:
> On Wed, Dec 18, 2024 at 06:57:45AM -0800, Damien Le Moal wrote:
>>> Yeah agreed but I see sd_revalidate_disk() is probably the only exception 
>>> which allocates the blk-mq request. Can't we fix it? 
>>
>> If we change where limits_lock is taken now, we will again introduce races
>> between user config and discovery/revalidation, which is what
>> queue_limits_start_update() and queue_limits_commit_update() intended to fix in
>> the first place.
>>
>> So changing sd_revalidate_disk() is not the right approach.
> 
> Well, sd_revalidate_disk is a bit special in that it needs a command
> on the same queue to query the information.  So it needs to be able
> to issue commands without the queue frozen.  Freezing the queue inside
> the limits lock support that, sd just can't use the convenience helpers
> that lock and freeze.
> 
>> This is overly complicated ... As I suggested, I think that a simpler approach
>> is to call blk_mq_freeze_queue() and blk_mq_unfreeze_queue() inside
>> queue_limits_commit_update(). Doing so, no driver should need to directly call
>> freeze/unfreeze. But that would be a cleanup. Let's first fix the few instances
>> that have the update/freeze order wrong. As mentioned, the pattern simply needs
> 
> Yes, the queue only needs to be frozen for the actual update,
> which would remove the need for the locking.  The big question for both
> variants is if we can get rid of all the callers that have the queue
> already frozen and then start an update.
> 
Yes, agreed that in most cases we only need the queue to be frozen while 
committing the update. However, we do have a few call sites (in the NVMe 
driver) where the queue is frozen before the update is even started, and 
looking at those call sites it seems that we probably do need the freeze 
there. One example from the NVMe driver:

nvme_update_ns_info_block()
{
    ...
    ...

    blk_mq_freeze_queue(ns->disk->queue);
    ns->head->lba_shift = id->lbaf[lbaf].ds;
    ns->head->nuse = le64_to_cpu(id->nuse);
    capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));

    lim = queue_limits_start_update(ns->disk->queue);
    ...
    ...
    queue_limits_commit_update(ns->disk->queue, &lim);
    ...
    set_capacity_and_notify(ns->disk, capacity);
    ...
    set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
    set_bit(NVME_NS_READY, &ns->flags);
    blk_mq_unfreeze_queue(ns->disk->queue);
    ...
}

So, looking at the above example, I earlier proposed freezing the queue 
in queue_limits_start_update() and unfreezing it in 
queue_limits_commit_update(). In the above code we could then replace 
blk_mq_freeze_queue() with queue_limits_start_update() and 
blk_mq_unfreeze_queue() with queue_limits_commit_update(), and get rid 
of the original call sites of the start/commit update APIs. Having said 
that, I am open to any better suggestions. One such suggestion is 
Damien's: call blk_mq_freeze_queue() and blk_mq_unfreeze_queue() inside 
queue_limits_commit_update(). But then I wonder how we would fix call 
sites like the one shown above with that approach.
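
To make the first proposal concrete, here is a rough userspace model of it. 
This is not kernel code: the structs, the freeze counter, the boolean that 
stands in for q->limits_lock, and the caller update_ns_info_model() are all 
stand-ins invented for illustration. The order of freeze vs. lock inside 
queue_limits_start_update() is also just an assumption of this sketch. It 
only shows the shape of a caller once start/commit take over the 
freeze/unfreeze.

```c
/* Userspace model only, NOT kernel code: every type and body is a stand-in. */
#include <assert.h>
#include <stdbool.h>

struct queue_limits {
    unsigned int logical_block_size;
};

struct request_queue {
    struct queue_limits limits;
    int freeze_depth;    /* models nested blk_mq_freeze_queue() calls */
    bool limits_locked;  /* models q->limits_lock being held */
};

static void blk_mq_freeze_queue(struct request_queue *q)
{
    q->freeze_depth++;
}

static void blk_mq_unfreeze_queue(struct request_queue *q)
{
    assert(q->freeze_depth > 0);
    q->freeze_depth--;
}

/*
 * Proposed behaviour (assumed ordering): freeze the queue, take the
 * limits lock, and hand back a snapshot of the current limits.
 */
static struct queue_limits queue_limits_start_update(struct request_queue *q)
{
    blk_mq_freeze_queue(q);
    assert(!q->limits_locked);
    q->limits_locked = true;
    return q->limits;
}

/* Proposed behaviour: apply the new limits, drop the lock, unfreeze. */
static int queue_limits_commit_update(struct request_queue *q,
                                      struct queue_limits *lim)
{
    assert(q->limits_locked && q->freeze_depth > 0);
    q->limits = *lim;
    q->limits_locked = false;
    blk_mq_unfreeze_queue(q);
    return 0;
}

/* Hypothetical caller shaped like the nvme example above. */
static int update_ns_info_model(struct request_queue *q, unsigned int lba_shift)
{
    /* Sits where blk_mq_freeze_queue() used to be called. */
    struct queue_limits lim = queue_limits_start_update(q);

    /* Everything that must run with the queue frozen goes here. */
    lim.logical_block_size = 1U << lba_shift;

    /* Sits where blk_mq_unfreeze_queue() used to be called. */
    return queue_limits_commit_update(q, &lim);
}
```

In this model the caller no longer freezes or unfreezes explicitly, and 
whatever previously ran between the explicit freeze and unfreeze now runs 
between start and commit.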

Thanks,
--Nilay