Re: [PATCH 3/7] dm: handle failures in dm_table_set_restrictions

On 3/11/25 02:37, Benjamin Marzinski wrote:
> On Mon, Mar 10, 2025 at 08:25:39AM +0900, Damien Le Moal wrote:
>> On 3/10/25 07:28, Benjamin Marzinski wrote:
>>> If dm_table_set_restrictions() fails while swapping tables,
>>> device-mapper will continue using the previous table. It must be sure to
>>> leave the mapped_device in its previous state on failure.  Otherwise
>>> device-mapper could end up using the old table with settings from the
>>> unused table.
>>>
>>> Do not update the mapped device in dm_set_zones_restrictions(). Wait
>>> till after dm_table_set_restrictions() is sure to succeed to update the
>>> md zoned settings. Do the same with the dax settings, and if
>>> dm_revalidate_zones() fails, restore the original queue limits.
>>>
>>> Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>
>>> ---
>>>  drivers/md/dm-table.c | 24 ++++++++++++++++--------
>>>  drivers/md/dm-zone.c  | 26 ++++++++++++++++++--------
>>>  drivers/md/dm.h       |  1 +
>>>  3 files changed, 35 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
>>> index 0ef5203387b2..4003e84af11d 100644
>>> --- a/drivers/md/dm-table.c
>>> +++ b/drivers/md/dm-table.c
>>> @@ -1836,6 +1836,7 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
>>>  			      struct queue_limits *limits)
>>>  {
>>>  	int r;
>>> +	struct queue_limits old_limits;
>>>  
>>>  	if (!dm_table_supports_nowait(t))
>>>  		limits->features &= ~BLK_FEAT_NOWAIT;
>>> @@ -1862,16 +1863,11 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
>>>  	if (dm_table_supports_flush(t))
>>>  		limits->features |= BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA;
>>>  
>>> -	if (dm_table_supports_dax(t, device_not_dax_capable)) {
>>> +	if (dm_table_supports_dax(t, device_not_dax_capable))
>>>  		limits->features |= BLK_FEAT_DAX;
>>> -		if (dm_table_supports_dax(t, device_not_dax_synchronous_capable))
>>> -			set_dax_synchronous(t->md->dax_dev);
>>> -	} else
>>> +	else
>>>  		limits->features &= ~BLK_FEAT_DAX;
>>>  
>>> -	if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL))
>>> -		dax_write_cache(t->md->dax_dev, true);
>>> -
>>>  	/* For a zoned table, setup the zone related queue attributes. */
>>>  	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
>>>  	    (limits->features & BLK_FEAT_ZONED)) {
>>> @@ -1883,6 +1879,7 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
>>>  	if (dm_table_supports_atomic_writes(t))
>>>  		limits->features |= BLK_FEAT_ATOMIC_WRITES;
>>>  
>>> +	old_limits = q->limits;
>>
>> I am not sure this is safe to do like this since the user may be simultaneously
>> changing attributes, which would result in the old_limits struct being in an
>> inconsistent state. So shouldn't we hold q->limits_lock here? We probably want
>> a queue_limits_get() helper for that though.
>>
>>>  	r = queue_limits_set(q, limits);
>>
>> ...Or, we could modify queue_limits_set() to also return the old limit struct
>> under the q limits_lock. That may be easier.
> 
> If we disallow switching between zoned devices then this is unnecessary.
> Otherwise you're right. We do want to make sure that we don't grab the
> limits while something is updating the limits.
> 
> Unfortunately, thinking about this just made me realize a different
> problem, that has nothing to do with this patchset. bio-based devices
> can't handle freezing the queue while there are plugged zone write bios.
> So, for instance, if you do something like:
> 
> # modprobe scsi_debug dev_size_mb=512 zbc=managed zone_size_mb=128 zone_nr_conv=0 delay=20
> # dmsetup create test --table "0 1048576 crypt aes-cbc-essiv:sha256 deadbeefdeadbeefdeadbeefdeadbeef 0 /dev/sda 0"
> # dd if=/dev/zero of=/dev/mapper/test bs=1M count=128 &
> # echo 0 > /sys/block/dm-1/queue/iostats
> 
> you hang.

Jens just applied a patch series cleaning up locking around limits and queue
freeze vs locking ordering. So we should check if this issue is still there with
these patches. Will try to test that.
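
To make the concern about the unlocked "old_limits = q->limits" copy concrete,
here is a minimal user-space sketch (not kernel code) of the second option
above: installing the new limits and handing back the old ones under the same
lock. Everything in it is an assumption made for illustration: struct limits,
struct queue, limits_set_return_old(), revalidate() and apply_limits() are
hypothetical names, and a pthread mutex stands in for q->limits_lock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct queue_limits; fields are illustrative. */
struct limits {
	unsigned int max_sectors;
	unsigned int features;
};

/* Hypothetical stand-in for struct request_queue. */
struct queue {
	pthread_mutex_t limits_lock;
	struct limits limits;
};

/*
 * Install new limits and return the previous ones. Both the copy-out and
 * the update happen under limits_lock, so a concurrent updater cannot
 * leave the saved copy half old and half new.
 */
static void limits_set_return_old(struct queue *q,
				  const struct limits *new_lim,
				  struct limits *old_lim)
{
	assert(q && new_lim && old_lim);

	pthread_mutex_lock(&q->limits_lock);
	*old_lim = q->limits;
	q->limits = *new_lim;
	pthread_mutex_unlock(&q->limits_lock);
}

/* Stand-in for a later step that may fail, such as zone revalidation. */
static int revalidate(int should_fail)
{
	return should_fail ? -1 : 0;
}

/* Apply new limits; on a later failure, put the saved limits back. */
static int apply_limits(struct queue *q, const struct limits *new_lim,
			int should_fail)
{
	struct limits old, discard;

	limits_set_return_old(q, new_lim, &old);
	if (revalidate(should_fail) < 0) {
		limits_set_return_old(q, &old, &discard);
		return -1;
	}
	return 0;
}
```

With this shape the caller never reads the limits outside the lock, and the
failure path restores exactly what was in place when the new limits went in.
It does not address the queue freeze hang described above.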

-- 
Damien Le Moal
Western Digital Research



