On Mon, May 20, 2024 at 08:45:28PM -0400, Mike Snitzer wrote:
> On Mon, May 20, 2024 at 06:03:11PM -0400, Mike Snitzer wrote:
> > On Mon, May 20, 2024 at 10:12:37PM +0200, Christoph Hellwig wrote:
> > > On Mon, May 20, 2024 at 01:17:46PM -0400, Mike Snitzer wrote:
> > > > Doubt there was anything in fstests setting max discard user limit
> > > > (max_user_discard_sectors) in Ted's case. blk_set_stacking_limits()
> > > > sets max_user_discard_sectors to UINT_MAX, so given the use of
> > > > min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors) I
> > > > suspect blk_stack_limits() stacks up max_discard_sectors to match the
> > > > underlying storage's max_hw_discard_sectors.
> > > >
> > > > And max_hw_discard_sectors exceeds BIO_PRISON_MAX_RANGE, resulting in
> > > > dm_cell_key_has_valid_range() triggering on:
> > > > WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE)
> > >
> > > Oh, that makes more sense.
> > >
> > > I think you just want to set the max_hw_discard_sectors limit before
> > > stacking in the lower device limits so that they can only lower it.
> > >
> > > (and in the long run we should just stop stacking the limits except
> > > for request based dm which really needs it)
> >
> > This is what I staged, I cannot send a patch out right now..
> >
> > Ted if you need the patch in email (rather than from linux-dm.git) I
> > can send it later tonight.
> > Please see:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-6.10&id=825d8bbd2f32cb229c3b6653bd454832c3c20acb
>
> From: Mike Snitzer <snitzer@xxxxxxxxxx>
> Date: Mon, 20 May 2024 13:34:06 -0400
> Subject: [PATCH] dm: always manage discard support in terms of max_hw_discard_sectors
>
> Commit 4f563a64732d ("block: add a max_user_discard_sectors queue
> limit") changed block core to set max_discard_sectors to:
> min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors)
>
> Since commit 1c0e720228ad ("dm: use queue_limits_set") it was reported
> dm-thinp was failing in a few fstests (generic/347 and generic/405)
> with the first WARN_ON_ONCE in dm_cell_key_has_valid_range() being
> reported, e.g.:
> WARNING: CPU: 1 PID: 30 at drivers/md/dm-bio-prison-v1.c:128 dm_cell_key_has_valid_range+0x3d/0x50
>
> blk_set_stacking_limits() sets max_user_discard_sectors to UINT_MAX,
> so given how block core now sets max_discard_sectors (detailed above)
> it follows that blk_stack_limits() stacks up the underlying device's
> max_hw_discard_sectors and max_discard_sectors is set to match it. If
> max_hw_discard_sectors exceeds dm's BIO_PRISON_MAX_RANGE, then
> dm_cell_key_has_valid_range() will trigger the warning with:
> WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE)
>
> Aside from this warning, the discard will fail. Fix this and other DM
> issues by governing discard support in terms of max_hw_discard_sectors
> instead of max_discard_sectors.
>
> Reported-by: Theodore Ts'o <tytso@xxxxxxx>
> Fixes: 1c0e720228ad ("dm: use queue_limits_set")
> Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>

With this patch applied, I verified xfstests generic/347 and generic/405
no longer trigger the dm_cell_key_has_valid_range WARN_ON_ONCE. I'm
sending the fix to Linus now.
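
For anyone following along without the block layer source open, here is a
rough userspace sketch of the interaction described above. It is not the
kernel code: struct demo_limits and every demo_* helper are hypothetical
stand-ins, the stacking step is reduced to a single min(), and the cap
values in the usage note are made up. It only assumes what the thread
states, namely that the stacking defaults start at UINT_MAX and that
max_discard_sectors ends up as min(max_hw_discard_sectors,
max_user_discard_sectors).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the discard fields of the queue limits. */
struct demo_limits {
	unsigned int max_hw_discard_sectors;
	unsigned int max_user_discard_sectors;
	unsigned int max_discard_sectors;
};

static inline unsigned int demo_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Start from "no limit", mirroring the UINT_MAX default named in the thread. */
static inline void demo_set_stacking_limits(struct demo_limits *lim)
{
	assert(lim);
	lim->max_hw_discard_sectors = UINT_MAX;
	lim->max_user_discard_sectors = UINT_MAX;
	lim->max_discard_sectors = UINT_MAX;
}

/*
 * Simplified stacking: the lower device can only lower the top-level
 * hardware limit, and the effective limit is min(hw, user).
 */
static inline void demo_stack_limits(struct demo_limits *top,
				     const struct demo_limits *bottom)
{
	top->max_hw_discard_sectors = demo_min(top->max_hw_discard_sectors,
					       bottom->max_hw_discard_sectors);
	top->max_discard_sectors = demo_min(top->max_hw_discard_sectors,
					    top->max_user_discard_sectors);
}

/*
 * Effective max_discard_sectors of a stacked device. A non-zero target_cap
 * models the target setting its own max_hw_discard_sectors before stacking;
 * zero models the target setting nothing.
 */
static inline unsigned int demo_effective_discard(unsigned int target_cap,
						  unsigned int lower_hw)
{
	struct demo_limits top, bottom;

	demo_set_stacking_limits(&top);
	demo_set_stacking_limits(&bottom);
	bottom.max_hw_discard_sectors = lower_hw;

	if (target_cap)
		top.max_hw_discard_sectors = target_cap;

	demo_stack_limits(&top, &bottom);
	return top.max_discard_sectors;
}
```

With a made-up lower-device limit of 4194304 sectors,
demo_effective_discard(0, 4194304) returns the lower device's value
unchanged, which is the situation that trips the range check, while
demo_effective_discard(131072, 4194304) stays at the target's own cap.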