On Wed, May 22, 2024 at 12:48:59PM -0400, Mike Snitzer wrote:
> On Wed, May 22, 2024 at 04:24:58PM +0200, Christoph Hellwig wrote:
> > On Tue, May 21, 2024 at 10:51:17PM -0400, Mike Snitzer wrote:
> > > Otherwise, blk_validate_limits() will throw-away the max_sectors that
> > > was stacked from underlying device(s). In doing so it can set a
> > > max_sectors limit that violates underlying device limits.
> >
> > Hmm, yes it sort of is "throwing the limit away", but it really
> > recalculates it from max_hw_sectors, max_dev_sectors and user_max_sectors.
>
> Yes, but it needs to do that recalculation at each level of a stacked
> device. And then we need to combine them via blk_stack_limits() -- as
> is done with the various limits stacking loops in
> drivers/md/dm-table.c:dm_calculate_queue_limits
>
> > > This caused dm-multipath IO failures like the following because the
> > > underlying devices' max_sectors were stacked up to be 1024, yet
> > > blk_validate_limits() defaulted max_sectors to BLK_DEF_MAX_SECTORS_CAP
> > > (2560):
> >
> > I suspect the problem is that SCSI messed directly with max_sectors instead
> > and ignores max_user_sectors (and really shouldn't touch either, but that's
> > a separate discussion). Can you try the patch below and maybe also provide
> > the sysfs output for max_sectors_kb and max_hw_sectors_kb for all involved
> > devices?
>
> FYI, you can easily reproduce with:

Running this (with your suggested edits) on Linus' current tree
(commit c760b3725e52403dc1b28644fb09c47a83cacea6) doesn't show any
failure even after dozens of runs.

What am I doing wrong?
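
For anyone following the quoted exchange, the interaction between stacking
and recalculation can be modelled in user space. This is only a rough
sketch, not kernel code: struct model_limits, model_validate(),
model_stack() and model_demo() are hypothetical names, the max_hw_sectors
value of 65535 for the paths is an assumed figure, and the model
deliberately does not stack max_user_sectors. Only the 1024 and 2560
values come from the thread.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical user-space model, NOT the kernel's struct queue_limits. */
struct model_limits {
	unsigned int max_hw_sectors;
	unsigned int max_dev_sectors;
	unsigned int max_user_sectors;
	unsigned int max_sectors;
};

/* Default cap quoted in the thread for BLK_DEF_MAX_SECTORS_CAP. */
#define MODEL_DEF_MAX_SECTORS_CAP 2560u

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Minimum of two values, treating 0 as "not set". */
static unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return min_u(a, b);
}

/* Recalculate max_sectors from the hw, dev and user limits only. */
static void model_validate(struct model_limits *lim)
{
	unsigned int hw = min_not_zero_u(lim->max_hw_sectors,
					 lim->max_dev_sectors);

	if (lim->max_user_sectors)
		lim->max_sectors = min_u(hw, lim->max_user_sectors);
	else
		lim->max_sectors = min_u(hw, MODEL_DEF_MAX_SECTORS_CAP);
}

/*
 * Fold one underlying device into the stacked device.
 * Assumption of this model: max_user_sectors is not stacked.
 */
static void model_stack(struct model_limits *top,
			const struct model_limits *bottom)
{
	top->max_hw_sectors = min_not_zero_u(top->max_hw_sectors,
					     bottom->max_hw_sectors);
	top->max_dev_sectors = min_not_zero_u(top->max_dev_sectors,
					      bottom->max_dev_sectors);
	top->max_sectors = min_not_zero_u(top->max_sectors,
					  bottom->max_sectors);
}

/*
 * Stack two identical paths, then recalculate on the stacked device.
 * Stores the stacked max_sectors and returns the recalculated one.
 */
unsigned int model_demo(unsigned int *stacked_max_sectors)
{
	/* Path whose max_sectors was set directly, with no user limit. */
	struct model_limits path = {
		.max_hw_sectors = 65535u,	/* assumed value */
		.max_dev_sectors = 0,
		.max_user_sectors = 0,
		.max_sectors = 1024u,		/* value from the thread */
	};
	/* Stacked device starts with no restrictions of its own. */
	struct model_limits top = {
		.max_hw_sectors = UINT_MAX,
		.max_dev_sectors = UINT_MAX,
		.max_user_sectors = 0,
		.max_sectors = UINT_MAX,
	};

	model_stack(&top, &path);
	model_stack(&top, &path);
	assert(top.max_sectors == 1024u);	/* stacked limit is honoured */
	*stacked_max_sectors = top.max_sectors;

	model_validate(&top);	/* recalculation ignores the stacked value */
	return top.max_sectors;
}
```

Under these assumed inputs, model_demo() stores 1024 as the stacked value
and returns 2560 after recalculation, i.e. a limit larger than what the
paths advertise. Whether a given tree actually behaves this way depends on
which limits the low-level driver sets, which is why the sysfs values for
the real devices matter.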