On Tue, May 21, 2024 at 10:51:17PM -0400, Mike Snitzer wrote:
> Otherwise, blk_validate_limits() will throw-away the max_sectors that
> was stacked from underlying device(s). In doing so it can set a
> max_sectors limit that violates underlying device limits.

Hmm, yes it sort of is "throwing the limit away", but it really
recalculates it from max_hw_sectors, max_dev_sectors and max_user_sectors.

>
> This caused dm-multipath IO failures like the following because the
> underlying devices' max_sectors were stacked up to be 1024, yet
> blk_validate_limits() defaulted max_sectors to BLK_DEF_MAX_SECTORS_CAP
> (2560):

I suspect the problem is that SCSI messes directly with max_sectors and
ignores max_user_sectors (and really shouldn't touch either, but that's
a separate discussion).  Can you try the patch below and maybe also
provide the sysfs output for max_sectors_kb and max_hw_sectors_kb for
all involved devices?

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 332eb9dac22d91..f6c822c9cbd2d3 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3700,8 +3700,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	 */
 	if (sdkp->first_scan ||
 	    q->limits.max_sectors > q->limits.max_dev_sectors ||
-	    q->limits.max_sectors > q->limits.max_hw_sectors)
+	    q->limits.max_sectors > q->limits.max_hw_sectors) {
 		q->limits.max_sectors = rw_max;
+		q->limits.max_user_sectors = rw_max;
+	}
 
 	sdkp->first_scan = 0;
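
For illustration only, here is a rough userspace model of the
recalculation mentioned above.  It is a sketch, not the kernel code: the
struct and helper names (limits_model, recalc_max_sectors, min_u,
min_not_zero_u) are made up for this example, and it assumes the
recalculation takes the smaller of the hardware and device limits and
then clamps to max_user_sectors if that is set, or to the default cap
(2560, as quoted above) otherwise.

```c
#include <stdio.h>

/* Assumed value: the thread quotes BLK_DEF_MAX_SECTORS_CAP as 2560. */
#define DEF_MAX_SECTORS_CAP 2560u

/* Hypothetical, trimmed-down stand-in for the queue limits. */
struct limits_model {
	unsigned int max_hw_sectors;
	unsigned int max_dev_sectors;	/* 0: device reported no limit */
	unsigned int max_user_sectors;	/* 0: nothing recorded */
	unsigned int max_sectors;
};

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Smaller of the two values, treating 0 as "no limit". */
static unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return min_u(a, b);
}

/*
 * Recompute max_sectors from the other three fields.  Whatever value
 * max_sectors held before (e.g. a stacked 1024) is not consulted.
 */
static void recalc_max_sectors(struct limits_model *lim)
{
	unsigned int hw = min_not_zero_u(lim->max_hw_sectors,
					 lim->max_dev_sectors);

	if (lim->max_user_sectors)
		lim->max_sectors = min_u(hw, lim->max_user_sectors);
	else
		lim->max_sectors = min_u(hw, DEF_MAX_SECTORS_CAP);
}
```

In this model a max_sectors of 1024 with max_user_sectors left at 0 is
replaced by the default cap, while recording 1024 in max_user_sectors
keeps it, which is the effect the patch is going for.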