On Thu, 2024-04-18 at 09:42 +0200, Hannes Reinecke wrote:
> On 4/18/24 09:03, Christoph Hellwig wrote:
> > On Thu, Apr 18, 2024 at 09:00:15AM +0200, Hannes Reinecke wrote:
> > > max_sectors can be modified via sysfs, but only in kb units.
> >
> > Yes.
> >
> > > Which leads to a misalignment on stacked devices if the original
> > > max_sector size is an odd number.
> >
> > How?
> >
>
> That's an issue we've been seeing during testing:
> https://lore.kernel.org/dm-devel/7742003e19b5a49398067dc0c59dfa8ddeffc3d7.camel@xxxxxxxx/
>
> While this can be fixed in userspace (Martin Wilck provided a
> patchset to multipath-tools), what I find irritating is that we will
> always display the max_sectors setting in kb, even if the actual
> value is not kb aligned.
> _And_ we allow to modify that value (again in kb units). Which means
> that we _never_ are able to reset it to its original value.

User space has no way to determine whether the actual max_sectors value
in the kernel is even or odd. By reading max_sectors_kb and writing it
back, one may actually change the kernel-internal value by rounding it
down to the next even number. This can cause breakage if the device
being changed is a multipath path device.

Wrt Hannes' patch: It would fix the issue on the kernel side, but user
space would still have no means to determine whether this patch has
been applied or not, except for checking the kernel version, which is
unreliable. For user space, it would be more helpful to add a
"max_sectors" sysfs attribute that exposes the actual value in blocks.

> > Note that we really should not stack max_sectors anyway, as it's
> > only used for splitting in the lower device to start with.
>
> If that's the case, why don't we inhibit the modification for
> max_sectors on the lower devices?

I vote for allowing changes to max_sectors_kb only for devices that
don't have anything stacked on top, even though my recent testing
indicates that it's only a real problem with request-based dm, aka
multipath. After all, max_sectors only needs to be manipulated in rare
situations, and it would generally be recommended to do this in a udev
rule early during boot.

Regards,
Martin
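
To illustrate the rounding described above, here is a minimal sketch in
Python. It is only a model, under the assumption that reading
max_sectors_kb yields the sector count shifted right by one and that
writing it stores the written value shifted left by one. The function
names are made up for this example, and the kernel's range checks are
left out.

```python
# Simplified model of the max_sectors <-> max_sectors_kb conversion.
# Assumption: sectors are 512 bytes, so 2 sectors == 1 KiB.
# All function names here are hypothetical, chosen for illustration only.

def sectors_to_kb(max_sectors: int) -> int:
    """Value user space would read: sectors halved, remainder dropped."""
    return max_sectors >> 1


def kb_to_sectors(max_sectors_kb: int) -> int:
    """Value the kernel would store after a write: KiB doubled."""
    return max_sectors_kb << 1


def read_then_write_back(max_sectors: int) -> int:
    """Read the kb value and write the same number back unchanged."""
    return kb_to_sectors(sectors_to_kb(max_sectors))


if __name__ == "__main__":
    for original in (2048, 2047, 255):
        shown_kb = sectors_to_kb(original)
        after = read_then_write_back(original)
        status = "unchanged" if after == original else "CHANGED"
        print(f"max_sectors={original} shown as {shown_kb} kb, "
              f"after write-back: {after} ({status})")
```

In this model an even value such as 2048 survives the round trip, while
an odd value such as 2047 comes back as 2046. Both are shown as a plain
kb number (1024 and 1023), so user space cannot tell from the attribute
alone whether the kernel value was odd.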