On Thu, Apr 22, 2021 at 09:07:15AM +0000, Martin Wilck wrote:
> On Wed, 2021-04-21 at 22:46 -0400, Martin K. Petersen wrote:
> >
> > Martin,
> >
> > > Hm, it sounds intriguing, but it has issues in its own right. For
> > > years to come, user space will have to probe whether these attributes
> > > exist, and fall back to the current ones ("wwid", "vpd_pg83")
> > > otherwise. So user space can't be simplified any time soon. Speaking
> > > for an important user space consumer of WWIDs (multipathd), I doubt
> > > that this would improve matters for us. We'd be happy if the kernel
> > > could just pick the "best" designator for us. But I understand that
> > > the kernel can't guarantee a good choice (user space can't either).
> >
> > But user space can be adapted at runtime to pick one designator over
> > the other (ha!).
>
> And that's exactly the problem. Effectively, all user space relies on
> udev today, because that's where this "adaptation" is taking place. It
> happens
>
> 1) either in systemd's scsi_id built-in
>    (https://github.com/systemd/systemd/blob/7feb1dd6544d1bf373dbe13dd33cf563ed16f891/src/udev/scsi_id/scsi_serial.c#L37)
> 2) or in the udev rules coming with sg3_utils
>    (https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)
>
> 1) is just as opaque and un-"adaptable" as the kernel, and the logic is
> suboptimal. 2) is of course "adaptable", but that's a problem in
> practice, if udev fails to provide a WWID. multipath-tools go through
> various twists for this case to figure out "fallback" WWIDs, guessing
> whether that "fallback" matches what udev would have returned if it had
> worked.
>
> That's the gist of it - the general frustration about udev among some
> of its heaviest users (talk to the LVM2 maintainers).
>
> I suppose 99.9% of users never bother with customizing the udev rules.
> IOW, these users might as well just use a kernel-provided value.
> But the remaining 0.1% causes headaches for user-space applications,
> which can't make solid assumptions about the rules. Thus, in a way, the
> flexibility of the rules does more harm than it helps.
>
> > We could do that in the kernel too, of course, but I'm afraid what
> > the resulting BLIST changes would end up looking like over time.
>
> That's something we want to avoid, sure.
>
> But we can actually combine both approaches. If "wwid" yields a good
> value most of the time (which is true IMO), we could make user space
> rely on it by default, and make it possible to set a udev property
> (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> differently. User-space apps like multipath could check the ID_LEGACY
> property to determine whether or not reading the "wwid" attribute would
> be consistent with udev. That would simplify matters a lot for us (Ben,
> do you agree?), without the need of adding endless BLIST entries.
>

Yeah, as long as ID_LEGACY was changed in a careful manner, so WWIDs
didn't simply change without warning because of an upgrade, a path out
of this complexity is definitely helpful.

-Ben

> > I am also very concerned about changing what the kernel currently
> > exports in a given variable like "wwid". A seemingly innocuous change
> > to the reported value could lead to a system no longer booting after
> > updating the kernel.
>
> AFAICT, no major distribution uses "wwid" for this purpose (yet). I
> just recently realized that the kernel's ALUA code refers to it. (*)
>
> In a recent discussion with Hannes, the idea came up that the priority
> of "SCSI name string" designators should actually depend on their
> subtype. "naa." name strings should map to the respective NAA
> descriptors, and "eui.", likewise (only "iqn." descriptors have no
> binary counterpart; we thought they should rather be put below NAA,
> prio-wise).
>
> I wonder if you'd agree with a change made that way for "wwid". I
> suppose you don't.
> I'd then propose to add a new attribute following
> this logic. It could simply be an additional attribute with a different
> name. Or this new attribute could be a property of the block device
> rather than the SCSI device, like NVMe does it
> (/sys/block/nvme0n2/wwid).
>
> I don't like the idea of having separate sysfs attributes for
> designators of different types, that's impractical for user space.
>
> > But taking a step back: Other than "it's not what userland currently
> > does", what specifically is the problem with designator_prio()? We've
> > picked the priority list once and for all. If we promise never to
> > change it, what is the issue?
>
> If the prioritization in kernel and user space was the same, we could
> migrate away from udev more easily without risking boot failure.
>
> Thanks,
> Martin
>
> (*) which is an argument for using "wwid" in user space too - just to
> be consistent with the kernel's internal logic.
>
> --
> Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
> SUSE Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix Imendörffer
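
For illustration only, here is a rough sketch of how a user-space consumer
might apply the ID_LEGACY idea discussed above. It is not taken from
multipath-tools or from any existing udev rule. ID_LEGACY exists only as a
proposal in this thread. Using ID_SERIAL as the rule-derived value in the
legacy case, and the function names choose_wwid() and sysfs_wwid_reader(),
are assumptions made for the example.

```python
import os
from typing import Callable, Mapping, Optional


def sysfs_wwid_reader(sysfs_device_dir: str) -> Callable[[], Optional[str]]:
    """Return a callable that reads "<sysfs_device_dir>/wwid".

    The directory is supplied by the caller (for a SCSI disk it would be
    something like /sys/block/sda/device); nothing is probed here.
    """
    def read() -> Optional[str]:
        try:
            with open(os.path.join(sysfs_device_dir, "wwid")) as f:
                return f.read()
        except OSError:
            # Attribute missing or unreadable: report "no value".
            return None
    return read


def choose_wwid(
    udev_props: Mapping[str, str],
    read_sysfs_wwid: Callable[[], Optional[str]],
) -> Optional[str]:
    """Pick a WWID according to the (hypothetical) ID_LEGACY property."""
    if udev_props.get("ID_LEGACY") == "1":
        # Legacy mode: only the value computed by the udev rules counts.
        # ASSUMPTION: that value is exported as ID_SERIAL.
        return udev_props.get("ID_SERIAL") or None
    # Default mode: rely on the kernel-provided "wwid" attribute.
    wwid = read_sysfs_wwid()
    wwid = wwid.strip() if wwid else ""
    return wwid or None
```

In legacy mode the sketch deliberately returns nothing when udev provided
no value, rather than guessing a fallback from sysfs, since guessing what
udev would have returned is exactly the complexity described above.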