Jason Gunthorpe wrote:
[..]
> > I am warming to your assertion that there is a wide array of
> > vendor-specific configuration and debug that are not an efficient use of
> > upstream's time to wrap in a shared Linux ABI. I want to explore fwctl
> > for CXL for that use case, I personally don't want to marshal a Linux
> > command to each vendor's slightly different backend CXL toggles.
>
> Personally I think this idea to marshal/unmarshal everything in the
> kernel is often misguided. If it is truly obvious and actually shared
> multi-vendor capability then by all means go and do it.
>
> But if you are spending weeks/months fighting about uAPI because all
> the vendors are so different, it isn't obvious what is "generic" then
> you've probably already lost. The very worst outcome is a per-device
> uAPI masquerading as an obfuscated "generic" uAPI that wasted ages of
> people's time to argue out.

Certainly once you have gotten to the "months of arguing" point it begs
the question "was there really any generic benefit to reap in the first
place?" That said, *some* grappling, especially when multiple vendors hit
the list with a similar feature at the same time, has yielded
collaboration in the past. So I might be a few rungs back on the spectrum
from where you are, but I concede that yes, there is a point of
diminishing to negative returns.

> > At the same time, I also agree with the contention that a "do anything
> > you want and get away with it" tunnel invites shenanigans from folks
> > that may not care about the long term health of the Linux kernel vs
> > their short term interests.
>
> IMHO this is disproven by history. The above mstflint I linked to is
> as old as mlx5 HW, it runs today over PCI config space and an OOT
> driver. Where is the real damage to the long term health of Linux or
> the ecosystem?
>
> Like I said before I view there is a difference between DRM wanting a
> Vulkan stack and doing some device specific
> configuration/debugging.
> One has vastly more open source value than
> the other.

Fair.

> > So my questions to try to understand the specific sticking points more
> > are:
> >
> > 1/ Can you think of a Command Effect that the device could enumerate to
> > address the specific shenanigans that netdev is worried about?
>
> Nothing comes to mind..

Ugh, that indeed seems too severe.

> > In other words if every command a device enables has the stated
> > effect of "Configuration Change after Reset" does that cut out a
> > significant portion of the concern?
>
> Related to configuration - one of Saeed's original ideas was to
> way that mlx5 could implement all of its options, ideally with
> configurables discovered dynamically from the running device. This LPC
> presentation was so aggressively rejected by Jakub that Saeed abandoned
> it. In the discussion it was clear Jakub is requesting to review and
> possibly reject every configurable.
> between "netdev is the gatekeeper for all FLASH configurables" and
> "devices can be fully configured regardless of their design".

This gets back to the unspoken conceit of the kernel restriction that I
mentioned earlier. At some point the kernel restriction begets a cynical
in-tree workaround or an out-of-tree workaround which either way means
upstream Linux loses.

> > 2/ About the "what if the device lies?" question. We can't revert code
> > that used to work, but we can definitely work with enterprise distros to
> > turn off fwctl where there is concern it may lead or is leading to
> > shenanigans.
>
> Security is the one place where Linus has tolerated userspace
> regressions. In this specific case I documented (or at least that was
> the intent) there would be regression consequences to breaking the
> security rules.
> Commands can be retroactively restricted to higher CAP
> levels and rejected from lockdown if the device attracts a CVE.
>
> IMHO the ecosystem is strongly motivated to do security seriously these
> days, I am not so worried.

That is a good point, if a Command Effect gets tied to a CVE, or a
cynical workaround gets tied to a CVE, both of those demand an upstream
and distro response.

> > So, document what each subsystem's stance towards fwctl is,
> > like maybe a distro only wants fwctl to front publicly documented vendor
> > commands, or maybe private vendor commands ok, but only with a
> > constrained set of Command Effects (I potentially see CXL here).
>
> I wouldn't say subsystem here, but technology. I think it is
> reasonable that a CXL fwctl driver have some kconfig tunables like you
> already have. This idea works a lot better if the underlying thing is
> already standards based.

True, I worry about these technologies that cross upstream maintainer
boundaries. When you have a composable switch that enables net, block,
and/or mem use cases, which upstream maintainer policy applies to the
fwctl posture of that thing?