On Wed, Nov 06, 2024 at 09:05:26PM +0000, Robin Murphy wrote: > > Each future capability we want to enable at the VMM needs an analysis: > > > > 1) Does it require kernel SW changes, ie like BTM? Then it needs a > > kernel_capabilities bit to say the kernel SW exists > > 2) Does it require data from ACPI/DT/etc? Then it needs a > > kernel_capabilities bit > > 3) Does it need to be "turned on" per VM, ie with a VMS enablement? > > Then it needs a new request flag in ALLOC_VIOMMU > > 4) Otherwise it can be read directly from the idr[] array > > > > This is why the comment above is so stern that the VMM "should only > > forward bits that it knows it can support". > > So... you're saying this patch is in fact broken, or at least uselessly > incomplete, since VMMs aren't allowed to emulate a vSMMU at all without > first consulting some other interface which does not exist? Great. That is too far, there is nothing fundamentally broken here. This does not enable every SMMU feature in the architecture. It implements a basic baseline and no more. The VMMs are only permitted to read bits in that baseline. VMMs that want to go beyond that have to go through the 4 steps above. Fundamentally all we are arguing about here is if bits the VMM shouldn't even read should be forced to zero by the kernel. If the bits are forced to zero then perhaps they could be used as a future SW feature negotiation, and maybe that would be better, or maybe not. See below about BTM for a counter example. I agree the kdoc does not describe what the baseline actually is. > We know about that in the kernel, but all a VMM sees is > iommu_hw_info_arm_smmuv3.idr[0] indicating HTTU=2. If it is "broken" to take > the only information available at face value, assume HTTU is available, and > reflect that in a vSMMU interface, then what is the correct thing to do, > other than to not dare emulate a vSMMU at all, in fear of a sternly worded > comment? These patches support a baseline feature set. They do not support the entire SMMU architecture. They do not support vHTTU. Things are going step by step because this is a huge project. HTTU is not in the baseline, so it is definitely wrong for any VMM working on these patches to advertise support to the VM! The VMM *MUST* ignore such bits in the IDR report. The correct thing is the 4 steps I outlined above. When a VMM author wants to go beyond the baseline they have to determine what kernel and VMM software is required and implement the missing parts. > Your tautology still does not offer any reasoning against doing the logical > thing and following the same basic pattern: the kernel uses the ID register > mechanism itself to advertise the set of features it's able/willing to > support, by sanitising the values it offers to the VMM, combining the > notions of hardware and kernel support where the distinction is irrelevant > anyway. Even with your plan the VMM can not pass the kernel's IDR register into the VM unprotected. It must always sanitize it against the features that the VMM supports. Let's consider vBTM support. This requires kernel support to enable the global BTM register *AND* a new VMM ioctl flow with iommufd to link to KVM's VMID. You are suggesting that the old kernel should wire IDR BTM to 0 and the new kernel will set it to 1. However the VMM cannot just forward this bit to the VM! A old VMM that does not do the KVM flow does NOT support BTM. All VMM's must mask the BTM bit from day 0, or we can never set it to 1 in a future kernel. Which is exactly the same plan as this patch! > You should take "We discussed this already" > as more of a clue to yourself than to me - if 4 different people have all > said the exact same thing in so many words, perhaps there's something in > it... And all seemed to agree it was not a big deal after the discussion. I think Mostafa was driving in a direction that we break up the IDR into explicit fields and thus be explicit about what information the VMM is able to access. This would effectively document and enforce what the baseline is. > And in case I need to spell it out with less sarcasm, "we'll get masking > wrong in the kernel" only implies "we'll get kernel_capabilities wrong in > the kernel (and elsewhere)", Less sarcasm and hyperbole would be nice. > > If you still feel strongly about this please let me know by Friday and > > I will drop the idr[] array from this cycle. We can continue to > > discuss a solution for the next cycle. > > It already can't work as-is, As above, this is not true. > I don't see how making it even more broken > would help. IMO it doesn't seem like a good idea to be merging UAPI at all > while it's still clearly incomplete and by its own definition unusable. I agree that removing the idr would make it definitely unusable. There is quite a bit of essential information in there the VMM needs. However, the uAPI is extensible so adding a new field in the next cycle is not breaking. I'm trying to offer you an olive branch so we can have this discussion with time instead of urgently at the 11th hour and derailing the series. This exact topic has been discussed already two months ago, and is, IMHO, mostly a style choice not an urgent technical matter. If you have another option we can conclude in the next few days then I'm happy to hear it. If we focus only on the baseline functionality Nicolin can come up with a list of fields and we can write it very explicitly in the kdoc that the VMM can only read those fields. I could also reluctantly agree to permanent masking to report only the above kdoc bits in the kernel. Noting if you want this because you don't trust the VMM authors to follow an explicit kdoc, then we likely also cannot ever set some bits to 1. Eg BTM would be permanently 0. Regards, Jason