On Tue, 06 Feb 2024 14:07:10 +0000, "Thierry Reding" <thierry.reding@xxxxxxxxx> wrote: > > [1 <text/plain; UTF-8 (quoted-printable)>] > On Tue Feb 6, 2024 at 1:53 PM CET, Marc Zyngier wrote: > > On Tue, 06 Feb 2024 12:28:27 +0000, Jon Hunter <jonathanh@xxxxxxxxxx> wrote: > > > On 06/02/2024 12:17, Marc Zyngier wrote: > [...] > > > > - My own tegra186 HW doesn't have VHE, since it is ARMv8.0, and this > > > > helper will always return 'false'. How could this result in > > > > something that still works? Can I get a free CPU upgrade? > > > > > > I thought this API just checks to see if we are in EL2? > > > > It does. And that's the problem. On ARMv8.0, we run the Linux kernel > > at EL1. Tegra186 is ARMv8.0 (Denver + A57). So as written, this change > > breaks the very platform it intends to support. > > To clarify, the code that accesses these registers is shared across > Tegra186 and later chips. Tegra194 and later do support ARMv8.1 VHE. But even on these machines that are VHE-capable, not running at EL2 doesn't mean we're running as a guest. The user can force the kernel to stick to EL1, using a command-line option such as kvm-arm.mode=nvhe which will force the kernel to stay at EL1 while deploying KVM at EL2. > Granted, if it always returns false on Tegra186 that's not what we > want. I'm glad we agree here. > > > > - If you assign this device to a VM and that the hypervisor doesn't > > > > correctly virtualise it, then it is a different device and you > > > > should simply advertise it something else. Or even better, fix your > > > > hypervisor. > > > > > > Sumit can add some more details on why we don't completely disable the > > > device for guest OSs. > > > > It's not about disabling it. It is about correctly supporting it > > (providing full emulation for it), or advertising it as something > > different so that SW can handle it differently. > > It's really not a different device. It's exactly the same device except > that accessing some registers isn't permitted. We also can't easily > remove parts of the register region from device tree because these are > intermixed with other registers that we do want access to. But that's the definition of being a different device. It has a different programming interface, hence it is different. The fact that it is the same HW block mediated by a hypervisor doesn't really change that. > > Poking into the internals of how the kernel is booted for a driver > > that isn't tied to the core architecture (because it would need to > > access system registers, for example) is not an acceptable outcome. > > So what would be the better option? Use a different compatible string to > make the driver handle the device differently? Or adding a custom > property to the device tree node to mark this as running in a > virtualized environment? A different compatible string would be my preferred option. An extra property would work as well. As far as I am concerned, these two options are the right way to express the fact that you have something that isn't quite like the real thing. > Perhaps we can reuse the top-level hypervisor node? That seems to only > ever have been used for Xen on 32-bit ARM, so not sure if that'd still > be appropriate. I'd shy away from this. You would be deriving properties from a hypervisor implementation, instead of expressing those properties directly. In my experience, the direct method is always preferable. Thanks, M. -- Without deviation from the norm, progress is not possible.