Re: [Patch] memory: tegra: Skip SID override from Guest VM

Marc Zyngier <maz@xxxxxxxxxx> · Tue, 06 Feb 2024 14:54:12 +0000

On Tue, 06 Feb 2024 14:07:10 +0000,
"Thierry Reding" <thierry.reding@xxxxxxxxx> wrote:
> 
> [1  <text/plain; UTF-8 (quoted-printable)>]
> On Tue Feb 6, 2024 at 1:53 PM CET, Marc Zyngier wrote:
> > On Tue, 06 Feb 2024 12:28:27 +0000, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
> > > On 06/02/2024 12:17, Marc Zyngier wrote:
> [...]
> > > > - My own tegra186 HW doesn't have VHE, since it is ARMv8.0, and this
> > > >    helper will always return 'false'. How could this result in
> > > >    something that still works? Can I get a free CPU upgrade?
> > > 
> > > I thought this API just checks to see if we are in EL2?
> >
> > It does. And that's the problem. On ARMv8.0, we run the Linux kernel
> > at EL1. Tegra186 is ARMv8.0 (Denver + A57). So as written, this change
> > breaks the very platform it intends to support.
> 
> To clarify, the code that accesses these registers is shared across
> Tegra186 and later chips. Tegra194 and later do support ARMv8.1 VHE.

But even on these machines that are VHE-capable, not running at EL2
doesn't mean we're running as a guest. The user can force the kernel
to stick to EL1, using a command-line option such as kvm-arm.mode=nvhe
which will force the kernel to stay at EL1 while deploying KVM at EL2.

> Granted, if it always returns false on Tegra186 that's not what we
> want.

I'm glad we agree here.

> > > > - If you assign this device to a VM and that the hypervisor doesn't
> > > >    correctly virtualise it, then it is a different device and you
> > > >    should simply advertise it something else. Or even better, fix your
> > > >    hypervisor.
> > > 
> > > Sumit can add some more details on why we don't completely disable the
> > > device for guest OSs.
> >
> > It's not about disabling it. It is about correctly supporting it
> > (providing full emulation for it), or advertising it as something
> > different so that SW can handle it differently.
> 
> It's really not a different device. It's exactly the same device except
> that accessing some registers isn't permitted. We also can't easily
> remove parts of the register region from device tree because these are
> intermixed with other registers that we do want access to.

But that's the definition of being a different device. It has a
different programming interface, hence it is different. The fact that
it is the same HW block mediated by a hypervisor doesn't really change
that.

> > Poking into the internals of how the kernel is booted for a driver
> > that isn't tied to the core architecture (because it would need to
> > access system registers, for example) is not an acceptable outcome.
> 
> So what would be the better option? Use a different compatible string to
> make the driver handle the device differently? Or adding a custom
> property to the device tree node to mark this as running in a
> virtualized environment?

A different compatible string would be my preferred option. An extra
property would work as well. As far as I am concerned, these two
options are the right way to express the fact that you have something
that isn't quite like the real thing.

> Perhaps we can reuse the top-level hypervisor node? That seems to only
> ever have been used for Xen on 32-bit ARM, so not sure if that'd still
> be appropriate.

I'd shy away from this. You would be deriving properties from a
hypervisor implementation, instead of expressing those properties
directly. In my experience, the direct method is always preferable.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.