On Mon, 27 Nov 2023 10:59:36 +0000,
Ganapatrao Kulkarni <gankulkarni@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On 27-11-2023 02:52 pm, Marc Zyngier wrote:
> > On Mon, 27 Nov 2023 07:26:58 +0000,
> > Ganapatrao Kulkarni <gankulkarni@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On 24-11-2023 08:02 pm, Marc Zyngier wrote:
> >>> On Fri, 24 Nov 2023 13:22:22 +0000,
> >>> Ganapatrao Kulkarni <gankulkarni@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>> How is this value possible if the write to HCR_EL2 has taken place?
> >>>>> When do you sample this?
> >>>>
> >>>> I am not sure how and where it got set. I think, whatever it is set,
> >>>> it is due to false return of vcpu_el2_e2h_is_set(). Need to
> >>>> understand/debug.
> >>>> The vhcr_el2 value I have shared is traced along with hcr in function
> >>>> __activate_traps/__compute_hcr.
> >>>
> >>> Here's my hunch:
> >>>
> >>> The guest boots with E2H=0, because we don't advertise anything else
> >>> on your HW. So we run with NV1=1 until we try to *upgrade* to VHE. NV2
> >>> means that HCR_EL2 is writable (to memory) without a trap. But we're
> >>> still running with NV1=1.
> >>>
> >>> Subsequently, we access a sysreg that should never trap for a VHE
> >>> guest, but we're with the wrong config. Bad things happen.
> >>>
> >>> Unfortunately, NV2 is pretty much incompatible with E2H being updated,
> >>> because it cannot perform the changes that this would result into at
> >>> the point where they should happen. We can try and do a best effort
> >>> handling, but you can always trick it.
> >>>
> >>> Anyway, can you see if the hack below helps? I'm not keen on it at
> >>> all, but this would be a good data point.
> >>
> >> Thanks Marc, this diff fixes the issue.
> >> Just wondering what is changed w.r.t to L1 handling from V10 to V11
> >> that it requires this trick?
> >
> > Not completely sure. Before v11, anything that would trap would be
> > silently handled by the FEAT_NV code. Now, a trap for something that
> > is supposed to be redirected to VNCR results in an UNDEF exception.
> >
> > I suspect that the exception is handled again as a call to
> > __finalise_el2(), probably because the write to VBAR_EL1 didn't do
> > what it was supposed to do.
> >
> >> Also why this was not seen on your platform, is it E2H0 enabled?
> >
> > It doesn't have FEAT_E2H0, and that's the whole point. No E2H0, no
> > problems, as the guest cannot trick the host into losing track of the
> > state (which I'm pretty sure can happen even with this ugly hack).
> >
> > I will probably completely disable NV1 support in the next drop, and
> > make NV support only VHE guests. Which is the only mode that makes any
> > sense anyway.
> >
>
> Thanks, absolutely makes sense to have *VHE-only* L1, looking forward
> to a next drop.

Note that this won't be restricted to L1, but will affect
*everything*. No non-VHE guest will be supported at any level
whatsoever, and NV will always expose ID_AA64MMFR4_EL1.E2H0=0b1110,
indicating that HCR_EL2.NV1 is RES0, on top of
ID_AA64MMFR4_EL1.NV_frac=1 (NV2 only).

	M.

-- 
Without deviation from the norm, progress is not possible.
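
For reference, the ID register encoding mentioned above can be sketched
in plain C. This is only an illustration, not the KVM code: the helper
names are made up, and the field positions (E2H0 at bits [27:24],
NV_frac at bits [23:20]) are an assumption taken from my reading of the
architecture and should be checked against the Arm ARM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions in ID_AA64MMFR4_EL1; verify against the Arm ARM. */
#define MMFR4_E2H0_SHIFT	24
#define MMFR4_NV_FRAC_SHIFT	20
#define MMFR4_FIELD_MASK	0xfULL

/* Values quoted in the mail above. */
#define MMFR4_E2H0_NI_NV1	0xeULL	/* no FEAT_E2H0, HCR_EL2.NV1 is RES0 */
#define MMFR4_NV_FRAC_NV2_ONLY	0x1ULL	/* NV2 only */

/* Extract one 4-bit ID field (hypothetical helper). */
static inline unsigned int mmfr4_field(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & MMFR4_FIELD_MASK);
}

/* Override E2H0 and NV_frac with the VHE-only, NV2-only encoding. */
static inline uint64_t mmfr4_vhe_only_nv(uint64_t reg)
{
	reg &= ~(MMFR4_FIELD_MASK << MMFR4_E2H0_SHIFT);
	reg &= ~(MMFR4_FIELD_MASK << MMFR4_NV_FRAC_SHIFT);
	reg |= MMFR4_E2H0_NI_NV1 << MMFR4_E2H0_SHIFT;
	reg |= MMFR4_NV_FRAC_NV2_ONLY << MMFR4_NV_FRAC_SHIFT;
	return reg;
}

/* True if the register value says HCR_EL2.NV1 is RES0. */
static inline bool mmfr4_nv1_is_res0(uint64_t reg)
{
	return mmfr4_field(reg, MMFR4_E2H0_SHIFT) == MMFR4_E2H0_NI_NV1;
}
```

A guest reading a value built with mmfr4_vhe_only_nv() would see
E2H0=0b1110 and NV_frac=1, and so has no reason to ever attempt running
with E2H=0 or NV1=1.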