On Thu, Jan 31, 2019 at 09:40:02AM +0000, Julien Thierry wrote: > > > On 31/01/2019 09:27, Christoffer Dall wrote: > > On Thu, Jan 31, 2019 at 08:56:04AM +0000, Julien Thierry wrote: > >> > >> > >> On 31/01/2019 08:19, Christoffer Dall wrote: > >>> On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote: > >>>> Hi James, > >>>> > >>>> On 28/01/2019 11:48, James Morse wrote: > >>>>> Hi Julien, > >>>>> > >>>>> On 21/01/2019 15:33, Julien Thierry wrote: > >>>>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order > >>>>>> to interract with guest TLBs, switching from EL2&0 translation regime > >>>>> > >>>>> (interact) > >>>>> > >>>>> > >>>>>> to EL1&0. > >>>>>> > >>>>>> However, some non-maskable asynchronous event could happen while TGE is > >>>>>> cleared like SDEI. Because of this address translation operations > >>>>>> relying on EL2&0 translation regime could fail (tlb invalidation, > >>>>>> userspace access, ...). > >>>>>> > >>>>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and > >>>>>> clear it if necessary when returning to the interrupted context. > >>>>> > >>>>> Yes please. This would not have been fun to debug! > >>>>> > >>>>> Reviewed-by: James Morse <james.morse@xxxxxxx> > >>>>> > >>>>> > >>>> > >>>> Thanks. > >>>> > >>>>> > >>>>> I was looking for why we need core code to do this, instead of updating the > >>>>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed > >>>>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit() > >>>>> itself. > >>>>> > >>>> > >>>> Yes, that's the main reason. > >>>> > >>> I wondered the same thing, but I don't understand the explanation :( > >>> > >>> Why can't we do a local_daif_mask() around the (very small) calls that > >>> clear TGE instead? > >>> > >> > >> That would protect against the pseudo-NMIs, but you can still get an > >> SDEI at that point even with all daif bits set. Or did I misunderstand > >> how SDEI works? > >> > > > > I don't know the details of SDEI. From looking at this patch, the > > logical conclusion would be that SDEIs can then only be delivered once > > we've called nmi_enter, but since we don't call this directly from the > > code that clears TGE for doing guest TLB invalidation (or do we?) then > > masking interrupts at the PSTATE level should be sufficient. > > > > Surely I'm missing some part of the bigger picture here. > > > > I'm not sure I understand. SDEI uses the NMI context and AFAIU, it is an > interrupt that the firmware sends to the OS, and it is sent regardless > of the PSTATE at the OS EL. > > So, the worrying part is: > - Hyp clears TGE > - Exception/interrupt taken to EL3 > - Firmware decides it's a good time to send an SDEI to the OS > - SDEI handler (at EL2 for VHE) does nmi_enter() > - SDEI handler needs to do cache invalidation or something with the > EL2&0 translation regime but TGE is cleared > > We don't expect the code that clears TGE to call nmi_enter(). > You do understand :) I didn't understand that the SDEI handler calls nmi_enter() -- and to be fair the commit message didn't really provide that link -- but it makes perfect sense now. I naively thought that SDEI had respected the pstate bits setting before, and that this was becoming a problem with the introduction of pseudo-NMIs, but I clearly came at this from the wrong direction. Thanks for the explanation! Christoffer