> On 16 Jul 2019, at 19:10, Singh, Brijesh <brijesh.singh@xxxxxxx> wrote:
>
> On 7/16/19 10:56 AM, Liran Alon wrote:
>>
>>> On 16 Jul 2019, at 18:48, Singh, Brijesh <brijesh.singh@xxxxxxx> wrote:
>>>
>>> On 7/15/19 3:30 PM, Liran Alon wrote:
>>>> According to AMD Errata 1096:
>>>> "On a nested data page fault when CR4.SMAP = 1 and the guest data read generates a SMAP violation, the
>>>> GuestInstrBytes field of the VMCB on a VMEXIT will incorrectly return 0h instead the correct guest instruction
>>>> bytes."
>>>>
>>>> As stated above, errata is encountered when guest read generates a SMAP violation. i.e. vCPU runs
>>>> with CPL<3 and CR4.SMAP=1. However, code have mistakenly checked if CPL==3 and CR4.SMAP==0.
>>>>
>>>
>>> The SMAP violation will occur from CPL3 so CPL==3 is a valid check.
>>>
>>> See [1] for complete discussion
>>>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_10808075_-2322479271&d=DwIGaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0&m=RAt8t8nBaCxUPy5OTDkO0n8BMQ5l9oSfLMiL0TLTu6c&s=Nkwe8rTJhygBCIPz27LXrylptjnWyMwB-nJaiowWpWc&e=
>>
>> I still don’t understand. SMAP is a mechanism which is meant to protect a CPU running in CPL<3 from mistakenly referencing data controllable by CPL==3.
>> Therefore, SMAP violation should be raised when CPL<3 and data referenced is mapped in page-tables with PTE with U/S bit set to 1. (i.e. User accessible).
>>
>> Thus, we should check if CPL<3 and CR4.SMAP==1.
>>
>
> In this particular case we are dealing with NPF and not SMAP fault per
> say.
>
> What typically has happened here is:
>
> - user space does the MMIO access which causes a fault
> - hardware processes this as a VMEXIT
> - during processing, hardware attempts to read the instruction bytes to
>   provide decode assist. This is typically done by data read request from
>   the RIP that the guest was at.
> While doing so, we may hit SMAP fault

How can a SMAP fault occur when CPL==3? One of the conditions for SMAP is that CPL<3.

I think the confusion is that I believe the code which does the MMIO should be code that is mapped as user-accessible in the page-tables but runs with CPL<3, rather than code running with CPL==3.

The sequence of events I imagine to trigger the Errata is as follows:
1) Guest maps code in page-tables as user-accessible (i.e. PTE with U/S bit set to 1).
2) Guest executes this code with CPL<3 (even though it is mapped as user-accessible, which is a security vulnerability in itself…). The code accesses data that is not mapped or is marked as reserved in NPT, which therefore causes #NPF.
3) Physical CPU DecodeAssist feature attempts to fill in the guest instruction bytes. So it reads the guest instructions as data while the CPU is running with CPL<3 and CR4.SMAP=1 and the code is mapped as user-accessible. Therefore, this fill-in raises a SMAP violation, which causes #NPF to be delivered to KVM with 0 instruction bytes.

BTW, this also means that in order to trigger this, CR4.SMEP should be set to 0. Otherwise, the instruction couldn’t have been executed to raise #NPF in the first place.
Maybe we can add this as another condition to recognise the Errata?

-Liran

> because internally CPU is doing a data read from the RIP to get those
> instruction bytes. Since it hit the SMAP fault hence it was not able
> to decode the instruction to provide the insn_len. So we are first
> checking if it was a fault caused from CPL==3 and SMAP is enabled.
> If so, we are hitting this errata and it can be workaround.
>
> -Brijesh
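
To make the condition proposed in the reply above concrete (no instruction bytes, CPL<3, CR4.SMAP=1, CR4.SMEP=0), here is a minimal standalone sketch. It is not the KVM code and not a tested patch: `struct npf_exit_state` and `may_be_errata_1096()` are hypothetical names invented for illustration, and only the CR4 bit positions (SMEP is bit 20, SMAP is bit 21) are architectural. It encodes only the reading argued for in this reply, not the CPL==3 check that Brijesh defends.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define CR4_SMEP (1ULL << 20)
#define CR4_SMAP (1ULL << 21)

/* Hypothetical snapshot of guest state at a #NPF exit (not a KVM struct). */
struct npf_exit_state {
    uint64_t cr4;          /* guest CR4 */
    unsigned int cpl;      /* guest CPL, 0..3 */
    unsigned int insn_len; /* instruction bytes reported by DecodeAssist */
};

/*
 * Returns true when the exit matches the scenario described above:
 * DecodeAssist returned no bytes, and the guest was running supervisor
 * code (CPL<3) with SMAP enabled and SMEP disabled, so that code could
 * execute from a user-accessible page while a data read of that same
 * page would violate SMAP.
 */
static bool may_be_errata_1096(const struct npf_exit_state *s)
{
    bool smap = (s->cr4 & CR4_SMAP) != 0;
    bool smep = (s->cr4 & CR4_SMEP) != 0;

    if (s->insn_len != 0)
        return false; /* instruction bytes were provided, nothing to do */

    return smap && !smep && s->cpl < 3;
}
```

Under these assumptions, a guest at CPL 0 with only CR4.SMAP set and zero instruction bytes would be flagged, while the same state with CR4.SMEP also set, or with CPL 3, would not.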