Here is a thread, but a more recent one is available

https://marc.info/?t=156322283300001&r=1&w=2

Paolo, Sean and others have also replied to it, which you can see on
marc.info.

-Brijesh

On 7/16/19 11:20 AM, Liran Alon wrote:
>
>
>> On 16 Jul 2019, at 19:10, Singh, Brijesh <brijesh.singh@xxxxxxx> wrote:
>>
>>
>>
>> On 7/16/19 10:56 AM, Liran Alon wrote:
>>>
>>>
>>>> On 16 Jul 2019, at 18:48, Singh, Brijesh <brijesh.singh@xxxxxxx> wrote:
>>>>
>>>> On 7/15/19 3:30 PM, Liran Alon wrote:
>>>>> According to AMD Errata 1096:
>>>>> "On a nested data page fault when CR4.SMAP = 1 and the guest data read generates a SMAP violation, the
>>>>> GuestInstrBytes field of the VMCB on a VMEXIT will incorrectly return 0h instead the correct guest instruction
>>>>> bytes."
>>>>>
>>>>> As stated above, the erratum is encountered when a guest read generates a SMAP violation, i.e. the vCPU runs
>>>>> with CPL<3 and CR4.SMAP=1. However, the code has mistakenly checked for CPL==3 and CR4.SMAP==0.
>>>>>
>>>>
>>>> The SMAP violation will occur from CPL3 so CPL==3 is a valid check.
>>>>
>>>> See [1] for complete discussion
>>>>
>>>> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_10808075_-2322479271&d=DwIGaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0&m=RAt8t8nBaCxUPy5OTDkO0n8BMQ5l9oSfLMiL0TLTu6c&s=Nkwe8rTJhygBCIPz27LXrylptjnWyMwB-nJaiowWpWc&e=
>>>
>>> I still don’t understand. SMAP is a mechanism which is meant to protect
>>> a CPU running in CPL<3 from mistakenly referencing data controllable by
>>> CPL==3.
>>> Therefore, a SMAP violation should be raised when CPL<3 and the data
>>> referenced is mapped in the page-tables by a PTE with the U/S bit set to
>>> 1 (i.e. user accessible).
>>>
>>> Thus, we should check if CPL<3 and CR4.SMAP==1.
>>>
>>
>> In this particular case we are dealing with NPF and not SMAP fault per
>> se.
>>
>> What typically has happened here is:
>>
>> - user space does the MMIO access which causes a fault
>> - hardware processes this as a VMEXIT
>> - during processing, hardware attempts to read the instruction bytes to
>>   provide decode assist. This is typically done by a data read request
>>   from the RIP that the guest was at. While doing so, we may hit a SMAP
>>   fault
>
> How can a SMAP fault occur when CPL==3? One of the conditions for SMAP is
> that CPL<3.
>
> I think the confusion is that I believe code mapped as user-accessible in
> the page-tables but running with CPL<3 should be the one which does the
> MMIO, rather than code running in CPL==3.
>
> The sequence of events I imagine to trigger the Errata is as follows:
> 1) Guest maps code in page-tables as user-accessible (i.e. PTE with U/S
>    bit set to 1).
> 2) Guest executes this code with CPL<3 (even though mapped as
>    user-accessible, which is a security vulnerability in itself…). The
>    code accesses data that is not mapped or is marked as reserved in NPT
>    and therefore causes #NPF.
> 3) Physical CPU DecodeAssist feature attempts to fill in the guest
>    instruction bytes. So it reads the guest instructions as data while
>    the CPU is currently at CPL<3, CR4.SMAP=1 and the code is mapped as
>    user-accessible. Therefore, this fill-in raises a SMAP violation,
>    which causes #NPF to be raised to KVM with 0 instruction bytes.
>
> BTW, this also means that in order to trigger this, CR4.SMEP should be set
> to 0. Otherwise, the instruction couldn’t have been executed to raise #NPF
> in the first place. Maybe we can add this as another condition to
> recognise the Errata?
>
> -Liran
>
>> because internally the CPU is doing a data read from the RIP to get those
>> instruction bytes. Since it hit the SMAP fault, it was not able
>> to decode the instruction to provide the insn_len. So we are first
>> checking if it was a fault caused from CPL==3 and SMAP is enabled.
>> If so, we are hitting this errata and it can be worked around.
>>
>> -Brijesh
>>
>
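
For reference, the two readings of the erratum condition debated in this thread can be written as plain predicates. This is only a sketch under stated assumptions: the function names and the standalone signatures are hypothetical and are not the KVM code. Only the CR4 bit positions (SMEP is bit 20, SMAP is bit 21) are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define CR4_SMEP (UINT64_C(1) << 20)
#define CR4_SMAP (UINT64_C(1) << 21)

/*
 * Hypothetical helper: the condition as Brijesh describes it,
 * i.e. the fault came from CPL==3 while SMAP is enabled.
 */
static inline bool erratum_1096_cpl3_reading(uint64_t cr4, unsigned int cpl)
{
	assert(cpl <= 3); /* CPL is a 2-bit value */
	return cpl == 3 && (cr4 & CR4_SMAP) != 0;
}

/*
 * Hypothetical helper: the condition as Liran proposes it,
 * i.e. supervisor-mode code (CPL<3) with SMAP=1, plus SMEP=0 so that
 * user-accessible code could have been executed in the first place.
 */
static inline bool erratum_1096_supervisor_reading(uint64_t cr4,
						   unsigned int cpl)
{
	assert(cpl <= 3); /* CPL is a 2-bit value */
	return cpl < 3 && (cr4 & CR4_SMAP) != 0 && (cr4 & CR4_SMEP) == 0;
}
```

Either predicate would only be consulted when the VMCB reports zero guest instruction bytes on the #NPF, since that is the symptom the erratum describes. The sketch takes no side on which reading is right; it only makes the difference between the two explicit.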