> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonzini@xxxxxxxxx] On Behalf Of Paolo Bonzini
> Sent: Friday, March 28, 2014 8:03 PM
> To: Wu, Feng; gleb@xxxxxxxxxx; hpa@xxxxxxxxx; kvm@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 2/4] KVM: Add SMAP support when setting CR4
>
> On 28/03/2014 18:36, Feng Wu wrote:
> > +	smap = kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
>
> You are overwriting this variable below, but that is not okay because
> the value of CR4 must be considered separately in each iteration. This
> also hides an uninitialized-variable bug for "smap" in the EPT case.
>
> To avoid that, rename this variable to cr4_smap; it's probably better
> to rename smep to cr4_smep too.
>
> >  	for (byte = 0; byte < ARRAY_SIZE(mmu->permissions); ++byte) {
> >  		pfec = byte << 1;
> >  		map = 0;
> >  		wf = pfec & PFERR_WRITE_MASK;
> >  		uf = pfec & PFERR_USER_MASK;
> >  		ff = pfec & PFERR_FETCH_MASK;
> > +		smapf = pfec & PFERR_RSVD_MASK;
>
> The reader will expect PFERR_RSVD_MASK to be zero here. So please
> add a comment: /* PFERR_RSVD_MASK is set in pfec if ... */
>
> >  		for (bit = 0; bit < 8; ++bit) {
> >  			x = bit & ACC_EXEC_MASK;
> >  			w = bit & ACC_WRITE_MASK;
> > @@ -3627,11 +3629,27 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
> >  				w |= !is_write_protection(vcpu) && !uf;
> >  				/* Disallow supervisor fetches of user code if cr4.smep */
> >  				x &= !(smep && u && !uf);
> > +
> > +				/*
> > +				 * SMAP:kernel-mode data accesses from user-mode
> > +				 * mappings should fault. A fault is considered
> > +				 * as a SMAP violation if all of the following
> > +				 * conditions are ture:
> > +				 *   - X86_CR4_SMAP is set in CR4
> > +				 *   - An user page is accessed
> > +				 *   - Page fault in kernel mode
> > +				 *   - !(CPL<3 && X86_EFLAGS_AC is set)
> > +				 *
> > +				 * Here, we cover the first three conditions,
> > +				 * we need to check CPL and X86_EFLAGS_AC in
> > +				 * permission_fault() dynamiccally
>
> "dynamically". These three lines however are not entirely correct. We do
> cover the last condition here, it is in smapf. So perhaps something like
>
>  * Here, we cover the first three conditions.
>  * The CPL and X86_EFLAGS_AC is in smapf, which
>  * permission_fault() computes dynamically.
>
> > +				 */
> > +				smap = smap && smapf && u && !uf;
>
> SMAP does not affect instruction fetches. Do you need "&& !ff" here? Perhaps
> it's clearer to add it even if it is not strictly necessary.
>
> Please write just "smap = cr4_smap && u && !uf && !ff" here, and add back
> smapf below in the assignment to "fault". This makes the code more
> homogeneous.
>
> >  			} else
> >  				/* Not really needed: no U/S accesses on ept */
> >  				u = 1;
> > -			fault = (ff && !x) || (uf && !u) || (wf && !w);
> > +			fault = (ff && !x) || (uf && !u) || (wf && !w) || smap;
>
> ...
>
> > +
> > +	/*
> > +	 * If CPL < 3, SMAP protections are disabled if EFLAGS.AC = 1.
> > +	 *
> > +	 * If CPL = 3, SMAP applies to all supervisor-mode data accesses
> > +	 * (these are implicit supervisor accesses) regardless of the value
> > +	 * of EFLAGS.AC.
> > +	 *
> > +	 * So we need to check CPL and EFLAGS.AC to detect whether there is
> > +	 * a SMAP violation.
> > +	 */
> > +
> > +	smapf = ((mmu->permissions[(pfec|PFERR_RSVD_MASK) >> 1] >> pte_access) &
> > +		 1) && !((cpl < 3) && ((rflags & X86_EFLAGS_AC) == 1));
> > +
> > +	return ((mmu->permissions[pfec >> 1] >> pte_access) & 1) || smapf;
>
> You do not need two separate accesses. Just add PFERR_RSVD_MASK to pfec if
> the conditions for SMAP are satisfied. There are two possibilities:
>
> 1) setting PFERR_RSVD_MASK if SMAP is being enforced, that is if CPL = 3
> || AC = 0. This is what you are doing now.
>
> 2) setting PFERR_RSVD_MASK if SMAP is being overridden, that is if CPL < 3
> && AC = 1. You then have to invert the bit in update_permission_bitmask.
>
> Please consider both choices, and pick the one that gives better code.
>
> Also, this must be written in a branchless way. Branchless tricks are common
> throughout the MMU code because the hit rate of most branches is pretty much
> 50%-50%. This is also true in this case, at least if SMAP is in use (if it
> is not in use, we'll have AC=0 most of the time).
>
> I don't want to spoil the fun, but I don't want to waste your time either,
> so I rot13'ed my solution and placed it after the signature. ;)
>
> As to nested virtualization, I reread the code and it should already work,
> because it sets PFERR_USER_MASK. This means uf=1, and a SMAP fault will
> never trigger with uf=1.
>
> Thanks for following this! Please include "v3" in the patch subject on
> your next post!
>
> Paolo
>
> ------------------------------------- 8< --------------------------------------
> Nqq qrsvavgvbaf sbe CSREE_*_OVG (0 sbe cerfrag, 1 sbe jevgr, rgp.) naq
> hfr gur sbyybjvat:
>
> 	vag vaqrk, fznc;
>
> 	/*
> 	 * Vs PCY < 3, FZNC cebgrpgvbaf ner qvfnoyrq vs RSYNTF.NP = 1.
> 	 *
> 	 * Vs PCY = 3, FZNC nccyvrf gb nyy fhcreivfbe-zbqr qngn npprffrf
> 	 * (gurfr ner vzcyvpvg fhcreivfbe npprffrf) ertneqyrff bs gur inyhr
> 	 * bs RSYNTF.NP.
> 	 *
> 	 * Guvf pbzchgrf (pcy < 3) && (esyntf & K86_RSYNTF_NP), yrnivat
> 	 * gur erfhyg va K86_RSYNTF_NP. Jr gura vafreg vg va cynpr
> 	 * bs gur CSREE_EFIQ_ZNFX ovg; guvf ovg jvyy nyjnlf or mreb va csrp,
> 	 * ohg vg jvyy or bar va vaqrk vs FZNC purpxf ner orvat bireevqqra.
> 	 * Vg vf vzcbegnag gb xrrc guvf oenapuyrff.
> 	 */
> 	fznc = (pcy - 3) & (esyntf & K86_RSYNTF_NP);
> 	vaqrk = (csrp >> 1) +
> 		(fznc >> (K86_RSYNTF_NP_OVG - CSREE_EFIQ_OVG + 1));
>
> 	erghea (zzh->crezvffvbaf[vaqrk] >> cgr_npprff) & 1;
>
> Gur qverpgvba bs CSREE_EFIQ_ZNFX vf gur bccbfvgr pbzcnerq gb lbhe pbqr.
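Thanks for the detailed review! For the update_permission_bitmask() part,
let me first make sure I follow what you are asking for. If I read your
comments correctly, the changed lines in v3 would look roughly like this
(just a sketch with your suggested cr4_smap/cr4_smep names, and with the
fault computation as I understood it; please correct me if I misread
anything):

--------------------------------------------------------------------------------------------------------------------
	/* PFERR_RSVD_MASK is set in pfec if ... (wording to be filled in per your comment request) */
	smapf = pfec & PFERR_RSVD_MASK;
	...
		/* Disallow supervisor fetches of user code if cr4.smep. */
		x &= !(cr4_smep && u && !uf);

		/*
		 * SMAP does not affect instruction fetches, hence the !ff.
		 * Whether SMAP is actually enforced for this access (the
		 * CPL and EFLAGS.AC part) comes in via smapf, which
		 * permission_fault() computes dynamically.
		 */
		smap = cr4_smap && u && !uf && !ff;
	...
	fault = (ff && !x) || (uf && !u) || (wf && !w) ||
		(smapf && smap);
--------------------------------------------------------------------------------------------------------------------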
For the permission_fault() part: I find the rot13'ed code above unreadable
as-is, so I tried to translate it, and here is the result (if I made
mistakes in the translation, please correct me!):

--------------------------------------------------------------------------------------------------------------------
	/*
	 * If CPL < 3, SMAP protections are disabled if EFLAGS.AC = 1.
	 *
	 * If CPL = 3, SMAP applies to all supervisor-mode data accesses
	 * (these are implicit supervisor accesses) regardless of the value
	 * of EFLAGS.AC.
	 *
	 * This computes (cpl < 3) && (rflags & X86_EFLAGS_AC), leaving
	 * the result in X86_EFLAGS_AC. We then insert it in place of
	 * the PFERR_RSVD_MASK bit; this bit will always be zero in pfec,
	 * but it will be one in index if SMAP checks are being overridden.
	 * It is important to keep this branchless.
	 */
	smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);
	index = (pfec >> 1) +
		(smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1));

	return (mmu->permissions[index] >> pte_access) & 1;

The direction of PFERR_RSVD_MASK is the opposite compared to your code.
--------------------------------------------------------------------------------------------------------------------

I am a little confused about two points in the above code:

1. "smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);"

"smap" is nonzero (it holds the X86_EFLAGS_AC bit) when SMAP is being
overridden and 0 when SMAP is being enforced. So "index" will be
(pfec >> 1) when SMAP is enforced, but in my understanding of this case,
we should use the index with the PFERR_RSVD_MASK bit being 1 in
mmu->permissions[] to check the fault.

2. "smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1)"

I don't quite understand this line. BTW, I cannot find the definition of
"PFERR_RSVD_BIT"; do you mean PFERR_RSVD_BIT equals 3?
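To check my reading of the shift, I plugged in concrete numbers:
X86_EFLAGS_AC is bit 18 of RFLAGS (0x40000), and if PFERR_RSVD_BIT is
indeed 3, the shift count is 18 - 3 + 1 = 16. The extra "+ 1" would then
account for the "pfec >> 1", which also moves the reserved bit down by
one, so the AC bit lands on bit 2 of the index. Here is a small userspace
test of just the arithmetic (PFERR_RSVD_BIT == 3 is my assumption):

--------------------------------------------------------------------------------------------------------------------
#include <stdio.h>

#define X86_EFLAGS_AC		(1UL << 18)
#define X86_EFLAGS_AC_BIT	18
#define PFERR_RSVD_BIT		3	/* my assumption */

int main(void)
{
	unsigned long rflags = X86_EFLAGS_AC;	/* EFLAGS.AC = 1 */
	int cpl = 0;				/* CPL < 3 */
	unsigned long pfec = 0;			/* no other error bits */

	/* Nonzero (the AC bit itself) iff CPL < 3 && EFLAGS.AC = 1. */
	unsigned long smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);

	/* Bit 18 moves down to bit 2, the RSVD bit of (pfec >> 1). */
	unsigned long index = (pfec >> 1) +
		(smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1));

	printf("smap = %#lx, index = %#lx\n", smap, index);
	/* Prints: smap = 0x40000, index = 0x4 */
	return 0;
}
--------------------------------------------------------------------------------------------------------------------

With cpl = 3 (or AC = 0) it prints index = 0, i.e. the enforced case uses
(pfec >> 1) unchanged, while the overridden case selects the other half
of the table.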
So "index" will be (pfec >> 1) when SMAP is enforced, but in my understanding of this case, we should use the index with PFERR_RSVD_MASK bit being 1 in mmu-> permissions[] to check the fault. 2. " smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1)" I am not quite understand this line. BTW, I cannot find the definition of "PFERR_RSVD_BIT", Do you mean PFERR_RSVD_BIT equals 3? Here is my understanding: Basically, we can divide the array mmu->permissions[16] into two groups, one with PFERR_RSVD_BIT being 1 while the other with this bit being 0 like below: PFERR_RSVD_BIT: is 0 (group 0) is 1 (group 1) 0000 0100 0001 0100 0010 0110 0011 0111 1000 1100 1001 1101 1010 1110 1011 1111 I think the basic idea is using group 0 to check permission faults when !((cpl - 3) & (rflags & X86_EFLAGS_AC)), that is SMAP is overridden while using group 1 to check faults when (cpl - 3) & (rflags & X86_EFLAGS_AC), that is SMAP is enforced. Here is the code base on your proposal in my understanding: ------------------------------------------------------------------------------------------------------------------- smap = !((cpl - 3) & (rflags & X86_EFLAGS_AC)); index = (pfec >> 1) + (smap << (PFERR_RSVD_BIT - 1)); /*assuming PFERR_RSVD_BIT == 3*/ return (mmu->permissions[index] >> pte_access) & 1; ------------------------------------------------------------------------------------------------------------------- Could you please have a look at it? Appreciate your help! :) Thanks, Feng -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html