RE: [PATCH 2/4] KVM: Add SMAP support when setting CR4

> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonzini@xxxxxxxxx] On Behalf Of Paolo
> Bonzini
> Sent: Friday, March 28, 2014 8:03 PM
> To: Wu, Feng; gleb@xxxxxxxxxx; hpa@xxxxxxxxx; kvm@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 2/4] KVM: Add SMAP support when setting CR4
> 
> Il 28/03/2014 18:36, Feng Wu ha scritto:
> > +	smap = kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
> 
> You are overwriting this variable below, but that is not okay because
> the value of CR4 must be considered separately in each iteration.  This
> also hides a uninitialized-variable bug for "smap" correctly in the EPT
> case.
> 
> To avoid that, rename this variable to cr4_smap; it's probably better
> to rename smep to cr4_smep too.
> 
> >  	for (byte = 0; byte < ARRAY_SIZE(mmu->permissions); ++byte) {
> >  		pfec = byte << 1;
> >  		map = 0;
> >  		wf = pfec & PFERR_WRITE_MASK;
> >  		uf = pfec & PFERR_USER_MASK;
> >  		ff = pfec & PFERR_FETCH_MASK;
> > +		smapf = pfec & PFERR_RSVD_MASK;
> 
> The reader will expect PFERR_RSVD_MASK to be zero here.  So please
> add a comment: /* PFERR_RSVD_MASK is set in pfec if ... */".
> 
> >  		for (bit = 0; bit < 8; ++bit) {
> >  			x = bit & ACC_EXEC_MASK;
> >  			w = bit & ACC_WRITE_MASK;
> > @@ -3627,11 +3629,27 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
> >  				w |= !is_write_protection(vcpu) && !uf;
> >  				/* Disallow supervisor fetches of user code if cr4.smep */
> >  				x &= !(smep && u && !uf);
> > +
> > +				/*
> > +				 * SMAP:kernel-mode data accesses from user-mode
> > +				 * mappings should fault. A fault is considered
> > +				 * as a SMAP violation if all of the following
> > +				 * conditions are ture:
> > +				 *   - X86_CR4_SMAP is set in CR4
> > +				 *   - An user page is accessed
> > +				 *   - Page fault in kernel mode
> > +				 *   - !(CPL<3 && X86_EFLAGS_AC is set)
> > +				 *
> > +				 *   Here, we cover the first three conditions,
> > +				 *   we need to check CPL and X86_EFLAGS_AC in
> > +				 *   permission_fault() dynamiccally
> 
> "dynamically".  These three lines however are not entirely correct.  We do
> cover the last condition here, it is in smapf.  So perhaps something like
> 
>  * Here, we cover the first three conditions.
>  * The CPL and X86_EFLAGS_AC is in smapf, which
>  * permission_fault() computes dynamically.
> 
> > +				 */
> > +				smap = smap && smapf && u && !uf;
> 
> SMAP does not affect instruction fetches.  Do you need "&& !ff" here?
> Perhaps it's clearer to add it even if it is not strictly necessary.
> 
> Please write just "smap = cr4_smap && u && !uf && !ff" here, and add back
> smapf below in the assignment to "fault".  This makes the code more
> homogeneous.
> 
> >  			} else
> >  				/* Not really needed: no U/S accesses on ept  */
> >  				u = 1;
> > -			fault = (ff && !x) || (uf && !u) || (wf && !w);
> > +			fault = (ff && !x) || (uf && !u) || (wf && !w) || smap;
> 
> ...
> 
> > +
> > +	/*
> > +	 * If CPL < 3, SMAP protections are disabled if EFLAGS.AC = 1.
> > +	 *
> > +	 * If CPL = 3, SMAP applies to all supervisor-mode data accesses
> > +	 * (these are implicit supervisor accesses) regardless of the value
> > +	 * of EFLAGS.AC.
> > +	 *
> > +	 * So we need to check CPL and EFLAGS.AC to detect whether there is
> > +	 * a SMAP violation.
> > +	 */
> > +
> > +	smapf = ((mmu->permissions[(pfec|PFERR_RSVD_MASK) >> 1] >> pte_access) &
> > +		 1) && !((cpl < 3) && ((rflags & X86_EFLAGS_AC) == 1));
> > +
> > +	return ((mmu->permissions[pfec >> 1] >> pte_access) & 1) || smapf;
> 
> You do not need two separate accesses.  Just add PFERR_RSVD_MASK to pfec
> if the conditions for SMAP are satisfied.  There are two possibilities:
> 
> 1) setting PFERR_RSVD_MASK if SMAP is being enforced, that is if CPL = 3
> || AC = 0.  This is what you are doing now.
> 
> 2) setting PFERR_RSVD_MASK if SMAP is being overridden, that is if CPL < 3
> && AC = 1.  You then have to invert the bit in update_permission_bitmask.
> 
> Please consider both choices, and pick the one that gives better code.
> 
> Also, this must be written in a branchless way.  Branchless tricks are common
> throughout the MMU code because the hit rate of most branches is pretty
> much 50%-50%.  This is also true in this case, at least if SMAP is in use (if it
> is not in use, we'll have AC=0 most of the time).
> 
> I don't want to spoil the fun, but I don't want to waste your time either
> so I rot13'ed my solution and placed it after the signature. ;)
> 
> As to nested virtualization, I reread the code and it should already work,
> because it sets PFERR_USER_MASK.  This means uf=1, and a SMAP fault will
> never trigger with uf=1.
> 
> Thanks for following this!  Please include "v3" in the patch subject on
> your next post!
> 
> Paolo
> 
> ------------------------------------- 8< --------------------------------------
> Nqq qrsvavgvbaf sbe CSREE_*_OVG (0 sbe cerfrag, 1 sbe jevgr, rgp.) naq
> hfr gur sbyybjvat:
> 
>         vag vaqrk, fznc;
> 
>         /*
>          * Vs PCY < 3, FZNC cebgrpgvbaf ner qvfnoyrq vs RSYNTF.NP = 1.
>          *
>          * Vs PCY = 3, FZNC nccyvrf gb nyy fhcreivfbe-zbqr qngn npprffrf
>          * (gurfr ner vzcyvpvg fhcreivfbe npprffrf) ertneqyrff bs gur inyhr
>          * bs RSYNTF.NP.
>          *
>          * Guvf pbzchgrf (pcy < 3) && (esyntf & K86_RSYNTF_NP), yrnivat
>          * gur erfhyg va K86_RSYNTF_NP.  Jr gura vafreg vg va cynpr
>          * bs gur CSREE_EFIQ_ZNFX ovg; guvf ovg jvyy nyjnlf or mreb va csrp,
>          * ohg vg jvyy or bar va vaqrk vs FZNC purpxf ner orvat bireevqqra.
>          * Vg vf vzcbegnag gb xrrc guvf oenapuyrff.
>          */
>         fznc = (pcy - 3) & (esyntf & K86_RSYNTF_NP);
>         vaqrk =
>            (csrp >> 1) +
>            (fznc >> (K86_RSYNTF_NP_OVG - CSREE_EFIQ_OVG + 1));
> 
>         erghea (zzh->crezvffvbaf[vaqrk] >> cgr_npprff) & 1;
> 
> Gur qverpgvba bs CSREE_EFIQ_ZNFX vf gur bccbfvgr pbzcnerq gb lbhe pbqr.

I find the rot13'ed code above unreadable, so I tried to decode it. Here is the result (if I made mistakes in the translation, please correct me):

--------------------------------------------------------------------------------------------------------------------
/*
 * If CPL < 3, SMAP protections are disabled if EFLAGS.AC = 1.
 *
 * If CPL = 3, SMAP applies to all supervisor-mode data accesses
 * (these are implicit supervisor accesses) regardless of the value
 * of EFLAGS.AC.
 *
 * This computes (cpl < 3) && (rflags & X86_EFLAGS_AC), leaving
 * the result in X86_EFLAGS_AC. We then insert it in place of
 * the PFERR_RSVD_MASK bit; this bit will always be zero in pfec,
 * but it will be one in index if SMAP checks are being overridden.
 * It is important to keep this branchless.
 */
smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);
index =
        (pfec >> 1) +
        (smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1));

return (mmu->permissions[index] >> pte_access) & 1;

The direction of PFERR_RSVD_MASK is the opposite compared to your code.
-------------------------------------------------------------------------------------------------------------------

I am a little confused about some points in the code above:
1. "smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);"
"smap" is nonzero when SMAP is overridden and 0 when it is enforced. So "index"
will be (pfec >> 1) when SMAP is enforced, but in my understanding of this case we
should use the index with the PFERR_RSVD_MASK bit set in mmu->permissions[]
to check for the fault.
2. "smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1)"
I do not quite understand this line. BTW, I cannot find the definition of "PFERR_RSVD_BIT".
Do you mean that PFERR_RSVD_BIT equals 3?
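
My tentative reading of the shift in point 2 is sketched below. It assumes PFERR_RSVD_BIT == 3 (this macro is not defined in the tree, so the value is my guess from PFERR_RSVD_MASK) and that X86_EFLAGS_AC is bit 18. The helper name smap_index_offset() is made up for this illustration only. Please correct me if the arithmetic is wrong.

```c
#include <assert.h>

/* Assumed values: EFLAGS.AC is bit 18, and the RSVD bit of the
 * page-fault error code is bit 3.  PFERR_RSVD_BIT is hypothetical here. */
#define X86_EFLAGS_AC_BIT 18
#define X86_EFLAGS_AC     (1UL << X86_EFLAGS_AC_BIT)
#define PFERR_RSVD_BIT    3
#define PFERR_RSVD_MASK   (1U << PFERR_RSVD_BIT)

/* Hypothetical helper: the amount added to (pfec >> 1). */
static unsigned long smap_index_offset(int cpl, unsigned long rflags)
{
	/*
	 * cpl - 3 is negative (all high bits set) for cpl < 3 and zero
	 * for cpl == 3, so the AND keeps the AC bit only when cpl < 3.
	 */
	unsigned long smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);

	/*
	 * Move bit 18 down to bit 2.  Bit 2 of the index corresponds to
	 * bit 3 of pfec, because the index is pfec >> 1.
	 */
	return smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1);
}
```

So the shift count is 18 - 3 + 1 = 16, and the offset is PFERR_RSVD_MASK >> 1 when SMAP is overridden and 0 otherwise, if I read it correctly.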

Here is my understanding:
Basically, we can divide the array mmu->permissions[16] into two groups by index: one in which
the PFERR_RSVD_BIT position is 0 and one in which it is 1, as shown below:

PFERR_RSVD_BIT:	is 0 (group 0)	is 1 (group 1)
				0000		0100
				0001		0101
				0010		0110
				0011		0111
				1000		1100
				1001		1101
				1010		1110
				1011		1111
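
To double-check the table, here is a small sketch of how I split the 16 indices. It assumes PFERR_RSVD_BIT == 3, so the bit shows up as bit 2 of the index (the index is pfec >> 1). The helper names permission_group() and group_members() are only for illustration and do not exist in the kernel.

```c
#include <assert.h>

#define PFERR_RSVD_BIT 3	/* assumption: not defined in the tree */

/* Hypothetical helper: group 1 if the RSVD position is set in index. */
static int permission_group(unsigned int index)
{
	return (index >> (PFERR_RSVD_BIT - 1)) & 1;
}

/* Hypothetical helper: collect the indices of one group, return the count. */
static int group_members(int group, unsigned int out[8])
{
	unsigned int index;
	int n = 0;

	for (index = 0; index < 16; index++)
		if (permission_group(index) == group)
			out[n++] = index;
	return n;
}
```

Each group ends up with 8 entries, and adding 4 to a group 0 index gives the matching group 1 index.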

I think the basic idea is to use group 0 to check permission faults when (cpl - 3) & (rflags & X86_EFLAGS_AC) is nonzero, that is, when SMAP is overridden,
and to use group 1 when it is zero, that is, when SMAP is enforced.

Here is the code based on your proposal, as I understand it:

-------------------------------------------------------------------------------------------------------------------
smap = !((cpl - 3) & (rflags & X86_EFLAGS_AC));
index =
        (pfec >> 1) + (smap << (PFERR_RSVD_BIT - 1)); /* assuming PFERR_RSVD_BIT == 3 */

return (mmu->permissions[index] >> pte_access) & 1;
-------------------------------------------------------------------------------------------------------------------
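
To compare the two directions side by side, I put both index computations into a small standalone sketch. It again assumes PFERR_RSVD_BIT == 3 and X86_EFLAGS_AC at bit 18; the function names are hypothetical, and the loop only checks that the two variants always differ in exactly the RSVD position of the index. It says nothing about how update_permission_bitmask() should fill the table.

```c
#include <assert.h>

/* Assumed values, see the note above. */
#define X86_EFLAGS_AC_BIT 18
#define X86_EFLAGS_AC     (1UL << X86_EFLAGS_AC_BIT)
#define PFERR_RSVD_BIT    3

/* Your version: the extra index bit is set when SMAP is overridden. */
static unsigned int index_overridden(unsigned int pfec, int cpl,
				     unsigned long rflags)
{
	unsigned long smap = (cpl - 3) & (rflags & X86_EFLAGS_AC);

	return (pfec >> 1) +
	       (smap >> (X86_EFLAGS_AC_BIT - PFERR_RSVD_BIT + 1));
}

/* My version: the extra index bit is set when SMAP is enforced. */
static unsigned int index_enforced(unsigned int pfec, int cpl,
				   unsigned long rflags)
{
	unsigned int smap = !((cpl - 3) & (rflags & X86_EFLAGS_AC));

	return (pfec >> 1) + (smap << (PFERR_RSVD_BIT - 1));
}

/* Returns 1 if the two variants always differ only in index bit 2. */
static int variants_are_complementary(void)
{
	unsigned int pfec;
	int cpl, ac;

	for (cpl = 0; cpl <= 3; cpl++)
		for (ac = 0; ac <= 1; ac++) {
			unsigned long rflags = ac ? X86_EFLAGS_AC : 0;

			for (pfec = 0; pfec < 32; pfec++) {
				/* The RSVD bit is always zero in pfec here. */
				if (pfec & (1U << PFERR_RSVD_BIT))
					continue;
				if ((index_overridden(pfec, cpl, rflags) ^
				     index_enforced(pfec, cpl, rflags)) != 4)
					return 0;
			}
		}
	return 1;
}
```

If this is right, the two versions select opposite halves of mmu->permissions[] for the same input, so whichever one is picked, the table has to be filled with the matching meaning of the RSVD bit.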

Could you please have a look at it? I appreciate your help! :)

Thanks,
Feng