Re: [PATCH v3] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/12/21 12:41 AM, Peter Zijlstra wrote:
> On Tue, May 11, 2021 at 01:05:02PM -0400, Jon Kohler wrote:
>> diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
>> index 8d33ad80704f..5bc4df3a4c27 100644
>> --- a/arch/x86/include/asm/fpu/internal.h
>> +++ b/arch/x86/include/asm/fpu/internal.h
>> @@ -583,7 +583,13 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
>>  		if (pk)
>>  			pkru_val = pk->pkru;
>>  	}
>> -	__write_pkru(pkru_val);
>> +
>> +	/*
>> +	 * WRPKRU is relatively expensive compared to RDPKRU.
>> +	 * Avoid WRPKRU when it would not change the value.
>> +	 */
>> +	if (pkru_val != rdpkru())
>> +		wrpkru(pkru_val);
> Just wondering; why aren't we having that in a per-cpu variable? The
> usual per-cpu MSR shadow approach avoids issuing any 'special' ops
> entirely.

It could be a per-cpu variable.  When I wrote this originally I figured
that a rdpkru would be cheaper than a load from memory (even per-cpu
memory).

But, now that I think about it, assuming that 'prku_val' is in %rdi, doing:

	cmp	%gs:0x1234, %rdi

might end up being cheaper than clobbering a *pair* of GPRs with rdpkru:

	xor    %ecx,%ecx
	rdpkru
	cmp	%rax, %rdi

I'm too lazy to go figure out what would be faster in practice, though.
 Does anyone care?




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux