Re: [Question] About the behavior of HLT in VMX guest mode

Hi Radim,

On 2017/3/16 22:23, Radim Krčmář wrote:

> 2017-03-16 10:08+0800, Longpeng (Mike):
>> Hi, Radim,
>>
>> On 2017/3/16 1:32, Radim Krčmář wrote:
>>
>>> 2017-03-13 15:12+0800, Longpeng (Mike):
>>>> Hi guys,
>>>>
>>>> I'm confused about the behavior of the HLT instruction in VMX guest mode.
>>>>
>>>> I set the "HLT exiting" bit to 0 in the VMCS, and the vcpu didn't vmexit when executing
>>>> HLT, as expected. However, when I used powertop/cpupower on the host to watch the pcpu's
>>>> C-states, it seemed that the pcpu didn't enter the C1/C1E state during this period.
>>>>
>>>> I searched the Intel SDM vol. 3 and only found that guest MWAIT won't enter
>>>> a low-power sleep state under certain conditions (ch 25.3), but HLT isn't mentioned.
>>>>
>>>> My questions are:
>>>> 1) Does executing the HLT instruction in guest mode not enter the C1/C1E state?
>>>
>>> Do you get a different result when running HLT outside VMX?
>>>
>>
>> Yep, I'm sure that executing HLT on the host enters the C1/C1E state, but it doesn't
>> when executed in the guest.
> 
> I'd go for the thermal monitoring (ideally with constant fan speed) if
> CPU counters are lacking.  Thermal sensors are easily accessible and far
> more trustworthy for measuring power saving. :)
> 
>>>> 2) If it won't, does it still release the hardware resources shared with
>>>> the other hyper-thread?
>>>
>>
>>> No idea.  Aren't hyperthreaded resources scheduled dynamically, so even
>>> a nop-spinning VCPU won't hinder the other hyper-thread?
>>>
>>
>>
>> I wrote a testcase in kvm-unit-tests, and it seems that a guest-mode HLT-ed
>> vcpu no longer competes for hardware resources (maybe including the pipeline).
>>
>> My testcase binds vcpu1 and vcpu2 to the 2 hyper-threads of one core, and runs:
>>
>> (vcpu1)
>> t1 = rdtsc();
>> for (int i = 0; i < 10000000; ++i) ;
>> t2 = rdtsc();
>> costs = t2 - t1;
>>
>> (vcpu2)
>> "halt" or "while (1) ;"
>>
>> The result is:
>> -----------------------------------------------------------------------
>> 			(vcpu2)idle=poll	(vcpu2)idle=halt
>> (HLT exiting=1)
>> vcpu1 costs		3800931			1900209
>>
>> (HLT exiting=0)
>> vcpu1 costs		3800193			1913514
>> -----------------------------------------------------------------------
> 
> Oh, great results.
> I wonder if the slightly better time on HLT exiting=1 is because the
> other hyper-thread goes into deeper sleep after exit.


Yes, maybe.

Another potential reason is that the host's overhead is lower.
With "HLT exiting=1 && idle=halt" the host is idle and its cpu-usage is close to 0%,
while with "HLT exiting=0 && idle=halt" the host is actually very busy and its
cpu-usage is close to 100%. Maybe the host kernel does more work when it's busy.

> Btw. does adding pause() into the while loop bring the performance close
> to halt?


Good suggestion! :)

I just tested adding pause() to the poll loop and set "ple_gap=0"; the performance is
much better than "while (1) ;", but it's still obviously slower than halt.
-----------------------------------------------------------------------
 			(vcpu2)poll	(vcpu2)pause	(vcpu2)halt
 (HLT exiting=1)
 vcpu1 costs		3800931		2572812		1916724

 (HLT exiting=0)
 vcpu1 costs		3800193		2573685		1912443
-----------------------------------------------------------------------
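
For reference, the two sides of the test look roughly like this (a minimal
kvm-unit-tests-style sketch, not the exact code; the function names are just
placeholders, and the rdtsc wrapper is unserialized, which is good enough for
this rough comparison):

#include <stdint.h>

static inline uint64_t rdtsc(void)
{
        uint32_t lo, hi;

        asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}

/* vcpu1: time a fixed amount of busy work on one hyper-thread. */
static uint64_t vcpu1_costs(void)
{
        volatile unsigned long i;   /* volatile keeps the loop from being optimized away */
        uint64_t t1, t2;

        t1 = rdtsc();
        for (i = 0; i < 10000000; ++i)
                ;
        t2 = rdtsc();

        return t2 - t1;             /* "vcpu1 costs" in the tables above */
}

/* vcpu2 variants, pinned to the sibling hyper-thread of the same core: */

static void vcpu2_poll(void)        /* "poll" column: plain busy loop */
{
        for (;;)
                ;
}

static void vcpu2_pause(void)       /* "pause" column: PAUSE in the loop body */
{
        for (;;)
                asm volatile("pause");
}

static void vcpu2_halt(void)        /* "halt" column */
{
        for (;;)
                asm volatile("sti; hlt");  /* interrupts stay enabled so the vcpu can still be woken */
}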

> 
>> I found that https://www.spinics.net/lists/kvm-commits/msg00137.html had made
>> "HLT exiting" configurable, while
>> http://lkml.iu.edu/hypermail/linux/kernel/1202.0/03309.html removed it for being
>> redundant with the CFS hardlimit.
>>
>> I focus on the VM's performance. According to the results, I think running HLT in
>> guest mode is better than idle=poll with HLT-exiting in *certain* scenarios.
> 
> Yes, and using MWAIT for idle is even better than HLT (you can be woken
> up without IPI) -- any reason to prefer HLT?


Yes, agree that MWAIT is better in that respect.

In my humble opinion:

1) As "Intel SDM vol3 ch25.3" says, MWAIT operates normally (which I think includes
entering deeper sleep states) under certain conditions.
Some deeper sleep states (such as C4E/C6/C7) flush the L1/L2/L3 caches.
This is risky if we don't take other protective measures (such as limiting the
guest's max C-state; fortunately the power subsystem isn't supported by
QEMU, but we should be careful in some special-purpose cases). HLT in
guest mode, by contrast, doesn't put the hardware into such a deep sleep state.

2) According to "Intel SDM vol3 ch26.3.3 & ch27.5.6", I think MONITOR in
guest mode sometimes can't work as well as on the host.
For example, suppose a vcpu executes MONITOR on an address and then MWAIT; if an
external interrupt (suppose this interrupt doesn't cause any virtual event to be
injected) causes a VMEXIT, the monitored address is cleared, so the MWAIT will no
longer be woken up by a store to the monitored address.
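
To make the concern concrete, the usual idle pattern looks roughly like the
sketch below (the flag name and the wrappers are made up); the window I'm
worried about is marked in the comment:

#include <stdint.h>

static volatile uint32_t wakeup_flag;   /* hypothetical wakeup condition */

static inline void do_monitor(const volatile void *addr)
{
        /* MONITOR: rax = linear address, rcx = extensions, rdx = hints */
        asm volatile("monitor" :: "a"(addr), "c"(0UL), "d"(0UL));
}

static inline void do_mwait(unsigned long hints, unsigned long ext)
{
        /* MWAIT: rax = hints (target C-state), rcx = extensions */
        asm volatile("mwait" :: "a"(hints), "c"(ext));
}

static void idle_wait(void)
{
        while (!wakeup_flag) {
                do_monitor(&wakeup_flag);
                /*
                 * If a VMEXIT happens around here (e.g. an external interrupt
                 * that injects nothing into the guest), the armed monitor is
                 * not preserved across it, so a later store to wakeup_flag may
                 * not wake the MWAIT below -- only an interrupt would.
                 */
                if (wakeup_flag)
                        break;
                do_mwait(0, 0);
                /* Re-check the condition: MWAIT may return for other reasons. */
        }
}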

But I'd be glad to do some tests if time permits, thanks :)

Radim, how about making HLT-exiting configurable again upstream? If you
like the idea, there is a problem that should be resolved first: async_pf conflicts
with "HLT-exiting = 0" in certain situations.

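As a rough sketch of what I have in mind (not a real patch; a `hlt_exiting`
module parameter is only one possible interface, and the async_pf problem
above is not handled here), vmx_exec_control() in arch/x86/kvm/vmx.c could
simply drop the bit when asked to:

/* Hypothetical module parameter: hlt_exiting=0 lets HLT run in guest mode. */
static bool __read_mostly hlt_exiting = true;
module_param(hlt_exiting, bool, S_IRUGO);

static u32 vmx_exec_control(struct vcpu_vmx *vmx)
{
        u32 exec_control = vmcs_config.cpu_based_exec_ctrl;

        /* ... existing adjustments to exec_control ... */

        if (!hlt_exiting)
                exec_control &= ~CPU_BASED_HLT_EXITING;

        return exec_control;
}

A per-VM capability instead of a global module parameter would probably be
nicer, but the idea is the same.
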
> Thanks.
> 


-- 
Regards,
Longpeng(Mike)



