Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 19.09.2011, at 20:46, Scott Wood wrote:

> On 09/19/2011 04:33 AM, Alexander Graf wrote:
>> 
>> On 15.09.2011, at 23:36, Scott Wood wrote:
>> 
>>> On 09/05/2011 05:30 PM, Alexander Graf wrote:
>>>> 
>>>> On 27.08.2011, at 01:31, Scott Wood wrote:
>>>> 
>>>>> +#ifdef CONFIG_E500
>>>>> +	/*
>>>>> +	 * Skip the overhead of HID0 accesses that KVM ignores --
>>>>> +	 * just write MSR[WE].
>>>>> +	 *
>>>>> +	 * We don't need _TLF_NAPPING, because under KVM we know
>>>>> +	 * it will take effect right away.
>>>>> +	 */
>>>>> +	if (ppc_md.power_save == e500_idle)
>>>>> +		ppc_md.power_save = kvm_msrwe_idle;
>>>> 
>>>> Why the if() here?
>>> 
>>> To avoid replacing some other power_save() implementation.
>>> kvm_msrwe_idle() is a paravirt-optimized version of e500_idle().
>>> 
>>> However, now that e500_idle has an ifdef for e500mc, we'll need that
>>> ifdef here as well.  e500mc doesn't use MSR[WE] (and if it did, we
>>> couldn't trap on it).  For e500mc we'll want to make an hcall for idle
>>> (ePAPR EV_IDLE).
>> 
>> Since we're PV'ing here either way, can't we simply define a generic power_save implementation that works across different CPU types? We could for example always use EV_IDLE.
> 
> I suppose so, though I'd be more comfortable knowing what it is we're
> substituting for.  E.g. maybe some no-op was put in power_save() because
> it was determined that the wake latency was too high relative to the
> workload.  Not something you'd usually expect with virtualization, but
> you never know.  There might be one guest that is meant to be
> prioritized and will be doing a lot of I/O, and the desire is to bunch
> up the time when other things run.

But at that point the guest needs to be modified too, no? Or is there a command line parameter that specifies which idle function to use? If so, we maybe want to read that one out to see if it's set manually and only change the default behavior?

> OTOH, detecting this by what function was put into power_save won't help
> if for similar reasons someone wants to run an actual wait instruction
> rather than trap to KVM.  So if such a case does come up, it should be
> via some explicit configuration that tells KVM to leave it alone.
> 
> In practice, with the current code, I think e500_idle is the only thing
> that will show up in power_save if CONFIG_E500 is defined.

That's what I assumed, yes. If we need something different, we better make it solid on all edges :)


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM Development]     [KVM ARM]     [KVM ia64]     [Linux Virtualization]     [Linux USB Devel]     [Linux Video]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux