Re: [PATCHv5 4/4] KVM: emulator: optimize "rep ins" handling.

Avi Kivity <avi@xxxxxxxxxx> · Mon, 06 Aug 2012 15:08:28 +0300

On 08/06/2012 02:49 PM, Gleb Natapov wrote:
> On Mon, Aug 06, 2012 at 02:39:52PM +0300, Avi Kivity wrote:
>> On 08/06/2012 02:05 PM, Gleb Natapov wrote:
>> > On Mon, Aug 06, 2012 at 12:28:05PM +0300, Avi Kivity wrote:
>> >> On 08/06/2012 11:58 AM, Gleb Natapov wrote:
>> >> > On Mon, Aug 06, 2012 at 11:50:20AM +0300, Avi Kivity wrote:
>> >> >> On 07/30/2012 05:38 PM, Gleb Natapov wrote:
>> >> >> > Optimize "rep ins" by allowing emulator to write back more than one
>> >> >> > datum at a time. Introduce new operand type OP_MEM_STR which tells
>> >> >> > writeback() that dst contains pointer to an array that should be written
>> >> >> > back as opposed to just one data element.
>> >> >> > 
>> >> >> >  	}
>> >> >> >  
>> >> >> > -	memcpy(dest, rc->data + rc->pos, size);
>> >> >> > -	rc->pos += size;
>> >> >> > +	if (ctxt->rep_prefix && !(ctxt->eflags & EFLG_DF)) {
>> >> >> > +		ctxt->dst.data = rc->data + rc->pos;
>> >> >> > +		ctxt->dst.type = OP_MEM_STR;
>> >> >> > +		ctxt->dst.count = (rc->end - rc->pos) / size;
>> >> >> > +		rc->pos = rc->end;
>> >> >> 
>> >> >> Should take into account the segment limit.
>> >> >> 
>> >> > It does. During write back. pio_in_emulated() should linearize() address
>> >> > before calculating page boundary, but this is a (minor) bug unrelated to the patch
>> >> > series.
>> >> 
>> >> I see, yes, this problem preexists.
>> >> 
>> >> However, in normal conditions, non-repeating instructions will not reach
>> >> the emulator at all since they will fault in the guest (or in the shadow
>> >> mmu, which will reflect the fault to the guest).  Here, the first
>> >> iteration may fit in the segment but the second will not, so this will fail.
>> >> 
>> > Correct. And this can happen with or without the patch series.
>> 
>> No, it can't.  Ordinarily ins will trap inside the guest.
>> 
> We do not go to a guest for each iteration. In fact we will not go to a
> guest for exactly "count" iterations.

Ok.

If we linearize and translate pre-execution we can keep track of the
remaining space available in the segment/page and do this correctly.
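
Roughly what I mean, as a standalone sketch rather than emulator code.
max_batch(), seg_remaining and TOY_PAGE_SIZE are made-up names for
illustration; the real code would get the linear address and the space
left in the segment from linearize() before the instruction executes:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u /* assumption: 4K pages */

/*
 * Hypothetical helper (not part of the patch): given the linear address
 * of the next element, the number of bytes left in the segment from the
 * current offset, the element size and the number of elements we would
 * like to write back, return how many elements fit before we hit either
 * the end of the page or the segment limit.
 */
static size_t max_batch(uint64_t la, uint64_t seg_remaining,
                        size_t size, size_t want)
{
    /* Bytes from la up to the end of its page. */
    uint64_t page_remaining = TOY_PAGE_SIZE - (la & (TOY_PAGE_SIZE - 1));
    /* The tighter of the two bounds wins. */
    uint64_t room = page_remaining < seg_remaining ? page_remaining
                                                   : seg_remaining;
    size_t fit = (size_t)(room / size);

    return fit < want ? fit : want;
}
```

The count stored in dst.count would then be clamped to this value, so the
writeback never runs past the point where the next iteration would have
faulted.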

>> >> > This brings us back to the question what alignment check is doing in
>> >> > linearize :)
>> >> 
>> >> It's checking alignment...
>> >> 
>> > It either checks it in the wrong place or we need to mark all instructions
>> > that do not care about alignment, so the patch is not "Eww" :)
>> 
>> If not there, where?
>> 
> During execution if instruction requires alignment? 

Too many (all sse) instructions require alignment.

> Why don't you like marking
> instruction as Unaligned?

Because it's a workaround for a side effect of the implementation.  At a
minimum it needs a comment.

>> >> 
>> > Execution state likely. String instructions work on segmented addresses
>> > for instance (address increment/decrement). Maybe there are others.
>> 
>> Practically everything works on segmented addresses.
>> 
> Hmm, true. We can calculate the linear address whenever it is needed and
> cache it. If the address changes, the cache is invalidated.

The correct thing is to check before, like the processor does.  For
example linearize also checks write permissions, so for RMW it needs to
check writes before performing the first read.

Also cmovcc performs the checks even though it might not perform the access.
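
To illustrate the ordering for RMW, a toy model. Nothing here is the
emulator's real API: struct toy_seg, toy_check() and toy_rmw_inc() are
invented for the example, and the checks are reduced to limit plus write
permission:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy segment descriptor; field names are illustrative only. */
struct toy_seg {
    uint32_t limit;    /* highest valid offset */
    bool     writable;
};

enum toy_fault { TOY_OK = 0, TOY_FAULT = 1 };

/* Limit and permission check for an access of 'size' bytes at 'off'. */
static enum toy_fault toy_check(const struct toy_seg *s, uint32_t off,
                                uint32_t size, bool write)
{
    if (size == 0 || off > s->limit || size - 1 > s->limit - off)
        return TOY_FAULT;
    if (write && !s->writable)
        return TOY_FAULT;
    return TOY_OK;
}

/*
 * Read-modify-write of one byte. The write check is done up front, so a
 * read-only segment faults before the first read happens. 'reads' counts
 * memory reads so the ordering can be observed.
 */
static enum toy_fault toy_rmw_inc(const struct toy_seg *s, uint8_t *mem,
                                  uint32_t off, int *reads)
{
    if (toy_check(s, off, 1, true) != TOY_OK)
        return TOY_FAULT;      /* fault before any access */

    uint8_t v = mem[off];      /* read */
    (*reads)++;
    mem[off] = (uint8_t)(v + 1); /* write */
    return TOY_OK;
}
```

With a read-only segment toy_rmw_inc() returns TOY_FAULT with no read
performed, which is the behaviour a check at access time would get wrong.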

-- 
error compiling committee.c: too many arguments to function