On 18.07.2011, at 20:08, Scott Wood wrote:

> On Mon, 18 Jul 2011 18:33:58 +0200
> Alexander Graf <agraf@xxxxxxx> wrote:
>
>>
>> On 18.07.2011, at 18:18, Scott Wood wrote:
>>
>>> They're pinned by get_user_pages_fast(). We (potentially) write to them, so
>>> we should mark them dirty, because they are dirty. It's up to the rest
>>> of Linux what to do with that. Will being pinned stop updates from being
>>> written out if it is file-backed? And eventually the vm will be destroyed
>>> (or the tlb reconfigured) and the pages will be unpinned.
>>
>> Hrm. How much overhead do we add to the exit-to-userspace path with this?
>
> Not sure -- probably not too much for anonymous memory, compared to the
> rest of the cost of a heavyweight exit. On e500 the tlb array is 4 pages,
> and each set_page_dirty_lock() will just do a few bit operations.

Hm, ok.

>> I completely agree that we should mark them dirty when closing, but I'm
>> not fully convinced a "we dirty them so we should declare them dirty at
>> random times" pays off against possible substantial slowdowns due to the
>> marking. Keep in mind that this is the MMIO case, which isn't _that_ seldom.
>
> If we can convince ourselves nothing bad can happen, fine. I did it here
> because this is the point at which the API says the contents of the memory
> are well-defined. If it is file-backed, and userspace does a sync on a
> heavyweight exit, shouldn't the right thing get written to disk? Could
> any other weird things happen? I'm not familiar enough with that part of
> the kernel to say right away that it's safe.

I'm not either -- this is pretty subtle ground. CC'ing Andrea and Johannes.

Guys, would you please take a look at the patch and tell us whether what's
being done here is safe and a good idea? We're talking about the following
patch:

  http://www.spinics.net/lists/kvm-ppc/msg02961.html

and specifically about:

> +void kvmppc_core_heavy_exit(struct kvm_vcpu *vcpu)
> +{
> +	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
> +	int i;
> +
> +	/*
> +	 * We may have modified the guest TLB, so mark it dirty.
> +	 * We only do it on an actual return to userspace, to avoid
> +	 * adding more overhead to getting scheduled out -- and avoid
> +	 * any locking issues with getting preempted in the middle of
> +	 * KVM_CONFIG_TLB, etc.
> +	 */
> +
> +	for (i = 0; i < vcpu_e500->num_shared_tlb_pages; i++)
> +		set_page_dirty_lock(vcpu_e500->shared_tlb_pages[i]);
> +}

The background is that we want to have a few pages shared between kernel
and user space to keep guest TLB information in. It's too much data to
shove around every time we need it in user space.

> If we need to start making assumptions about what userspace is going to do
> with this memory in order for it to be safe, then the restrictions should
> be written into the API, and we should be sure that the performance gain is
> worth it.

Yes, I agree. I just have the feeling that what the dirty marking is trying
to achieve is already guaranteed implicitly, but I also feel better asking
some mm gurus :).


Alex
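
[For anyone following the thread from the archives: below is a minimal,
hypothetical sketch of the pin/dirty/unpin lifecycle being debated above.
The shared_buf struct and the helper names are made up for illustration;
only get_user_pages_fast(), set_page_dirty_lock() and put_page() are the
real kernel APIs under discussion, with the signatures they had at the
time.]

	/* Illustrative sketch only -- not the actual e500 code. */
	#include <linux/mm.h>
	#include <linux/errno.h>

	struct shared_buf {		/* hypothetical container */
		struct page **pages;
		int npages;
	};

	/* Pin the userspace pages once, e.g. at KVM_CONFIG_TLB time. */
	static int shared_buf_pin(struct shared_buf *buf, unsigned long uaddr)
	{
		int ret;

		ret = get_user_pages_fast(uaddr, buf->npages,
					  1 /* write */, buf->pages);
		if (ret < 0)
			return ret;
		if (ret < buf->npages) {
			/* Partial pin: release what we did get. */
			while (ret--)
				put_page(buf->pages[ret]);
			return -EFAULT;
		}
		return 0;
	}

	/*
	 * On each heavyweight exit the kernel may have written to the
	 * pages, so tell the mm about it -- this loop is the
	 * set_page_dirty_lock() cost/safety question in the thread.
	 */
	static void shared_buf_mark_dirty(struct shared_buf *buf)
	{
		int i;

		for (i = 0; i < buf->npages; i++)
			set_page_dirty_lock(buf->pages[i]);
	}

	/* Unpin when the vcpu is destroyed or the TLB is reconfigured. */
	static void shared_buf_unpin(struct shared_buf *buf)
	{
		int i;

		for (i = 0; i < buf->npages; i++) {
			set_page_dirty_lock(buf->pages[i]);
			put_page(buf->pages[i]);
		}
	}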