On 18.07.2011, at 20:08, Scott Wood wrote:

> On Mon, 18 Jul 2011 18:33:58 +0200
> Alexander Graf <agraf@xxxxxxx> wrote:
>
>>
>> On 18.07.2011, at 18:18, Scott Wood wrote:
>>
>>> They're pinned by get_user_pages_fast(). We (potentially) write to them, so
>>> we should mark them dirty, because they are dirty. It's up to the rest
>>> of Linux what to do with that. Will being pinned stop updates from being
>>> written out if it is file-backed? And eventually the vm will be destroyed
>>> (or the tlb reconfigured) and the pages will be unpinned.
>>
>> Hrm. How much overhead do we add to the exit-to-userspace path with this?
>
> Not sure -- probably not too much for anonymous memory, compared to the
> rest of the cost of a heavyweight exit. On e500 the tlb array is 4 pages,
> and each set_page_dirty_lock() will just do a few bit operations.

Hm, ok.

>> I completely agree that we should mark them dirty when closing, but I'm
>> not fully convinced a "we dirty them so we should declare them dirty at
>> random times" pays off against possible substantial slowdowns due to the
>> marking. Keep in mind that this is the MMIO case, which isn't _that_ seldom.
>
> If we can convince ourselves nothing bad can happen, fine. I did it here
> because this is the point at which the API says the contents of the memory
> are well-defined. If it is file-backed, and userspace does a sync on a
> heavyweight exit, shouldn't the right thing get written to disk? Could
> any other weird things happen? I'm not familiar enough with that part of
> the kernel to say right away that it's safe.

I'm not either -- this is pretty subtle ground. CC'ing Andrea and Johannes.

Guys, would you please take a look at the patch and tell us whether what's
being done here is safe and a good idea? We're talking about the following
patch:

  http://www.spinics.net/lists/kvm-ppc/msg02961.html

and specifically about:

> +void kvmppc_core_heavy_exit(struct kvm_vcpu *vcpu)
> +{
> +	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
> +	int i;
> +
> +	/*
> +	 * We may have modified the guest TLB, so mark it dirty.
> +	 * We only do it on an actual return to userspace, to avoid
> +	 * adding more overhead to getting scheduled out -- and avoid
> +	 * any locking issues with getting preempted in the middle of
> +	 * KVM_CONFIG_TLB, etc.
> +	 */
> +
> +	for (i = 0; i < vcpu_e500->num_shared_tlb_pages; i++)
> +		set_page_dirty_lock(vcpu_e500->shared_tlb_pages[i]);
> +}

The background is that we want to have a few pages shared between kernel
and user space to keep guest TLB information in. It's too much data to
shove around every time we need it in user space.

> If we need to start making assumptions about what userspace is going to do
> with this memory in order for it to be safe, then the restrictions should
> be written into the API, and we should be sure that the performance gain is
> worth it.

Yes, I agree. I just have the feeling that what the dirty marking is trying
to achieve is already guaranteed implicitly, but I also feel better asking
some mm gurus :).


Alex
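
[For anyone following the thread from the archives: below is a minimal,
hypothetical sketch of the pin/dirty/unpin lifecycle being debated above.
The shared_buf struct and the helper names are made up for illustration;
only get_user_pages_fast(), set_page_dirty_lock() and put_page() are the
real kernel APIs under discussion, with the signatures they had at the
time.]

	/* Illustrative sketch only -- not the actual e500 code. */
	#include <linux/mm.h>
	#include <linux/errno.h>

	struct shared_buf {		/* hypothetical container */
		struct page **pages;
		int npages;
	};

	/* Pin the userspace pages once, e.g. at KVM_CONFIG_TLB time. */
	static int shared_buf_pin(struct shared_buf *buf, unsigned long uaddr)
	{
		int ret;

		ret = get_user_pages_fast(uaddr, buf->npages,
					  1 /* write */, buf->pages);
		if (ret < 0)
			return ret;
		if (ret < buf->npages) {
			/* Partial pin: release what we did get. */
			while (ret--)
				put_page(buf->pages[ret]);
			return -EFAULT;
		}
		return 0;
	}

	/*
	 * On each heavyweight exit the kernel may have written to the
	 * pages, so tell the mm about it -- this loop is the
	 * set_page_dirty_lock() cost/safety question in the thread.
	 */
	static void shared_buf_mark_dirty(struct shared_buf *buf)
	{
		int i;

		for (i = 0; i < buf->npages; i++)
			set_page_dirty_lock(buf->pages[i]);
	}

	/* Unpin when the vcpu is destroyed or the TLB is reconfigured. */
	static void shared_buf_unpin(struct shared_buf *buf)
	{
		int i;

		for (i = 0; i < buf->npages; i++) {
			set_page_dirty_lock(buf->pages[i]);
			put_page(buf->pages[i]);
		}
	}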