On 19.07.2011, at 10:36, Johannes Weiner wrote:

> On Mon, Jul 18, 2011 at 11:44:02PM +0200, Alexander Graf wrote:
>>
>> On 18.07.2011, at 20:08, Scott Wood wrote:
>>
>>> On Mon, 18 Jul 2011 18:33:58 +0200
>>> Alexander Graf <agraf@xxxxxxx> wrote:
>>>
>>>>
>>>> On 18.07.2011, at 18:18, Scott Wood wrote:
>>>>
>>>>> They're pinned by get_user_pages_fast(). We (potentially) write to
>>>>> them, so we should mark them dirty, because they are dirty. It's up
>>>>> to the rest of Linux what to do with that. Will being pinned stop
>>>>> updates from being written out if it is file-backed? And eventually
>>>>> the vm will be destroyed (or the tlb reconfigured) and the pages
>>>>> will be unpinned.
>>>>
>>>> Hrm. How much overhead do we add to the exit-to-userspace path with this?
>>>
>>> Not sure -- probably not too much for anonymous memory, compared to the
>>> rest of the cost of a heavyweight exit. On e500 the tlb array is 4 pages,
>>> and each set_page_dirty_lock() will just do a few bit operations.
>>
>> Hm, ok.
>>
>>>
>>>> I completely agree that we should mark them dirty when closing, but I'm
>>>> not fully convinced that "we dirty them, so we should declare them dirty
>>>> at random times" pays off against possible substantial slowdowns due to
>>>> the marking. Keep in mind that this is the MMIO case, which isn't _that_
>>>> rare.
>>>
>>> If we can convince ourselves nothing bad can happen, fine. I did it here
>>> because this is the point at which the API says the contents of the memory
>>> are well-defined. If it is file-backed, and userspace does a sync on a
>>> heavyweight exit, shouldn't the right thing get written to disk? Could
>>> any other weird things happen? I'm not familiar enough with that part of
>>> the kernel to say right away that it's safe.
>>
>> Neither am I; this is pretty subtle ground. CC'ing Andrea and
>> Johannes. Guys, would you please take a look at that patch and tell
>> us whether what's being done here is safe and a good idea?
>>
>> We're talking about the following patch:
>> http://www.spinics.net/lists/kvm-ppc/msg02961.html
>> and specifically about:
>>
>>> +void kvmppc_core_heavy_exit(struct kvm_vcpu *vcpu)
>>> +{
>>> +	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
>>> +	int i;
>>> +
>>> +	/*
>>> +	 * We may have modified the guest TLB, so mark it dirty.
>>> +	 * We only do it on an actual return to userspace, to avoid
>>> +	 * adding more overhead to getting scheduled out -- and avoid
>>> +	 * any locking issues with getting preempted in the middle of
>>> +	 * KVM_CONFIG_TLB, etc.
>>> +	 */
>>> +
>>> +	for (i = 0; i < vcpu_e500->num_shared_tlb_pages; i++)
>>> +		set_page_dirty_lock(vcpu_e500->shared_tlb_pages[i]);
>>> }
>>
>> The background is that we want to have a few pages shared between
>> kernel and user space to keep guest TLB information in. It's too
>> much to shove around every time we need it in user space.
>
> Is there a strict requirement that these pages originate from
> userspace? Usually, shared memory between kernel and userspace is
> owned by the driver and kept away from the mm subsystem completely.
>
> You could allocate the memory in the driver when userspace issues an
> ioctl like KVM_ALLOC_TLB_CONFIG and return a file handle that can be
> mmap'd. The array length info is maintained in the vma for the
> kernel; userspace must remember the size of mmap regions anyway.

Hrm. What's the advantage of doing it this way around vs. the other?
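
For reference, a minimal sketch of the pinning side the thread assumes
(not the actual patch; the function name and the uaddr argument are made
up, the fields match the quoted code): at KVM_CONFIG_TLB time the guest
TLB array passed in from userspace would be pinned with
get_user_pages_fast(), so the kernel can write guest TLB state into it
at any time -- which is why the pages get set_page_dirty_lock() on the
way back to userspace.

/* Illustrative sketch only; needs <linux/mm.h> for
 * get_user_pages_fast()/put_page(). */
static int kvmppc_e500_pin_tlb(struct kvmppc_vcpu_e500 *vcpu_e500,
			       unsigned long uaddr, int num_pages)
{
	int ret;

	/* Pin the userspace pages for writing; they stay pinned until
	 * the vcpu is destroyed or the TLB is reconfigured. */
	ret = get_user_pages_fast(uaddr, num_pages, 1,
				  vcpu_e500->shared_tlb_pages);
	if (ret < 0)
		return ret;

	if (ret != num_pages) {
		/* Partial pin: drop what we got and fail. */
		while (ret--)
			put_page(vcpu_e500->shared_tlb_pages[ret]);
		return -EFAULT;
	}

	vcpu_e500->num_shared_tlb_pages = num_pages;
	return 0;
}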
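
And a rough sketch, under the same caveats, of the driver-owned
alternative Johannes describes: the kernel allocates the TLB pages
itself (e.g. with alloc_page(GFP_KERNEL) in a KVM_ALLOC_TLB_CONFIG-style
ioctl) and hands them to userspace through an mmap op on the returned
file handle. The handler name below is invented for illustration.

static int kvmppc_e500_tlb_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct kvmppc_vcpu_e500 *vcpu_e500 = file->private_data;
	unsigned long npages = vma_pages(vma);
	unsigned long addr = vma->vm_start;
	unsigned long i;
	int ret;

	if (npages > vcpu_e500->num_shared_tlb_pages)
		return -EINVAL;

	/* Map the kernel-allocated pages into the caller's address space. */
	for (i = 0; i < npages; i++, addr += PAGE_SIZE) {
		ret = vm_insert_page(vma, addr, vcpu_e500->shared_tlb_pages[i]);
		if (ret)
			return ret;
	}

	return 0;
}

With that layout the pages are owned by the driver for their whole
lifetime and, as Johannes says, stay away from the mm subsystem, so the
question of marking them dirty on heavyweight exit wouldn't arise.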
Alex