On 08/08/2011 03:49 AM, Johannes Weiner wrote:
> On Mon, Jul 25, 2011 at 11:50:50PM +0200, Alexander Graf wrote:
>>
>> Well, alternatively we could simply bail out if the memory is not
>> anonymous, right? Then the pinning on get_user_pages_fast should be
>> enough. Johannes, would there be any downside to this approach?
>
> I don't see any correctness issues. Maybe Andrea does?
>
> While the userspace pages are never freed because of your reference,
> it does not prevent reclaim from writing them to swap and unmapping
> them from the user's page tables.

Being unmapped from the user's page tables isn't a problem, as long as
the mapping, if it is faulted back in before the I/O reference is
released, points at the same physical page. Anything else seems like
it would break using get_user_pages() to implement read() -- you could
be swapping out the wrong data.

I hope that the "there may even be a completely different page there in
some cases (eg. if mmapped pagecache has been invalidated and
subsequently re faulted)" in the __get_user_pages() comment is referring
to the !FOLL_WRITE case (or an explicit mapping change from userspace).

This usage of get_user_pages() is pretty similar to how the guest's
memory is dealt with. When the guest adds a TLB entry,
get_user_pages_fast() gets called. The page also doesn't get marked
dirty until just before release, and userspace may access the memory
before then (for debugging the guest, emulated DMA, etc). If that's not
a problem, it shouldn't be a problem here either.

-Scott