On Tue, Dec 13, 2016 at 02:08:22PM -0800, Dave Hansen wrote:
> On 12/13/2016 01:24 PM, Jerome Glisse wrote:
> >>> From kernel point of view such memory is almost like any other, it
> >>> has a struct page and most of the mm code is none the wiser, nor need
> >>> to be about it. CPU access trigger a migration back to regular CPU
> >>> accessible page.
> >>
> >> That sounds ... complex. Page migration on page cache access inside
> >> the filesystem IO path locking during read()/write() sounds like
> >> a great way to cause deadlocks....
> >
> > There are few restriction on device page, no one can do GUP on them and
> > thus no one can pin them. Hence they can always be migrated back. Yes
> > each fs need modification, most of it (if not all) is isolated in common
> > filemap helpers.
>
> Huh, that's pretty different from the other ZONE_DEVICE uses. For
> those, you *can* do get_user_pages().
>
> I'd be really interested to see the feature set that these pages have
> and how it differs from regular memory and the ZONE_DEVICE memory that
> we have in-kernel today.

Well, I can make a list for the current patchset, where I do not allow
migration of file-backed pages. Roughly, you cannot kmap them and you
cannot GUP them. But GUP has many more implications, like direct I/O
(as source or destination of direct I/O) ...

> BTW, how is this restriction implemented? I would have expected to see
> follow_page_pte() or vm_normal_page() getting modified. I don't see a
> single reference to get_user_pages or "GUP" in any of the latest HMM
> patch set or the changelogs.
>
> As best I can tell, the slow GUP path will get stuck in a loop inside
> follow_page_pte(), while the fast GUP path will allow you to acquire a
> reference to the page. But, maybe I'm reading the code wrong.

It is a side effect of having a special swap pte: follow_page_pte()
returns NULL, which triggers a page fault through handle_mm_fault(),
which in turn triggers migration back to a regular page. Same for the
fast GUP version.
There is never a valid pte for an un-addressable page.

Cheers,
Jérome
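
To illustrate the GUP behavior described above, here is a rough user-space
sketch of the idea. It is only a toy model and not code from the HMM
patchset: every name in it (toy_pte, toy_page, toy_follow_page(),
toy_handle_fault(), toy_gup()) is hypothetical, and it assumes migration can
be modeled as swapping the pte over to a regular page handed in by the
caller. The point it tries to show is that the lookup never returns a device
page, so a reference can only ever be taken on the regular page that exists
after the fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's types or functions. */
enum toy_pte_kind {
	TOY_PTE_NONE,        /* nothing mapped */
	TOY_PTE_PRESENT,     /* valid pte pointing at a regular page */
	TOY_PTE_DEVICE_SWAP  /* special swap entry for an un-addressable page */
};

struct toy_page {
	bool on_device;      /* true while the data lives in device memory */
	int refcount;        /* references taken through toy_gup() */
};

struct toy_pte {
	enum toy_pte_kind kind;
	struct toy_page *page;
};

/* Stand-in for the pte lookup: only a present pte yields a page. */
static struct toy_page *toy_follow_page(const struct toy_pte *pte)
{
	if (pte->kind != TOY_PTE_PRESENT)
		return NULL;
	return pte->page;
}

/*
 * Stand-in for the fault path: a device swap entry is "migrated" by
 * pointing the pte at the regular page supplied by the caller.
 * Returns 0 on success, -1 if there is nothing to fault in.
 */
static int toy_handle_fault(struct toy_pte *pte, struct toy_page *regular)
{
	if (pte->kind != TOY_PTE_DEVICE_SWAP)
		return -1;
	regular->on_device = false;
	pte->page = regular;
	pte->kind = TOY_PTE_PRESENT;
	return 0;
}

/*
 * Stand-in for the slow GUP loop: retry the lookup after each fault and
 * take a reference only once a regular page is mapped.
 */
static struct toy_page *toy_gup(struct toy_pte *pte, struct toy_page *regular)
{
	struct toy_page *page;

	assert(pte != NULL && regular != NULL);
	while ((page = toy_follow_page(pte)) == NULL) {
		if (toy_handle_fault(pte, regular) != 0)
			return NULL;
	}
	page->refcount++;
	return page;
}
```

In this model, calling toy_gup() on a pte of kind TOY_PTE_DEVICE_SWAP hands
back the regular page with its refcount raised, while the device page's
refcount is never touched, which is the sense in which a device page cannot
be pinned.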