On 05/21/2018 07:38 AM, Matthew Wilcox wrote: > On Fri, May 18, 2018 at 08:51:38PM -0700, Dan Williams wrote: -------------8<------------------------------------------------ > > Suggestion 1: > > in get_user_pages_fast(), mark the page as dirty, but don't tag the radix > tree entry as dirty. Then vmscan() won't find it when it's looking to > write out dirty pages. Only mark it as dirty in the radix tree once we > call set_page_dirty_lock(). > > Suggestion 2: > > in get_user_pages_fast(), replace the page in the radix tree with a special > entry that means "page under io". In set_page_dirty_lock(), replace the > "page under io" entry with the struct page pointer. This second one feels a simpler to me. If no one sees huge problems with this, I can put this together and try it out, because I have a few nicely reproducible bugs that I can test this on. But with either approach, a quick question first: will this do the right thing for the other two use cases below? a) ftruncate b) deleting the inode and dropping all references to it (only the get_user_pages reference remains) ...or is some other way to sneak in and try_to_free_buffers() on a page in this state? Also, just to be sure I'm on the same page, is it accurate to claim that we would then have the following updated guidelines for device drivers and user space? 1. You can safely DMA to file-backed memory that you've pinned via get_user_pages (with the usual caveats about getting the pages you think you're getting), if you are careful to avoid truncating or deleting the file out from under get_user_pages. In other words, this pattern is supported: get_user_pages (on file-backed memory from a persistent storage filesystem) ...do some DMA set_page_dirty_lock put_page 2. Furthermore, even if you are less careful, you still won't crash the kernel, The worst that could happen is to corrupt your data, due to interrupting the writeback. The possibility of data corruption is bad, but it's also arguably both self-inflicted and avoidable. Anyway, even so, it's an improvement: the bugs I'm seeing would definitely get fixed with this. > > Both of these suggestions have trouble with simultaneous sub-page IOs to the > same page. Do we care? I suspect we might as pages get larger (see also: > supporting THP pages in the page cache). > I don't *think* we care. At least, no examples occur to me where this would cause a problem. thanks, -- John Hubbard NVIDIA