Hi Nick!

On Wed, May 02, 2012 at 01:50:46AM +1000, Nick Piggin wrote:
> KOSAKI-san is correct, I think.
>
> The race is something like this:
>
> DIO-read
> page = get_user_pages()
> fork()
> COW(page)
> touch(page)
> DMA(page)
> page_cache_release(page);

Yes. More generally, this race happens every time the kernel
wrprotects a writable anon pte while get_user_pages holds a pin on
the page. fork can't just abort (like KSM does) when it notices
mapcount < page_count. The only way to avoid this is for the
GUP-pinned page to somehow remain pointed to at all times by the pte
of the process that pinned the page (no matter the cows), and that's
not happening.

> So whether parent or child touches the page, determines who gets the
> actual DMA target, and who gets the copy.

Correct, so far there are two reproducers, triggering two different
kinds of corruption. The corruption may appear in different ways:

1) we could lose the direct-io read in the parent (if the forked
   child does nothing and just quits); that was the basic case in
   dma_thread.c, where a dummy fork was run just to mark the pte
   wrprotected

2) the destination of the direct-io read may also become visible to
   the child if the child has written to the page before the I/O is
   complete, leading to random mm corruption in the child

3) if it's a direct-io write, then the child could write random data
   to disk by accident without noticing, if the DMA hasn't started
   yet and the child got the pinned page mapped in the child pte

We had two working fixes for this and personally I'd prefer to apply
them rather than document the bug. The probability that whoever
writes code that can hit the bug will read the note in the manpage
seems pretty small, especially in the short/mid term. This lkml
thread, as a reminder, may actually have a higher chance of being
noticed than the manpage. Nevertheless documenting it is better than
nothing if the fixes aren't applied :).
However I'm afraid that after we officially document it the chances
of fixing it become zero.

> 2 threads are not required, but it makes the race easier to code and a
> larger window, I suspect.
>
> It can also be hit with a single thread, using AIO.

Yes, it requires running fork in the same process that pinned a page
with GUP, and then writing to a buffer in the same page that is under
the GUP pin before the GUP pin is released. It's not just direct-io,
and not just direct-io read (see point 3).
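
To make the ordering in the quoted race easier to follow, here is a
toy model of it in Python. It is only a sketch and not kernel code:
the Page and Process classes, the run() helper and the "io"/"other"
slots are all hypothetical, and it assumes a simplified COW rule where
a write fault copies the page when mapcount > 1 and reuses it
otherwise, ignoring the GUP pin. It shows that whoever touches the
page first after fork gets the copy, and the other process is left
mapping the actual DMA target.

```python
class Page:
    def __init__(self, data):
        self.data = dict(data)  # toy page contents, keyed by slot name
        self.mapcount = 0       # number of ptes pointing at this page
        self.pins = 0           # references held via get_user_pages


class Process:
    def __init__(self, page, writable=True):
        self.page = page          # the page this process's pte points at
        self.writable = writable  # False means the pte is wrprotected
        page.mapcount += 1


def get_user_pages(proc):
    # Pin the page currently mapped by proc and hand it to the "device".
    proc.page.pins += 1
    return proc.page


def fork(parent):
    # Both ptes end up wrprotected and share the same page.
    parent.writable = False
    return Process(parent.page, writable=False)


def touch(proc, slot, value):
    # Write fault: copy if the page is shared (assumed rule), else reuse.
    # Note that the pin count is never consulted here.
    if not proc.writable:
        old = proc.page
        if old.mapcount > 1:
            new = Page(old.data)
            new.mapcount = 1
            old.mapcount -= 1
            proc.page = new
        proc.writable = True
    proc.page.data[slot] = value


def dma(page, value):
    # The device writes to the pinned page, whatever the ptes point at.
    page.data["io"] = value


def page_cache_release(page):
    page.pins -= 1


def run(first_toucher):
    parent = Process(Page({"io": "stale", "other": "init"}))
    pinned = get_user_pages(parent)    # DIO-read pins the buffer page
    child = fork(parent)               # wrprotects the pinned page
    toucher = parent if first_toucher == "parent" else child
    touch(toucher, "other", "touched")  # COW: the toucher gets the copy
    dma(pinned, "dma")                  # I/O completes into the old page
    page_cache_release(pinned)
    return parent, child, pinned


if __name__ == "__main__":
    for who in ("parent", "child"):
        parent, child, pinned = run(who)
        print(who, "touched first:",
              "parent sees", parent.page.data["io"] + ",",
              "child sees", child.page.data["io"])
```

With the parent touching first, the parent is left with a copy that
never receives the I/O (case 1 above) while the child maps the page
the device wrote to (case 2). With the child touching first, the
parent keeps the DMA target and the read is not lost.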