Hi Andrew, Vlastimil On Fri, Jan 03, 2014 at 04:18:05PM -0800, Linus Torvalds wrote: >On Fri, Jan 3, 2014 at 3:36 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote: >> >> I'm for going with the removal of BUG_ON. The TestSetPageMlocked should provide enough >> race protection. > >Maybe. But dammit, that's subtle, and I don't think you're even right. > >It basically depends on mlock_vma_page() and munlock_vma_page() being >able to run CONCURRENTLY on the same page. In particular, you could >have a mlock_vma_page() set the bit on one CPU, and munlock_vma_page() >immediately clearing it on another, and then the rest of those >functions could run with a totally arbitrary interleaving when working >with the exact same page. > >They both do basically > > if (!isolate_lru_page(page)) > putback_lru_page(page); > >but one or the other would randomly win the race (it's internally >protected by the lru lock), and *if* the munlock_vma_page() wins it, >it would also do > > try_to_munlock(page); > >but if mlock_vma_page() wins it, that wouldn't happen. That looks >entirely broken - you end up with the PageMlocked bit clear, but >try_to_munlock() was never called on that page, because >mlock_vma_page() got to the page isolation before the "subsequent" >munlock_vma_page(). > >And this is very much what the page lock serialization would prevent. >So no, the PageMlocked in *no* way gives serialization. It's an atomic >bit op, yes, but that only "serializes" in one direction, not when you >can have a mix of bit setting and clearing. > >So quite frankly, I think you're wrong. The BUG_ON() is correct, or at >least enforces some kind of ordering. And try_to_unmap_cluster() is >just broken in calling that without the page being locked. That's my >opinion. There may be some *other* reason why it all happens to work, >but no, "TestSetPageMlocked should provide enough race protection" is >simply not true, and even if it were, it's way too subtle and odd to >be a good rule. > >So I really object to just removing the BUG_ON(). Not with a *lot* >more explanation as to why these kinds of issues wouldn't matter. Any better idea to improve my patch which in the "lock the page around calling it" direction. ;-) http://marc.info/?l=linux-mm&m=138733994417230&w=2 Regards, Wanpeng Li > > Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>