On Tue, 2012-09-11 at 12:05 +0100, Mel Gorman wrote: > > One *massive* change here that is not called out in the changelog is that > the reclaim path now holds the page lock on multiple pages at the same > time waiting for them to be batch unlocked in __remove_mapping_batch. > This is suspicious for two reasons. > > The first suspicion is that it is expected that there are filesystems > that lock multiple pages in page->index order and page reclaim tries to > lock pages in a random order. You are "ok" because you trylock the pages > but there should be a comment explaining the situation and why you're > ok. > > My *far* greater concern is that the hold time for a locked page is > now potentially much longer. You could lock a bunch of filesystem pages > and then call pageout() on an swapcache page that takes a long time to > write. This potentially causes a filesystem (or flusher threads etc) > to stall on lock_page and that could cause all sorts of latency trouble. > It will be hard to hit this bug and diagnose it but I believe it's > there. > > That second risk *really* must be commented upon and ideally reviewed by > the filesystem people. However, I very strongly suspect that the outcome > of such a review will be a suggestion to unlock the pages and reacquire > the lock in __remove_mapping_batch(). Bear in mind that if you take this > approach that you *must* use trylock when reacquiring the page lock and > handle being unable to lock the page. > Mel, Thanks for your detailed comments and analysis. If I unlock the pages, will flusher threads be the only things that will touch them? Or do I have to worry about potentially other things done to the pages that will make it invalid for me to unmap the pages later and put them on free list? Tim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>