On Thu, Nov 12, 2009 at 03:27:48PM -0500, Chris Mason wrote:
> On Thu, Nov 12, 2009 at 07:30:06PM +0000, Mel Gorman wrote:
> > Sorry for the long delay in posting another version. Testing is extremely
> > time-consuming and I wasn't getting to work on this as much as I'd have liked.
> >
> > Changelog since V2
> >   o Dropped the kswapd-quickly-notice-high-order patch. In more detailed
> >     testing, it made latencies even worse as kswapd slept more on high-order
> >     congestion causing order-0 direct reclaims.
> >   o Added changes to how congestion_wait() works
> >   o Added a number of new patches altering the behaviour of reclaim
> >
> > Since 2.6.31-rc1, there have been an increasing number of GFP_ATOMIC
> > failures. A significant number of these have been high-order GFP_ATOMIC
> > failures and while they are generally brushed away, there has been a large
> > increase in them recently and there are a number of possible areas the
> > problem could be in - core vm, page writeback and a specific driver. The
> > bugs affected by this that I am aware of are;
>
> Thanks for all the time you've spent on this one. Let me start with
> some more questions about the workload ;)
>
> So the workload is gitk reading a git repo and a program reading data
> over the network. Which part of the workload writes to disk?

Sorry for the self reply, I started digging through your data (man,
that's a lot of data ;). I took another tour through dm-crypt and
things make more sense now.

dm-crypt has two different single threaded workqueues for each dm-crypt
device. The first one is meant to deal with the actual encryption and
decryption, and the second one is meant to do the IO.

So the path for a write looks something like this:

filesystem -> crypt thread -> encrypt the data -> io thread -> disk

And the path for a read looks something like this:

filesystem -> io thread -> disk -> crypt thread -> decrypt data -> FS

One thread does encryption and one thread does IO, and these threads are
shared for reads and writes. The end result is that all of the sync
reads get stuck behind any async write congestion and all of the async
writes get stuck behind any sync read congestion.

It's almost like you need to check for both sync and async congestion
before you have any hopes of a new IO making progress.

The confusing part is that dm hasn't gotten any worse in this regard
since 2.6.30, but the workload here is generating more sync reads
(hopefully from gitk and swapin) than async writes (from the low
bandwidth rsync).

So in general if you were to change mm/*.c to wait for sync congestion
instead of async, things should appear better.

The punch line is that the btrfs guy thinks we can solve all of this
with just one more thread. If we change dm-crypt to have a thread
dedicated to sync IO and a thread dedicated to async IO, the system
should smooth out.

-chris
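
P.S. A toy model of the queueing argument above, in case it helps. This
is not dm-crypt code, just a sketch in Python under loud assumptions:
every request arrives at t=0, each queue is drained in FIFO order by one
worker, workers do not slow each other down, and the names
(completion_times, split_by_kind, the "cost" numbers) are made up for
illustration.

```python
def completion_times(requests, split_by_kind):
    """Return {kind: [finish times]} for requests that all arrive at t=0.

    requests is a list of (kind, cost) tuples in submission order.
    Each queue is drained in FIFO order by its own single worker, and
    workers are assumed not to slow each other down.
    """
    queues = {}
    for kind, cost in requests:
        # Either one shared queue, or one queue per kind of IO.
        key = kind if split_by_kind else "shared"
        queues.setdefault(key, []).append((kind, cost))

    finished = {}
    for pending in queues.values():
        clock = 0  # each worker starts at t=0
        for kind, cost in pending:
            clock += cost
            finished.setdefault(kind, []).append(clock)
    return finished


# Hypothetical workload: three slow async writes queued ahead of one sync read.
workload = [("async_write", 10)] * 3 + [("sync_read", 1)]

shared = completion_times(workload, split_by_kind=False)
split = completion_times(workload, split_by_kind=True)

print("shared worker: sync read done at t =", shared["sync_read"][0])
print("split workers: sync read done at t =", split["sync_read"][0])
```

In this model the sync read finishes at t=31 behind the shared worker
and at t=1 with its own worker, while the writes finish at the same
times either way. On real hardware both workers still feed the same
disk, so the gain would be smaller than the toy numbers suggest.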