On 08/12/2011 07:37 PM, Tomas Vondra wrote:
> I've run nearly 200 of these, and in about 10 cases I got something
> that looks like this:
>
> http://www.fuzzy.cz/tmp/pgbench/tps.png
> http://www.fuzzy.cz/tmp/pgbench/latency.png
>
> i.e. it runs just fine for about 3:40 and then something goes wrong.
> The bench should take 5:00 minutes, but it somehow locks, does nothing
> for about 2 minutes and then all the clients end at the same time. So
> instead of 5 minutes the run actually takes about 6:40.

You need to run tests like these for 10 minutes to see the full cycle of things; then you'll likely see them on most runs, instead of only 5%. It's probably the case that some of your tests are finishing before the first checkpoint does, which is why you don't see the bad stuff every time.

The long pauses are most likely every client blocking once the checkpoint sync runs. When those fsync calls go out, Linux will freeze for quite a while there on ext3. In this example, the drop in TPS/rise in latency at around 50:30 is either the beginning of a checkpoint or the dirty_background_ratio threshold in Linux being exceeded; they tend to happen around the same time. It executes the write phase for a bit, then gets into the sync phase around 51:40.

You can find a couple of examples just like this on my giant test set around what was committed as the fsync compaction feature in 9.1, all at http://www.2ndquadrant.us/pgbench-results/index.htm

The one most similar to your case is http://www.2ndquadrant.us/pgbench-results/481/index.html

Had that test only run for 5 minutes, it would have looked just like yours, ending after the long pause that's in the middle on my run. The freeze was over 3 minutes long in that example. (My server has a fairly fast disk subsystem, probably faster than what you're testing, but it also has 8GB of RAM that it can dirty to more than make up for it.)

In my tests, I switched from ext3 to XFS to get better behavior. You got the same sort of benefit from ext4.

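If you want to check how often these freezes show up without eyeballing every graph, something like the following rough sketch might help. It counts completed transactions per second from a per-transaction log and reports stretches where nothing finished at all. The function name find_stalls and the min_gap_s threshold are made up for illustration, and the field layout is an assumption: it expects whitespace-separated lines with the epoch second in the fifth column, as in pgbench -l output of this era (client_id transaction_no latency file_no time_epoch time_us). The sample lines are synthetic, not taken from any real run.

```python
from collections import Counter


def find_stalls(log_lines, min_gap_s=5):
    """Return (first_idle_second, last_idle_second, idle_seconds) tuples.

    Assumed line layout (an assumption, check it against your pgbench version):
        client_id transaction_no latency file_no time_epoch time_us
    """
    per_second = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        per_second[int(fields[4])] += 1  # bucket by completion second

    seconds = sorted(per_second)
    stalls = []
    for prev, cur in zip(seconds, seconds[1:]):
        idle = cur - prev - 1  # whole seconds with zero completed transactions
        if idle >= min_gap_s:
            stalls.append((prev + 1, cur - 1, idle))
    return stalls


# Synthetic sample: two busy seconds, then nothing completes for 124 seconds.
sample = [
    "0 1 2300 0 1313170800 120000",
    "1 1 2500 0 1313170800 480000",
    "0 2 2100 0 1313170801 010000",
    "1 2 125000000 0 1313170926 500000",
]
print(find_stalls(sample))
```

Run against the logs from all 200 tests, a non-empty result would flag the runs that hit the checkpoint sync pause, which should line up with the ones that overran the 5 minutes.
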
ext3 just doesn't handle its write cache filling and then having fsync calls execute very well. I've given up on that as an unsolvable problem; improving behavior on XFS and ext4 are the only problems worth worrying about now to me. And I keep seeing too many data corruption issues on ext4 to recommend anyone use it yet for PostgreSQL; that's why I focused on XFS. ext4 still needs at least a few more months before all the bug fixes it's gotten in later kernels are backported to the 2.6.32 versions deployed in RHEL6 and Debian Squeeze, the newest Linux distributions my customers care about right now.

On RHEL6 for example, go read http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.1_Technical_Notes/kernel.html , specifically BZ#635199, and you tell me if that sounds like it's considered stable code yet or not. "The block layer will be updated in future kernels to provide this more efficient mechanism of ensuring ordering...these future block layer improvements will change some kernel interfaces..." Yikes, that does not inspire confidence to me.

-- 
Greg Smith   2ndQuadrant US   greg@xxxxxxxxxxxxxxx   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us