Re: Weird XFS WAL problem

Merlin Moncure <mmoncure@xxxxxxxxx> · Thu, 3 Jun 2010 09:01:01 -0400

On Wed, Jun 2, 2010 at 7:30 PM, Craig James <craig_james@xxxxxxxxxxxxxx> wrote:
> I'm testing/tuning a new midsize server and ran into an inexplicable
> problem.  With a RAID10 drive, when I move the WAL to a separate RAID1
> drive, TPS drops from over 1200 to less than 90!   I've checked everything
> and can't find a reason.
>
> Here are the details.
>
> 8 cores (2x4 Intel Nehalem 2 GHz)
> 12 GB memory
> 12 x 7200 SATA 500 GB disks
> 3WARE 9650SE-12ML RAID controller with bbu
>  2 disks: RAID1  500GB ext4  blocksize=4096
>  8 disks: RAID10 2TB, stripe size 64K, blocksize=4096 (ext4 or xfs - see below)
>  2 disks: hot swap
> Ubuntu 10.04 LTS (Lucid)
>
> With xfs or ext4 on the RAID10 I got decent bonnie++ and pgbench results
> (this one is for xfs):
>
> Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> argon        24064M 70491  99 288158  25 129918  16 65296  97 428210  23 558.9   1
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 23283  81 +++++ +++ 13775  56 20143  74 +++++ +++ 15152  54
> argon,24064M,70491,99,288158,25,129918,16,65296,97,428210,23,558.9,1,16,23283,81,+++++,+++,13775,56,20143,74,+++++,+++,15152,54
>
> pgbench -i -s 100 -U test
> pgbench -c 10 -t 10000 -U test
>    scaling factor: 100
>    query mode: simple
>    number of clients: 10
>    number of transactions per client: 10000
>    number of transactions actually processed: 100000/100000
>    tps = 1046.104635 (including connections establishing)
>    tps = 1046.337276 (excluding connections establishing)
>
> Now the mystery: I moved the pg_xlog directory to a RAID1 array (same 3WARE
> controller, two more SATA 7200 disks).  I ran the same tests and ...
>
>    tps = 82.325446 (including connections establishing)
>    tps = 82.326874 (excluding connections establishing)
>
> I thought I'd made a mistake, like maybe I moved the whole database to the
> RAID1 array, but I checked and double checked.  I even watched the lights
> blink - the WAL was definitely on the RAID1 and the rest of Postgres on the
> RAID10.
>
> So I moved the WAL back to the RAID10 array, and performance jumped right
> back up to the >1200 TPS range.
>
> Next I checked the RAID1 itself:
>
>  dd if=/dev/zero of=./bigfile bs=8192 count=2000000
>
> which yielded 98.8 MB/sec - not bad.  bonnie++ on the RAID1 pair showed good
> performance too:
>
> Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> argon        24064M 68601  99 110057  18 46534   6 59883  90 123053   7 471.3   1
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> argon,24064M,68601,99,110057,18,46534,6,59883,90,123053,7,471.3,1,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++
>
> So ... anyone have any idea at all how TPS drops to below 90 when I move the
> WAL to a separate RAID1 disk?  Does this make any sense at all?  It's
> repeatable. It happens for both ext4 and xfs. It's weird.
>
> You can even watch the disk lights and see it: the RAID10 disks are on
> almost constantly when the WAL is on the RAID10, but when you move the WAL
> over to the RAID1, its lights are dim and flicker a lot, like it's barely
> getting any data, and the RAID10 disks' lights barely go on at all.

*) Is your RAID1 configured with write-back cache on the controller?
*) Have you tried changing wal_sync_method to fdatasync?
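
Both questions come down to how quickly the RAID1 volume completes small synchronous writes, which is what WAL commits need and which a sequential dd or bonnie++ run does not measure. A minimal sketch for checking that directly, assuming Linux and Python 3. The function name sync_write_rate and its parameters are made up for illustration and are not part of PostgreSQL or any existing tool:

```python
import os
import sys
import tempfile
import time


def sync_write_rate(directory, method="fdatasync", writes=200, block=8192):
    """Return the number of synchronous writes per second achieved in `directory`.

    Each iteration appends one block and then forces it to stable storage,
    which roughly mimics what a WAL flush does at commit time.
    """
    if method not in ("fsync", "fdatasync"):
        raise ValueError("method must be 'fsync' or 'fdatasync'")
    sync = getattr(os, method)  # os.fdatasync is not available on every platform
    buf = b"\0" * block
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, buf)
            sync(fd)  # block until the device reports the write as durable
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)  # leave no scratch file behind
    return writes / elapsed


if __name__ == "__main__":
    # Pass the directory to test, e.g. the mount point of the RAID1 volume.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for method in ("fsync", "fdatasync"):
        rate = sync_write_rate(target, method)
        print(f"{method}: {rate:.0f} synced writes/sec")
```

Run it once against the RAID1 mount and once against the RAID10 mount. If the RAID1 figure is in the low hundreds or below while the RAID10 figure is much higher, that would point at the write-back cache not being used for the RAID1 unit: a 7200 rpm disk without a write cache can complete only about one sync per rotation, i.e. on the order of 120 per second.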

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)