2014-04-15 0:32 GMT+02:00 Mel Llaguno <mllaguno@xxxxxxxxxxxx>:
I was given anecdotal information regarding HFS+ performance under OSX as
being unsuitable for production PG deployments and that pg_test_fsync
could be used to measure the relative speed versus other operating systems
(such as Linux). In my performance lab, I have a number of similarly
equipped Linux hosts (Ubuntu 12.04 64-bit LTS Server w/128GB RAM / 2 OWC
6g Mercury Extreme SSDs / 7200rpm SATA3 HDD / 16 E5-series cores) that I
used to capture baseline Linux numbers. As we generally recommend our
customers use SSD (the s3700 recommended by PG), I wanted to perform a
comparison. On these beefy machines I ran the following tests:
SSD:
# pg_test_fsync -f ./fsync.out -s 30
30 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 2259.652 ops/sec 443 usecs/op
fsync 1949.664 ops/sec 513 usecs/op
fsync_writethrough n/a
open_sync 2245.162 ops/sec 445 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 2161.941 ops/sec 463 usecs/op
fsync 1891.894 ops/sec 529 usecs/op
fsync_writethrough n/a
open_sync 1118.826 ops/sec 894 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 2171.558 ops/sec 460 usecs/op
2 * 8kB open_sync writes 1126.490 ops/sec 888 usecs/op
4 * 4kB open_sync writes 569.594 ops/sec 1756 usecs/op
8 * 2kB open_sync writes 285.149 ops/sec 3507 usecs/op
16 * 1kB open_sync writes 142.528 ops/sec 7016 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 1947.557 ops/sec 513 usecs/op
write, close, fsync 1951.082 ops/sec 513 usecs/op
Non-Sync'ed 8kB writes:
write 481296.909 ops/sec 2 usecs/op
HDD:
pg_test_fsync -f /tmp/fsync.out -s 30
30 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 105.783 ops/sec 9453 usecs/op
fsync 27.692 ops/sec 36111 usecs/op
fsync_writethrough n/a
open_sync 103.399 ops/sec 9671 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a
fdatasync 104.647 ops/sec 9556 usecs/op
fsync 27.223 ops/sec 36734 usecs/op
fsync_writethrough n/a
open_sync 55.839 ops/sec 17909 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 103.581 ops/sec 9654 usecs/op
2 * 8kB open_sync writes 55.207 ops/sec 18113 usecs/op
4 * 4kB open_sync writes 28.320 ops/sec 35311 usecs/op
8 * 2kB open_sync writes 14.581 ops/sec 68582 usecs/op
16 * 1kB open_sync writes 7.407 ops/sec 135003 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 27.228 ops/sec 36727 usecs/op
write, close, fsync 27.108 ops/sec 36890 usecs/op
Non-Sync'ed 8kB writes:
write 466108.001 ops/sec 2 usecs/op
-------
So far, so good. Local HDD vs. SSD shows a significant difference in fsync
performance. Here are the corresponding fstab entries :
/dev/mapper/cim-base /opt/cim ext4 defaults,noatime,nodiratime,discard 0 2 (SSD)
/dev/mapper/p--app--lin-root / ext4 errors=remount-ro 0 1 (HDD)
I then tried the pg_test_fsync on my OSX Mavericks machine (quad-core i7 /
Intel 520SSD / 16GB RAM) and got the following results :
# pg_test_fsync -s 30 -f ./fsync.out
30 seconds per test
Direct I/O is not supported on this platform.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 8752.240 ops/sec 114 usecs/op
fdatasync 8556.469 ops/sec 117 usecs/op
fsync 8831.080 ops/sec 113 usecs/op
fsync_writethrough 735.362 ops/sec 1360 usecs/op
open_sync 8967.000 ops/sec 112 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 4256.906 ops/sec 235 usecs/op
fdatasync 7485.242 ops/sec 134 usecs/op
fsync 7335.658 ops/sec 136 usecs/op
fsync_writethrough 716.530 ops/sec 1396 usecs/op
open_sync 4303.408 ops/sec 232 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 7559.381 ops/sec 132 usecs/op
2 * 8kB open_sync writes 4537.573 ops/sec 220 usecs/op
4 * 4kB open_sync writes 2539.780 ops/sec 394 usecs/op
8 * 2kB open_sync writes 1307.499 ops/sec 765 usecs/op
16 * 1kB open_sync writes 659.985 ops/sec 1515 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 9003.622 ops/sec 111 usecs/op
write, close, fsync 8035.427 ops/sec 124 usecs/op
Non-Sync'ed 8kB writes:
write 271112.074 ops/sec 4 usecs/op
-------
These results were unexpected and surprising. In almost every metric (with
the exception of the Non-Sync'ed 8kB writes), OSX Mavericks 10.9.2 using
HFS+ out-performed my Ubuntu servers. While the SSDs come from different
manufacturers, both use the SandForce SF-2281 controllers.
Plausible explanations of the apparent disparity in fsync performance
would be welcome.
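
To put a number on the disparity, here is a small Python sketch that divides the OS X figures by the Linux SSD figures for the single 8kB write test. The ops/sec values are copied from the runs above; the dictionary and function names are only illustrative.

```python
# ops/sec for "one 8kB write", copied from the pg_test_fsync output above
linux_ssd = {"fdatasync": 2259.652, "fsync": 1949.664, "open_sync": 2245.162}
osx_ssd = {
    "open_datasync": 8752.240,
    "fdatasync": 8556.469,
    "fsync": 8831.080,
    "fsync_writethrough": 735.362,
    "open_sync": 8967.000,
}


def speedup(osx, linux):
    """Return the OS X / Linux ops/sec ratio for methods measured on both."""
    return {method: osx[method] / linux[method] for method in linux if method in osx}


ratios = speedup(osx_ssd, linux_ssd)
for method, ratio in sorted(ratios.items()):
    print(f"{method:10s} OS X is {ratio:.1f}x the Linux SSD rate")

# fsync_writethrough was n/a on Linux, so it is compared to Linux fsync here
writethrough_vs_fsync = osx_ssd["fsync_writethrough"] / linux_ssd["fsync"]
print(f"fsync_writethrough vs Linux fsync: {writethrough_vs_fsync:.2f}x")
```

The three methods available on both systems come out roughly 3.8x to 4.5x faster on OS X. fsync_writethrough is the only OS X method that is slower than the Linux numbers.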
Thanks, Mel
P.S. One more thing; I found this article which maps fsync mechanisms
versus operating systems:
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm
This article suggests that both open_datasync and fdatasync are _not_
supported for OSX, but the pg_test_fsync results suggest otherwise.
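
A quick way to see which sync primitives a given machine exposes is to ask Python's os and fcntl modules, as in the sketch below. This is only a rough proxy: it reports what the local Python build found in the system headers, not what PostgreSQL detected when it was compiled. The mapping assumed here is open_datasync to O_DSYNC, open_sync to O_SYNC, fdatasync to fdatasync(), and fsync_writethrough on OS X to the F_FULLFSYNC fcntl. The function name is made up for the example.

```python
import fcntl
import os


def sync_primitives():
    """Report which sync-related names this Python build exposes."""
    return {
        "O_DIRECT": hasattr(os, "O_DIRECT"),
        "O_SYNC": hasattr(os, "O_SYNC"),
        "O_DSYNC": hasattr(os, "O_DSYNC"),
        "fsync": hasattr(os, "fsync"),
        "fdatasync": hasattr(os, "fdatasync"),
        "F_FULLFSYNC": hasattr(fcntl, "F_FULLFSYNC"),
    }


for name, available in sync_primitives().items():
    print(f"{name:12s} {'yes' if available else 'no'}")
```

A name being present only means the call exists; it says nothing about how far down the storage stack the sync actually reaches.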
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
My 2 cents :
The results are not surprising. In the Linux environment, the I/O calls of pg_test_fsync use O_DIRECT (PG_O_DIRECT) together with O_SYNC or O_DSYNC, so in practice each write waits for the "answer" from the storage, bypassing the cache in sync mode. On Mac OS X it is not doing so: it only uses O_SYNC or O_DSYNC without O_DIRECT, so in practice it is using the filesystem cache even though it asks for synchronous I/O.
Bye
Mat Dba
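
One way to probe the cache effect described above is to time the same 8kB write loop with different sync calls. The Python sketch below is a simplified stand-in for pg_test_fsync, not a reimplementation: it does not use O_DIRECT, it writes to the system temp directory (which may be a RAM-backed filesystem), and the helper names sync_rate, no_sync and full_fsync are invented for the example. full_fsync relies on F_FULLFSYNC, which exists only on OS X.

```python
import os
import tempfile
import time


def sync_rate(sync, writes=200, size=8192):
    """Write `size` bytes and call `sync(fd)`, `writes` times; return ops/sec."""
    buf = b"\0" * size
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, buf)
            sync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed


def no_sync(fd):
    """Baseline: leave the data in the OS cache."""


def full_fsync(fd):
    """OS X only: ask the drive itself to flush its write cache."""
    import fcntl

    fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
```

Calling sync_rate(os.fsync) and, on OS X, sync_rate(full_fsync) on the same disk gives two rates to compare. If they differ by something like the gap between fsync and fsync_writethrough in the output above, that would suggest plain fsync is returning before the data reaches stable storage.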