Re: Raid 10 chunksize

On Fri, 3 Apr 2009, Greg Smith wrote:

Hannes sent this off-list, presumably via newsgroup, and it's certainly worth sharing. I've always been scared off from using XFS because of the problems outlined at http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc , with more testing showing similar issues at http://pages.cs.wisc.edu/~vshree/xfs.pdf too.

(I'm finding that old message with Ted saying "Making sure you don't lose data is Job #1" hilarious right now, considering the recent ext4 data loss debacle.)

Also note that the message from Ted dates back to 2004; there has been a _lot_ of work done on XFS in the last 4 years.

As for the second link, it focuses on what happens to the filesystem if the disk under it starts returning errors or garbage. With the _possible_ exception of ZFS, every filesystem around will do strange things under those conditions. In my opinion, the way to deal with this sort of thing isn't to move to ZFS to detect the problem; it's to set up redundancy in your storage so that you can not only detect the problem but correct it as well. (It's good to know that your database file is corrupt, but that's not nearly as useful as having some way to recover the data that was there.)

David Lang

---------- Forwarded message ----------
Date: Fri, 3 Apr 2009 10:19:38 +0200
From: Hannes Dorbath <light@xxxxxxxxxxxxxxxxxxxx>
Newsgroups: pgsql.performance
Subject: Re:  Raid 10 chunksize

Ron Mayer wrote:
Greg Smith wrote:
On Wed, 1 Apr 2009, Scott Carey wrote:

Write caching on SATA is totally fine.  There were some old ATA drives
that when paired with some file systems or OSes would not be safe. There are
some combinations that have unsafe write barriers.  But there is a
standard, well-supported ATA command to sync and only return after the
data is on disk.  If you are running an OS that is anything recent at all,
and any disks that are not really old, you're fine.
While I would like to believe this, I don't trust any claims in this
area that don't have matching tests that demonstrate things working as
expected.  And I've never seen this work.

My laptop has a 7200 RPM drive, which means that if fsync is being
passed through to the disk correctly I can only fsync <120
times/second.  Here's what I get when I run sysbench on it, starting
with the default ext3 configuration:

I believe it's ext3 that's cheating in this scenario.

I assume so too. Here is the same test using XFS, first with barriers (XFS default) and then without:

Linux 2.6.28-gentoo-r2 #1 SMP Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz GenuineIntel GNU/Linux

/dev/sdb /data2 xfs rw,noatime,attr2,logbufs=8,logbsize=256k,noquota 0 0

# sysbench --test=fileio --file-fsync-freq=1 --file-num=1 --file-total-size=16384 --file-test-mode=rndwr run
sysbench 0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Extra file open flags: 0
1 files, 16Kb each
16Kb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 1 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random write test
Threads started!
Done.

Operations performed:  0 Read, 10000 Write, 10000 Other = 20000 Total
Read 0b  Written 156.25Mb  Total transferred 156.25Mb  (463.9Kb/sec)
  28.99 Requests/sec executed

Test execution summary:
   total time:                          344.9013s
   total number of events:              10000
   total time taken by event execution: 0.1453
   per-request statistics:
        min:                                  0.01ms
        avg:                                  0.01ms
        max:                                  0.07ms
        approx.  95 percentile:               0.01ms

Threads fairness:
   events (avg/stddev):           10000.0000/0.00
   execution time (avg/stddev):   0.1453/0.00
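
For anyone who wants to repeat this kind of check without sysbench, here is a minimal sketch of the same idea: overwrite one 16 KB block and call fsync() after every write, then report writes per second. It is not how sysbench is implemented; the function name measure_fsync_rate, the file name, and the request count are made up for illustration, and it assumes a Unix-like system where os.pwrite is available.

```python
import os
import tempfile
import time


def measure_fsync_rate(path, requests=200, block_size=16 * 1024):
    """Overwrite one block and fsync after every write; return writes/sec."""
    block = os.urandom(block_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(requests):
            os.pwrite(fd, block, 0)  # rewrite the same block at offset 0
            os.fsync(fd)             # ask the OS to push it to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return requests / elapsed


if __name__ == "__main__":
    # The temporary directory is created in the current directory, so run
    # this from a directory on the filesystem you want to test.
    with tempfile.TemporaryDirectory(dir=".") as workdir:
        rate = measure_fsync_rate(os.path.join(workdir, "fsync_test.dat"))
        print(f"{rate:.2f} fsync'd writes/sec")
```

A rate far above what the drive's rotation speed allows suggests that fsync is being acknowledged before the data reaches the platters.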


And now without barriers:

/dev/sdb /data2 xfs rw,noatime,attr2,nobarrier,logbufs=8,logbsize=256k,noquota 0 0

# sysbench --test=fileio --file-fsync-freq=1 --file-num=1 --file-total-size=16384 --file-test-mode=rndwr run
sysbench 0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Extra file open flags: 0
1 files, 16Kb each
16Kb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 1 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random write test
Threads started!
Done.

Operations performed:  0 Read, 10000 Write, 10000 Other = 20000 Total
Read 0b  Written 156.25Mb  Total transferred 156.25Mb  (62.872Mb/sec)
4023.81 Requests/sec executed

Test execution summary:
   total time:                          2.4852s
   total number of events:              10000
   total time taken by event execution: 0.1325
   per-request statistics:
        min:                                  0.01ms
        avg:                                  0.01ms
        max:                                  0.06ms
        approx.  95 percentile:               0.01ms

Threads fairness:
   events (avg/stddev):           10000.0000/0.00
   execution time (avg/stddev):   0.1325/0.00
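
To put the two runs next to the rule of thumb quoted earlier (a 7200 RPM drive can honor fewer than 120 fsyncs per second), here is a small arithmetic sketch. The rates are the "Requests/sec executed" figures from the two outputs above. The thread does not state the rotation speed of /dev/sdb, so 7200 RPM is only an assumption borrowed from the laptop example, and the helper names are made up for illustration.

```python
def max_fsyncs_per_second(rpm):
    """Upper bound if every fsync must wait for one platter rotation."""
    return rpm / 60.0


def implied_rpm(requests_per_second):
    """Rotation speed a drive would need to honor this many fsyncs/sec."""
    return requests_per_second * 60.0


ASSUMED_RPM = 7200  # assumption: not stated for /dev/sdb in the thread

# "Requests/sec executed" from the two sysbench runs
measured = {"barriers": 28.99, "nobarrier": 4023.81}

bound = max_fsyncs_per_second(ASSUMED_RPM)
for label, rate in measured.items():
    if rate <= bound:
        verdict = "within the rotational bound"
    else:
        verdict = "exceeds the rotational bound"
    print(f"{label}: {rate:.2f} req/s, bound {bound:.0f}/s, {verdict} "
          f"(would need about {implied_rpm(rate):.0f} RPM)")
```

Under that assumption the barrier run stays well below the bound, while the nobarrier run could only be explained by a drive spinning at several hundred thousand RPM, so those writes are most likely being acknowledged from the drive's write cache.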




--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance