On Thu, May 23, 2013 at 1:56 AM, Andrea Suisani <sickpig@xxxxxxxxxxxx> wrote:
> On 05/22/2013 03:30 PM, Merlin Moncure wrote:
>>
>> On Tue, May 21, 2013 at 7:19 PM, Greg Smith <greg@xxxxxxxxxxxxxxx> wrote:
>>>
>>> On 5/20/13 6:32 PM, Merlin Moncure wrote:
>
> [cut]
>
>>> The only really huge gain to be had using SSD is commit rate at a low
>>> client count.  There you can easily do 5,000/second instead of a
>>> spinning disk that is closer to 100, for less than what the
>>> battery-backed RAID card alone costs to speed up mechanical drives.
>>> My test server's 100GB DC S3700 was $250.  That's still not two orders
>>> of magnitude faster though.
>>
>> That's most certainly *not* the only gain to be had: random read rates
>> on large databases (a very important metric for data analysis) can
>> easily hit 20k tps.  So I'll stand by the figure.  Another point: that
>> 5,000/second commit rate is sustained, whereas a RAID card will degrade
>> spectacularly once its cache overflows; it's not fair to compare burst
>> with sustained performance.  To hit a 5,000/second sustained commit
>> rate along with good random read performance, you'd need a very
>> expensive storage system.  Right now I'm working (not by choice) with a
>> tier-1 storage system (let's just say it rhymes with 'weefax') and I
>> would trade it for direct-attached SSD in a heartbeat.
>>
>> Also, note that third-party benchmarking shows the S3700 completely
>> smoking the 710 in database workloads (for example, see
>> http://www.anandtech.com/show/6433/intel-ssd-dc-s3700-200gb-review/6).
>
> [cut]
>
> Sorry for interrupting, but on a related note I would like to know your
> opinions on what the AnandTech review said about the S3700's poor
> performance on "Oracle Swingbench", quoting the relevant part that you
> can find here (*):
>
> <quote>
>
> [..] There are two components to the Swingbench test we're running here:
> the database itself, and the redo log.
> The redo log stores all changes that are made to the database, which
> allows the database to be reconstructed in the event of a failure.  In
> good DB design, these two would exist on separate storage systems, but
> in order to increase IO we combined them both for this test.  Accesses
> to the DB end up being 8KB and random in nature, a definite strong suit
> of the S3700 as we've already shown.  The redo log however consists of
> a bunch of 1KB - 1.5KB, QD1, sequential accesses.  The S3700, like many
> of the newer controllers we've tested, isn't optimized for low queue
> depth, sub-4KB, sequential workloads like this. [..]
>
> </quote>
>
> Does this kind of scenario apply to postgresql wal files repo?

huh -- I don't think so.  wal file segments are 8kb aligned, ditto
clog, etc.  In XLogWrite():

    /* OK to write the page(s) */
    from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
    nbytes = npages * (Size) XLOG_BLCKSZ;   <--
    errno = 0;
    if (write(openLogFile, from, nbytes) != nbytes)
    {

AFAICT, that's the only way xlog gets written out.  One thing I would
definitely advise, though, is to disable full page writes
(full_page_writes) if it's enabled -- the S3700 is aligned on 8kb
blocks internally.

merlin

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance