On Fri, 28 Mar 2008, Laszlo Nagy wrote:
We already have a server but it is becoming slow and we would like to have something that is faster.
What's it slow at? Have you identified the bottlenecks and current sources of sluggish behavior? That sort of thing is much more informative to look into in regards to redesigning for new hardware than trivia like disk layout. For all we know you're CPU bound.
The database itself is an OLTP system. There are many smaller tables, and some bigger ones (biggest table with 1.2 million records, table size 966MB, indexes size 790MB).
The total database size is the interesting number you left out here. And you didn't mention how much RAM either. That ratio has a lot of impact on how hard you'll push the disks.
Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs say that it is better to use FreeBSD because it can alter the I/O priority of processes dynamically.
You shouldn't make an OS decision based on a technical detail that small. I won't knock FreeBSD because it's a completely reasonable choice, but there's no credible evidence it's a better performer for the workload you expect than, say, Linux or even Solaris x64. (The benchmarks the FreeBSD team posted as part of their 7.0 fanfare are not representative of real PostgreSQL performance, and are read-only as well).
All the reasonable OS choices here are close enough to one another (as long as you get FreeBSD 7, earlier versions are really slow) that you should be thinking in terms of reliability, support, and features rather than viewing this from a narrow performance perspective. There's nothing about what you've described that sounds like it needs bleeding-edge performance to achieve. For reliability, I first look at how good the disk controller and its matching driver in the OS used is, which brings us to:
Question 2. SCSI or SATA? I plan to buy a RocketRAID 3520 controller with 8 SATA 2 disks. The operating system would be on another disk pair, connected to the motherboard's controller. I wonder if I can get more performance with SCSI, for the same amount of money? (I can spend about $1500 on the controller and the disks, that would cover 10 SATA 2 disks and the controller.)
Highpoint has traditionally made disk controllers that were garbage. The 3520 is from a relatively new series of products from them, and it seems like a reasonable unit. However: do you want to be be deploying your system on a new card with zero track record for reliability, and from a company that has never done a good job before? I can't think of any reason at all why you should take that risk.
The standard SATA RAID controller choices people suggest here are 3ware, Areca, and LSI Logic. Again, unless you're really pushing what the hardware is capable of these are all close to each other performance-wise (see http://femme.tweakblogs.net/blog/196/highpoint-rocketraid-3220.html for something that include the Highpoint card). You should be thinking in terms of known reliability and stability when you select a database controller card, and Highpoint isn't even on the list of vendors to consider yet by those standards.
As for SCSI vs. SATA, I collected up the usual arguments on both sides at http://www.postgresqldocs.org/index.php/SCSI_vs._IDE/SATA_Disks
However, the transaction log file should be on a separate disk and maybe I could gain more performance by putting indexes on a separate drive, but I do not want to reduce the number of disks in the RAID 0+1 array.
If you're looking at 8+ disks and have a caching controller with a battery backup, which appears to be your target configuration, there little reason to expect a big performance improvement from splitting the transaction log out onto a seperate disk. As you note, doing that will reduce the spread of disk for the database which may cost you more in performance than seperate transaction logs gain.
It is worth considering creating a seperate filesystem on the big array to hold the xlog data through, because that gives you more flexibility in terms of mount parameters there. For example, you can always turn off atime updates on the transaction log filesystem, and in many cases the filesystem journal updates can be optimized more usefully (the xlog doesn't require them).
-- * Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance