Re: Hardware/OS recommendations for large databases (

Alex Turner <armtuk@xxxxxxxxx> · Fri, 18 Nov 2005 11:28:40 -0500

Ok - so I ran the same test on my system and get a total speed of 113MB/sec.  Why is this?  Why is the system so limited to around just 110MB/sec?  I tuned read ahead up a bit, and my results improve a bit.
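
For anyone comparing read-ahead numbers, here is a minimal sketch for translating a setting into a size. It assumes the Linux blockdev tool, which reports and sets read-ahead in 512-byte sectors. The function name is made up for illustration, and /dev/sdX is a placeholder, not a device from this thread.

```shell
#!/bin/bash
# Convert a read-ahead value given in 512-byte sectors (the unit used by
# `blockdev --getra` and `blockdev --setra`) into KiB.
ra_sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

ra_sectors_to_kib 256      # prints 128
ra_sectors_to_kib 16384    # prints 8192

# To inspect or change the setting on a real device (needs root;
# /dev/sdX is a placeholder):
#   blockdev --getra /dev/sdX
#   blockdev --setra 16384 /dev/sdX
```

By that unit a value of 16384 works out to 8192 KiB, so it is probably worth confirming with --getra what a given box actually reports before comparing results across systems.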
Alex

On 11/18/05, Luke Lonergan <llonergan@xxxxxxxxxxxxx> wrote:
> Dave,
>
> On 11/18/05 5:00 AM, "Dave Cramer" <pg@xxxxxxxxxxxxx> wrote:
> >
> > Now there's an interesting line drawn in the sand. I presume you have
> > numbers to back this up ?
> >
> > This should draw some interesting posts.
>
> Part 2: The answer
>
> System A:
>
> This system is running RedHat 3 Update 4, with a Fedora 2.6.10 Linux kernel.
>
> On a single table with 15 columns (the Bizgres IVP) at a size double memory
> (2.12GB), Postgres 8.0.3 with Bizgres enhancements takes 32 seconds to scan
> the table: that's 66 MB/s.  Not the efficiency I'd hope from the onboard
> SATA controller that I'd like, I would have expected to get 85% of the
> 100MB/s raw read performance.
>
> So that's $1,200 / 66 MB/s (without adjusting for 2003 price versus now) =
> 18.2 $/MB/s
>
> Raw data:
> [llonergan@kite4 IVP]$ cat scan.sh
> #!/bin/bash
>
> time psql -c "select count(*) from ivp.bigtable1" dgtestdb
> [llonergan@kite4 IVP]$ cat sysout1
>   count
> ----------
>  10000000
> (1 row)
>
>
> real    0m32.565s
> user    0m0.002s
> sys     0m0.003s
>
> Size of the table data:
> [llonergan@kite4 IVP]$ du -sk dgtestdb/base
> 2121648 dgtestdb/base
>
> System B:
>
> This system is running an XFS filesystem, and has been tuned to use very
> large (16MB) readahead.  It's running the Centos 4.1 distro, which uses a
> Linux 2.6.9 kernel.
>
> Same test as above, but with 17GB of data takes 69.7 seconds to scan (!)
> That's 244.2MB/s, which is obviously double my earlier point of 110-120MB/s.
> This system is running with a 16MB Linux readahead setting, let's try it
> with the default (I think) setting of 256KB - AHA! Now we get 171.4 seconds
> or 99.3MB/s.
>
> So, using the tuned setting of "blockdev --setra 16384" we get $6,000 /
> 244MB/s = 24.6 $/MB/s
> If we use the default Linux setting it's 2.5x worse.
>
> Raw data:
> [llonergan@modena2 IVP]$ cat scan.sh
> #!/bin/bash
>
> time psql -c "select count(*) from ivp.bigtable1" dgtestdb
> [llonergan@modena2 IVP]$ cat sysout3
>   count
> ----------
>  80000000
> (1 row)
>
>
> real    1m9.875s
> user    0m0.000s
> sys     0m0.004s
> [llonergan@modena2 IVP]$ !du
> du -sk dgtestdb/base
> 17021260        dgtestdb/base
>
> Summary:
>
> <cough, cough> OK - you can get more I/O bandwidth out of the current I/O
> path for sequential scan if you tune the filesystem for large readahead.
> This is a cheap alternative to overhauling the executor to use asynch I/O.
>
> Still, there is a CPU limit here - this is not I/O bound, it is CPU limited
> as evidenced by the sensitivity to readahead settings.   If the filesystem
> could do 1GB/s, you wouldn't go any faster than 244MB/s.
>
> - Luke
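
The scan-rate and dollars-per-MB/s figures in the quoted message can be re-derived from the raw numbers. A minimal sketch, assuming awk is available and taking 1 MB = 1000 KB (the convention that reproduces the quoted 244.2 and 99.3). The function names are made up here and are not from Luke's scripts.

```shell
#!/bin/bash
# Scan rate in MB/s, given a size in KB (as printed by `du -sk`) and a
# wall-clock time in seconds. Assumes 1 MB = 1000 KB.
scan_rate() {
    awk -v kb="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", kb / 1000 / secs }'
}

# Dollars per MB/s, given a system price in dollars and a scan rate in MB/s.
cost_per_rate() {
    awk -v usd="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", usd / rate }'
}

scan_rate 17021260 69.7      # System B, tuned readahead:   244.2
scan_rate 17021260 171.4     # System B, default readahead: 99.3
cost_per_rate 6000 244       # System B: 24.6
cost_per_rate 1200 66        # System A: 18.2
```

Plugging in your own `du -sk` size and `time` output gives a number directly comparable with the ones above.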