Re: Reliability recommendations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At 11:41 PM 2/24/2006, Luke Lonergan wrote:
Dan,

On 2/24/06 4:47 PM, "Dan Gorman" <dgorman@xxxxxxx> wrote:

> Was that sequential reads? If so, yeah you'll get 110MB/s? How big
> was the datafile size? 8MB? Yeah, you'll get 110MB/s. 2GB? No, they
> can't sustain that. There are so many details missing from this test
> that it's hard to have any context around it :)
>
> I was getting about 40-50MB/s on a PV with 14 disks on a RAID10 in
> real world usage. (random IO and fully saturating a Dell 1850 with 4
> concurrent threads (to peg the cpu on selects) and raw data files)

OK, how about some proof?

In a synthetic test that writes 32GB of sequential 8k pages on a machine
with 16GB of RAM:
========================= Write test results ==============================
time bash -c "dd if=/dev/zero of=/dbfast1/llonergan/bigfile bs=8k
count=2000000 && sync" &
time bash -c "dd if=/dev/zero of=/dbfast3/llonergan/bigfile bs=8k
count=2000000 && sync" &

2000000++0 records in
2000000++0 records out
2000000++0 records in
2000000++0 records out

real    1m0.046s
user    0m0.270s
sys     0m30.008s

real    1m0.047s
user    0m0.287s
sys     0m30.675s

So that's 32,000 MB written in 60.05 seconds, which is 533MB/s sustained
with two threads.

Now to read the same files in parallel:
========================= Read test results ==============================
sync
time dd of=/dev/null if=/dbfast1/llonergan/bigfile bs=8k &
time dd of=/dev/null if=/dbfast3/llonergan/bigfile bs=8k &

2000000++0 records in
2000000++0 records out

real    0m39.849s
user    0m0.282s
sys     0m22.294s
2000000++0 records in
2000000++0 records out

real    0m40.410s
user    0m0.251s
sys     0m22.515s

And that's 32,000MB in 40.4 seconds, or 792MB/s sustained from disk (not
memory).

These are each RAID5 arrays of 8 internal SATA disks on 3Ware HW RAID
controllers.

Impressive IO rates.  A more detailed HW list would help put them in context.

Which 3Ware? The 9550SX? How much cache on it (AFAIK, the only options are 128MB and 256MB?)?

Which HDs?

What CPUs (looks like Opterons, but which flavor?) and mainboard?

What's CPU utilization when hammering the physical IO subsystem this hard?


Now for real usage, let's run a simple sequential scan query on 123,434 MB
of data in a single table on 4 of these machines in parallel.  All tables
are distributed evenly by Bizgres MPP over all 8 filesystems:

============= Bizgres MPP sequential scan results =========================

[llonergan@salerno0 +AH4]$ !psql
psql -p 9999 -U mppdemo1 demo
Welcome to psql 8.1.1 (server 8.1.3), the PostgreSQL interactive terminal.

Type:  +AFw-copyright for distribution terms
       +AFw-h for help with SQL commands
       +AFw? for help with psql commands
       +AFw-g or terminate with semicolon to execute query
       +AFw-q to quit

demo=# +AFw-timing
Timing is on.
demo=# select version();

version
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-----
 PostgreSQL 8.1.3 (Bizgres MPP 2.1) on x86_64-unknown-linux-gnu, compiled by
GCC gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) compiled on Feb 23 2006
11:34:06
(1 row)

Time: 0.570 ms
demo=# select relname,8*relpages/128 as MB from pg_class order by relpages
desc limit 6;
            relname             |   mb
--------------------------------++--------
 lineitem                       | 123434
 orders                         |  24907
 partsupp                       |  14785
 part                           |   3997
 customer                       |   3293
 supplier                       |    202
(6 rows)

Time: 1.824 ms
demo=# select count(*) from lineitem;
   count
-----------
 600037902
(1 row)

Time: 60300.960 ms

So that's 123,434 MB of data scanned in 60.3 seconds, or 2,047 MB/s on 4
machines, which uses 512MB/s of disk bandwidth on each machine.

Now let's do a query that uses a this big table (a two way join) using all 4
machines:
============= Bizgres MPP Query results =========================
demo=# select
demo-#         sum(l_extendedprice* (1 - l_discount)) as revenue
demo-# from
demo-#         lineitem,
demo-#         part
demo-# where
demo-#         (
demo(#                 p_partkey = l_partkey
demo(#                 and p_brand = 'Brand#42'
demo(#                 and p_container in ('SM CASE', 'SM BOX', 'SM PACK',
'SM PKG')
demo(#                 and l_quantity >= 7 and l_quantity <= 7 ++ 10
demo(#                 and p_size between 1 and 5
demo(#                 and l_shipmode in ('AIR', 'AIR REG')
demo(#                 and l_shipinstruct = 'DELIVER IN PERSON'
demo(#         )
demo-#         or
demo-#         (
demo(#                 p_partkey = l_partkey
demo(#                 and p_brand = 'Brand#15'
demo(#                 and p_container in ('MED BAG', 'MED BOX', 'MED PKG',
'MED PACK')
demo(#                 and l_quantity >= 14 and l_quantity <= 14 ++ 10
demo(#                 and p_size between 1 and 10
demo(#                 and l_shipmode in ('AIR', 'AIR REG')
demo(#                 and l_shipinstruct = 'DELIVER IN PERSON'
demo(#         )
demo-#         or
demo-#         (
demo(#                 p_partkey = l_partkey
demo(#                 and p_brand = 'Brand#53'
demo(#                 and p_container in ('LG CASE', 'LG BOX', 'LG PACK',
'LG PKG')
demo(#                 and l_quantity >= 22 and l_quantity <= 22 ++ 10
demo(#                 and p_size between 1 and 15
demo(#                 and l_shipmode in ('AIR', 'AIR REG')
demo(#                 and l_shipinstruct = 'DELIVER IN PERSON'
demo(#         );
    revenue
----------------
 356492404.3164
(1 row)

Time: 114908.149 ms
Hmmm.  ~115secs @ ~500MBps => ~57.5GB of data manipulated.


And now a 6-way join among 4 tables in this same schema:

demo=# SELECT
demo-#
s.s_acctbal,s.s_name,n.n_name,p.p_partkey,p.p_mfgr,s.s_address,s.s_phone,s.s
_comment
demo-# FROM
demo-#         supplier s,partsupp ps,nation n,region r,
demo-#         part p,  (
demo(#                         SELECT p_partkey,  min(ps_supplycost) as
min_ps_cost from part, partsupp ,
demo(#                                 supplier,nation, region
demo(#                         WHERE
demo(#                                 p_partkey=ps_partkey
demo(#                                 and s_suppkey = ps_suppkey
demo(#                                 and s_nationkey = n_nationkey
demo(#                                 and n_regionkey = r_regionkey
demo(#                                 and r_name = 'EUROPE'
demo(#                         GROUP BY
demo(#                                 p_partkey
demo(#                  ) g
demo-# WHERE
demo-#         p.p_partkey = ps.ps_partkey
demo-#         and g.p_partkey = p.p_partkey
demo-#         and g. min_ps_cost = ps.ps_supplycost
demo-#         and s.s_suppkey = ps.ps_suppkey
demo-#         and p.p_size = 15
demo-#         and p.p_type like '%BRASS'
demo-#         and s.s_nationkey = n.n_nationkey
demo-#         and n.n_regionkey = r.r_regionkey
demo-#         and r.r_name = 'EUROPE'
demo-# ORDER BY
demo-#         s. s_acctbal desc,n.n_name,s.s_name,p.p_partkey
demo-# LIMIT 100;
 s_acctbal |          s_name           |          n_name           |
p_partkey |          p_mfgr           |
          s_address                 |     s_phone     |
s_comment

-----------++---------------------------++---------------------------++--------
---++---------------------------++------
------------------------------------++-----------------++---------------------
--------------------------------------
-------------------------------------------
   9999.70 | Supplier#000239544        | UNITED KINGDOM            |
6739531 | Manufacturer#4            | 1UCMu
3TLyUThghoeZ8arg6cV3Mr              | 33-509-584-9496 | carefully ironic
asymptotes cajole quickly. slyly silent a
ccounts sleep. fl
...
...
   9975.53 | Supplier#000310136        | ROMANIA                   |
10810115 | Manufacturer#5            | VNWON
A5Sr B                              | 29-977-903-6199 | pending deposits
wake permanently; final accounts sleep ab
out the pending deposits.
(100 rows)

Time: 424981.813 ms
...and this implies ~425secs @ ~500MBps => 212.5GB

What are the IO rates during these joins?

How much data is being handled to complete these joins?

How much data is being exchanged between these machines to complete the joins?

What is the connectivity between these 4 machines?

Putting these numbers in context may help the advocacy effort considerably as well as help us improve things even further. ;-)

TiA,
Ron



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux