Hello,

I've got a very simple table with a very simple SELECT query, but the initial run takes longer than I'd like, so I want to see if there is a strategy to optimize this.

Table rt_h_nbbo contains several hundred million rows. All rows for a given entry_date are appended to this table in an overnight process every night, on the order of several million rows per night. The objective is to select all of the rows for a given product_id on a given entry_date. There is a b-tree index on (product_id, entry_date) called rt_h_nbbo_idx, and it appears to be used correctly.

I'm seeing that if the data pages are not in memory, nearly all of the time is spent on disk I/O. The first time, the query takes 21 sec; if I run it a second time, it completes in approximately 1-2 ms.

Running

SELECT pg_relation_size('rt_h_nbbo') / reltuples FROM pg_class WHERE relname = 'rt_h_nbbo';

yields roughly 135 bytes/row.

I perceive an inefficiency here and I'd like your input on how to deal with it. The end result of the query is 1631 rows, which is approximately 220 KB of data (at 135 bytes/row). Compare that to the amount of I/O that was done: 1634 buffers were read at 8 KB per buffer - that's about 13 MB of data! So 13 MB was read from disk over 21 sec, but only 220 KB of it was useful - about 1.7%. I'd like to make this work faster and, hopefully, more efficiently.

First question: does loading 13 MB of data in 21 sec seem kind of slow, or about right (hardware specs at the bottom of this email)?

Second question: can I reduce the number of pages that contain the data I want by physically storing the rows in an order that parallels the rt_h_nbbo_idx index? I recall you can do this in Sybase with a CLUSTERED index. The answer for PostgreSQL seems to be "yes, use the CLUSTER command", but that command does a one-time clustering and requires periodic re-clustering. Is this the best approach? Are there considerations with respect to the type of index (B-tree, GiST, SP-GiST) used for CLUSTER?
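For concreteness, here is roughly what I understand that approach would look like (a sketch only - I have not tried this yet, and my understanding is that CLUSTER rewrites the whole table and holds an exclusive lock on it while doing so):

-- One-time physical reordering of the heap to match the index order.
-- Note: CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the table.
CLUSTER rt_h_nbbo USING rt_h_nbbo_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE rt_h_nbbo;

Since new rows are appended every night, I assume the table would gradually drift out of clustered order, and a plain "CLUSTER rt_h_nbbo;" (which, as I read the docs, reuses the index recorded by the previous CLUSTER) would have to be re-run periodically.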
Thanks,
-Sev

Table (this is a fairly large table - hundreds of millions of rows):

CREATE TABLE rt_h_nbbo
(
  product_id integer NOT NULL,
  bid_price double precision NOT NULL DEFAULT 0.0,
  bid_size integer NOT NULL DEFAULT 0,
  ask_price double precision NOT NULL DEFAULT 0.0,
  ask_size integer NOT NULL DEFAULT 0,
  last_price double precision NOT NULL DEFAULT 0.0,
  entry_date date NOT NULL,
  entry_time time without time zone NOT NULL,
  event_time time without time zone NOT NULL,
  day_volume bigint NOT NULL DEFAULT 0,
  day_trade_ct integer,
  entry_id bigint NOT NULL,
  CONSTRAINT rt_h_nbbo_pkey PRIMARY KEY (entry_id),
  CONSTRAINT rt_h_nbbo_pfkey FOREIGN KEY (product_id)
    REFERENCES product (product_id) MATCH SIMPLE
    ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);
ALTER TABLE rt_h_nbbo OWNER TO postgres;

Index:

CREATE INDEX rt_h_nbbo_idx
  ON rt_h_nbbo
  USING btree
  (product_id, entry_date DESC);

Test:

SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS, TIMING)
SELECT * FROM rt_h_nbbo WHERE product_id = 6508 AND entry_date = '2013-11-26';

Output:

"Index Scan using rt_h_nbbo_idx on public.rt_h_nbbo  (cost=0.00..12768.21 rows=3165 width=76) (actual time=12.549..21654.547 rows=1631 loops=1)"
"  Output: product_id, bid_price, bid_size, ask_price, ask_size, last_price, entry_date, entry_time, event_time, day_volume, day_trade_ct, entry_id"
"  Index Cond: ((rt_h_nbbo.product_id = 6508) AND (rt_h_nbbo.entry_date = '2013-11-26'::date))"
"  Buffers: shared hit=4 read=1634"
"  I/O Timings: read=21645.468"
"Total runtime: 21655.002 ms"

Hardware:

Top-of-the-line HP DL380 G7 server with 32 GB RAM, P410i RAID controller, 10K SAS drives in a RAID-1 config. WAL is on a separate RAID-1 volume with 15K SAS drives. The only unusual thing here is that I'm running on Windows Server 2008 R2.
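P.S. In case it helps the discussion: my tentative plan for deciding when to re-cluster would be to watch the physical-order correlation statistic that ANALYZE maintains, with something like the query below (assuming I'm reading pg_stats.correlation correctly - i.e., that a value near 1.0 or -1.0 for product_id means the heap is still close to index order, and values drifting toward 0 mean it's time to re-cluster):

-- Correlation between physical row order and logical column order.
-- Values near +/-1.0 mean the heap is nearly sorted on that column.
SELECT attname, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename = 'rt_h_nbbo'
  AND attname IN ('product_id', 'entry_date');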