Re: TB-sized databases

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Tom Lane wrote:
Ron Mayer <rm_pg@xxxxxxxxxxxxxxxxxxxxxxx> writes:
Would another possible condition for considering
Cartesian joins be be:

   * Consider Cartesian joins when a unique constraint can prove
     that at most one row will be pulled from one of the tables
     that would be part of this join?

What for?  That would still lead us to consider large numbers of totally
useless joins.

			regards, tom lane

Often I get order-of-magnitude better queries by forcing the cartesian
join even without multi-column indexes.

Explain analyze results below.



Here's an example with your typical star schema.
  fact is the central fact table.
  d_ref is a dimension table for the referrer
  d_uag is a dimension table for the useragent.

Forcing the cartesan join using "offset 0" makes
the the query take 14 ms (estimated cost 7575).

If I don't force the cartesian join the query takes
over 100ms (estimated cost 398919).

Indexes are on each dimension; but no multi-column
indexes (since the ad-hoc queries can hit any permutation
of dimensions).

logs=# explain analyze select * from fact natural join (select * from d_ref natural join d_uag where ref_host = 'www.real.com' and ref_path = '/products/player/more_info/moreinfo.html' and ref_query = '?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV=' and useragent = 'Mozilla/4.08 [en] (WinNT; U ;Nav)' offset 0 ) as a;
                                                                                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=6465.12..7575.91 rows=367 width=2096) (actual time=14.152..14.192 rows=4 loops=1)
   ->  Limit  (cost=0.00..14.22 rows=1 width=218) (actual time=0.084..0.102 rows=1 loops=1)
         ->  Nested Loop  (cost=0.00..14.22 rows=1 width=218) (actual time=0.082..0.096 rows=1 loops=1)
               ->  Index Scan using i_ref__val on d_ref  (cost=0.00..7.83 rows=1 width=127) (actual time=0.056..0.058 rows=1 loops=1)
                     Index Cond: (((ref_path)::text = '/products/player/more_info/moreinfo.html'::text) AND ((ref_host)::text = 'www.real.com'::text) AND ((ref_query)::text = '?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV='::text))
               ->  Index Scan using i_uag__val on d_uag  (cost=0.00..6.38 rows=1 width=91) (actual time=0.020..0.029 rows=1 loops=1)
                     Index Cond: ((useragent)::text = 'Mozilla/4.08 [en] (WinNT; U ;Nav)'::text)
   ->  Bitmap Heap Scan on fact  (cost=6465.12..7556.18 rows=367 width=32) (actual time=14.053..14.066 rows=4 loops=1)
         Recheck Cond: ((fact.uag_id = a.uag_id) AND (fact.ref_id = a.ref_id))
         ->  BitmapAnd  (cost=6465.12..6465.12 rows=367 width=0) (actual time=14.016..14.016 rows=0 loops=1)
               ->  Bitmap Index Scan on i__fact__uag_id  (cost=0.00..2770.83 rows=196223 width=0) (actual time=2.258..2.258 rows=7960 loops=1)
                     Index Cond: (fact.uag_id = a.uag_id)
               ->  Bitmap Index Scan on i__fact__ref_id  (cost=0.00..3581.50 rows=253913 width=0) (actual time=9.960..9.960 rows=13751 loops=1)
                     Index Cond: (fact.ref_id = a.ref_id)
 Total runtime: 14.332 ms
(15 rows)

logs=#



logs=# explain analyze select * from fact natural join (select * from d_ref natural join d_uag where ref_host = 'www.real.com' and ref_path = '/products/player/more_info/moreinfo.html' and ref_query = '?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV=' and useragent = 'Mozilla/4.08 [en] (WinNT; U ;Nav)' ) as a;
                                                                                                                               QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=2827.72..398919.05 rows=1 width=242) (actual time=78.777..107.038 rows=4 loops=1)
   Hash Cond: (fact.ref_id = d_ref.ref_id)
   ->  Nested Loop  (cost=2819.88..398908.65 rows=511 width=119) (actual time=6.311..101.843 rows=7960 loops=1)
         ->  Index Scan using i_uag__val on d_uag  (cost=0.00..6.38 rows=1 width=91) (actual time=0.021..0.029 rows=1 loops=1)
               Index Cond: ((useragent)::text = 'Mozilla/4.08 [en] (WinNT; U ;Nav)'::text)
         ->  Bitmap Heap Scan on fact  (cost=2819.88..396449.49 rows=196223 width=32) (actual time=6.273..91.645 rows=7960 loops=1)
               Recheck Cond: (fact.uag_id = d_uag.uag_id)
               ->  Bitmap Index Scan on i__fact__uag_id  (cost=0.00..2770.83 rows=196223 width=0) (actual time=5.117..5.117 rows=7960 loops=1)
                     Index Cond: (fact.uag_id = d_uag.uag_id)
   ->  Hash  (cost=7.83..7.83 rows=1 width=127) (actual time=0.069..0.069 rows=1 loops=1)
         ->  Index Scan using i_ref__val on d_ref  (cost=0.00..7.83 rows=1 width=127) (actual time=0.059..0.062 rows=1 loops=1)
               Index Cond: (((ref_path)::text = '/products/player/more_info/moreinfo.html'::text) AND ((ref_host)::text = 'www.real.com'::text) AND ((ref_query)::text = '?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV='::text))
 Total runtime: 107.193 ms
(13 rows)

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux