GROUP BY with reasonable timings in PLAN but unreasonable execution time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have a query which seems to be taking an extraordinarily long time (many minutes, at least) when seemingly equivalent queries have different plans and execute in seconds. naturally, I'd like to know why.

Version is Postgresql 8.4.8. The table, "t", is

 Column |  Type   | Modifiers
--------+---------+-----------
 y      | integer | not null
 x      | integer | not null
 k      | integer | not null
 j      | integer | not null
 z      | integer | not null
Indexes:
    "t_pkey" PRIMARY KEY, btree (j, k, x, y, z)

The table population, in pseudocode, is this:
  for x in 0..9
    for y in 0..9999
      for z in 0..29
        INSERT INTO t VALUES(y,x,0,0,z)

So the table has 300000 entries, with j and k always 0.

The query is:

 SELECT *
   FROM (
    SELECT * FROM t GROUP BY j,k,x,z,y
   ) AS f
   NATURAL JOIN t;

The plan:

 Merge Join  (cost=44508.90..66677.96 rows=1 width=20)
  Merge Cond: ((public.t.j = public.t.j) AND (public.t.k = public.t.k)
               AND (public.t.x = public.t.x))
  Join Filter: ((public.t.y = public.t.y) AND (public.t.z = public.t.z))
  -> Group (cost=44508.90..49008.90 rows=30000 width=20)
     ->  Sort (cost=44508.90..45258.90 rows=300000 width=20)
        Sort Key: public.t.j, public.t.k, public.t.x, public.t.z,
                  public.t.y
        ->  Seq Scan on t  (cost=0.00..4911.00 rows=300000 width=20)
  ->  Index Scan using t_pkey on t  (cost=0.00..14877.18 rows=300000
                                     width=20)

This query runs at least 20 minutes, with postmaster CPU utilization at 99%, without completing. System is a 3.2GHz Zeon, 3GB memory, and not much else running.

By contrast, placing an intermediate result in a table "u" provides a result in about 3 seconds:

 CREATE TEMPORARY TABLE u AS SELECT * FROM t GROUP BY j,k,x,z,y;
 SELECT * FROM u NATURAL JOIN t;

Changing the order of the GROUP BY clause varies the plan, sometimes yielding shorter execution times. For example, this ordering executes in about 1.5 seconds:

 SELECT *
   FROM (
    SELECT * FROM t GROUP BY j,k,x,y,z
   ) AS f
   NATURAL JOIN t;

With 120 permutations, I didn't try them all.

I should note that the plans tend to have similar costs, so the query planner presumably does not know that some permutations have significantly greater execution times.

Clem Dickey

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux