Weird index or sort behaviour

Matthew Wakeling <matthew@xxxxxxxxxxx> · Tue, 18 Aug 2009 14:20:21 +0100 (BST)

I'm seeing some interesting behaviour. I'm executing a query where I 
perform a merge join between two copies of the same table, completely 
symmetrically, and the two sides of the merge are sourced differently.

SELECT COUNT(*)
FROM
    (SELECT DISTINCT
        l1.objectid,
        l1.id AS id1,
        l1.intermine_start AS start1,
        l1.intermine_end AS end1,
        l2.id AS id2,
        l2.intermine_start AS start2,
        l2.intermine_end AS end2
    FROM
        locationbin8000 l1,
        locationbin8000 l2
    WHERE
            l1.subjecttype = 'GeneFlankingRegion'
        AND l2.subjecttype = 'GeneFlankingRegion'
        AND l1.objectid = l2.objectid
        AND l1.bin = l2.bin
    ) AS a
WHERE
        start1 <= end2
    AND start2 <= end1;

                      QUERY PLAN
---------------------------------------------------------
 Aggregate
   (cost=703459.72..703459.73 rows=1 width=0)
   (actual time=43673.526..43673.527 rows=1 loops=1)
   ->  HashAggregate
         (cost=657324.23..677828.89 rows=2050466 width=28)
         (actual time=33741.380..42187.885 rows=17564726 loops=1)
         ->  Merge Join
               (cost=130771.22..621441.07 rows=2050466 width=28)
               (actual time=456.970..15292.997 rows=21463106 loops=1)
               Merge Cond: ((l1.objectid = l2.objectid) AND (l1.bin = l2.bin))
               Join Filter: ((l1.intermine_start <= l2.intermine_end) AND (l2.intermine_start <= l1.intermine_end))
               ->  Index Scan using locationbin8000__subjectobjectbin on locationbin8000 l1
                     (cost=0.00..72096.78 rows=670733 width=20)
                     (actual time=0.085..345.834 rows=664588 loops=1)
                     Index Cond: (subjecttype = 'GeneFlankingRegion'::text)
               ->  Sort
                     (cost=130771.22..132448.05 rows=670733 width=20)
                     (actual time=456.864..3182.638 rows=38231659 loops=1)
                     Sort Key: l2.objectid, l2.bin
                     Sort Method:  quicksort  Memory: 81690kB
                     ->  Bitmap Heap Scan on locationbin8000 l2
                           (cost=12706.60..65859.76 rows=670733 width=20)
                           (actual time=107.259..271.026 rows=664588 loops=1)
                           Recheck Cond: (subjecttype = 'GeneFlankingRegion'::text)
                           ->  Bitmap Index Scan on locationbin8000__subjecttypeid
                                 (cost=0.00..12538.92 rows=670733 width=0)
                                 (actual time=106.327..106.327 rows=664588 loops=1)
                                 Index Cond: (subjecttype = 'GeneFlankingRegion'::text)
 Total runtime: 44699.675 ms
(15 rows)

Here is the definition of the locationbin8000 table:

    Table "public.locationbin8000"
     Column      |  Type   | Modifiers
-----------------+---------+-----------
 id              | integer |
 objectid        | integer |
 intermine_start | integer |
 intermine_end   | integer |
 subjecttype     | text    |
 bin             | integer |
Indexes:
    "locationbin8000__subjectobjectbin" btree (subjecttype, objectid, bin)
    "locationbin8000__subjecttypeid" btree (subjecttype, id)

The table is clustered on the locationbin8000__subjectobjectbin index, and 
has been analysed.

So you can see, the merge join requires two inputs both ordered by 
(objectid, bin), which is readily supplied by the 
locationbin8000__subjectobjectbin index, given that I am restricting the 
subjecttype of both sides (to the same thing, I might add). Therefore, I 
would expect the merge join to feed off two identical index scans. This is 
what happens for one of the sides of the merge join, but not the other, 
even though the sides are symmetrical.

Does anyone know why it isn't doing two index scans? Given that the cost 
of the index scan is half that of the alternative, I'm surprised that it 
uses this plan.

I'm using Postgres 8.4.0

Matthew

--
"Interwoven alignment preambles are not allowed."
If you have been so devious as to get this message, you will understand
it, and you deserve no sympathy.  -- Knuth, in the TeXbook

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance