On Mon, Apr 11, 2011 at 4:11 PM, Lucas Madar <madar@xxxxxxxxxx> wrote: > I have a database that contains many tables, each with some common > characteristics. For legacy reasons, they have to be implemented in a way so > that they are *all* searchable by an older identifier to find the newer > identifier. To do this, we've used table inheritance. > > Each entry has an id, as well as a legacyid1 and legacyid2. There's a master > table that the application uses, containing a base representation and common > characteristics: > > objects ( id, ... ) > item ( id, legacyid1, legacyid2 ) > | - itemXX > | - itemYY > > There is nothing at all in the item table, it's just used for inheritance. > However, weird things happen when this table is joined: > > EXPLAIN ANALYZE SELECT * FROM objects INNER JOIN item f USING ( id ); > > QUERY PLAN > ------------ > Hash Join (cost=457943.85..1185186.17 rows=8643757 width=506) > Hash Cond: (f.id = objects.id) > -> Append (cost=0.00..224458.57 rows=8643757 width=20) > -> Seq Scan on item f (cost=0.00..26.30 rows=1630 width=20) > -> Seq Scan on itemXX f (cost=0.00..1.90 rows=90 width=20) > -> Seq Scan on itemYY f (cost=0.00..7.66 rows=266 width=20) > -> Seq Scan on itemZZ f (cost=0.00..1.02 rows=2 width=20) > ... > -> Hash (cost=158447.49..158447.49 rows=3941949 width=490) > -> Seq Scan on objects (cost=0.00..158447.49 rows=3941949 > width=490) > > This scans everything over everything, and obviously takes forever (there > are millions of rows in the objects table, and tens of thousands in each > itemXX table). > > However, if I disable seqscan (set enable_seqscan=false), I get the > following plan: > > QUERY PLAN > ------------ > Hash Join (cost=10001298843.53..290002337961.71 rows=8643757 width=506) > Hash Cond: (f.id = objects.id) > -> Append (cost=10000000000.00..290000536334.43 rows=8643757 width=20) > -> Seq Scan on item f (cost=10000000000.00..10000000026.30 > rows=1630 width=20) > -> Index Scan using xxx_pkey on itemXX f (cost=0.00..10.60 rows=90 > width=20) > -> Index Scan using yyy_pkey on itemYY f (cost=0.00..25.24 > rows=266 width=20) > -> Index Scan using zzz_pkey on itemZZ f (cost=0.00..9.28 rows=2 > width=20) > ... > -> Hash (cost=999347.17..999347.17 rows=3941949 width=490) > -> Index Scan using objects_pkey on objects (cost=0.00..999347.17 > rows=3941949 width=490) > > This seems like a much more sensible query plan. I don't think so. Scanning the index to extract all the rows in a table is typically going to be a lot slower than a sequential scan. A more interesting question is why you're not getting a plan like this: Nested Loop -> Seq Scan on objects -> Append -> Index Scan using xxx_pkey on itemXX -> Index Scan using yyy_pkey on itemYY -> Index Scan using zzz_pkey on itemZZ > But it seems to think doing > a sequential scan on the *empty* item table is excessively expensive in this > case. > > Aside from enable_seqscan=false, is there any way I can make the query > planner not balk over doing a seqscan on an empty table? Why would you care? A sequential scan of an empty table is very fast. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance