Re: PostgreSQL 8.3.3 chooses wrong query plan when LIMIT 1 added?

Mark Cave-Ayland <mark.cave-ayland@xxxxxxxxxxxxxx> · Mon, 27 Oct 2008 22:24:30 +0000

Tom Lane wrote:

It's hoping that the backwards scan will hit a row with the requested
file_id quickly; which might be true on average but isn't true for this
particular file_id (nor, presumably, any file_id that hasn't been
updated recently).

Right. In the case of this schema, that is not true; here revision_files 
contains the (revision_id, file_id) pair for every file that is 
*present* within the given revision, not just the files that were 
touched for each revision.

You might consider a two-column index on (file_id, revision_id) to
make this type of query fast.

Interesting. Adding this index seems to bring the query time down to 
around 1s:

svnlog=# SELECT revision_id FROM revision_files WHERE file_id=(SELECT 
file_id
FROM files WHERE filepath='/trunk/app/widgets/gimptoolbox-dnd.c' LIMIT 1)
ORDER BY revision_id DESC LIMIT 1;
 revision_id
-------------
       15011
(1 row)

Time: 935.816 ms

However, some more searching came up with this "ORDER BY x + 0" 
variation which seems to consistently perform the fastest for varying 
flavours of revision_id by forcing use of the file_id index:

svnlog=# SELECT revision_id FROM revision_files WHERE file_id=(SELECT 
file_id
FROM files WHERE filepath='/trunk/app/widgets/gimptoolbox-dnd.c' LIMIT 1)
ORDER BY revision_id + 0 DESC LIMIT 1;
 revision_id
-------------
       15011
(1 row)

Time: 11.446 ms

Ah well. Even though it seems a bit of a kludge, it seems to keep the 
application performing as expected so I'll have to stick with it.

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general