Tom Lane wrote:
It's hoping that the backwards scan will hit a row with the requested
file_id quickly; which might be true on average but isn't true for this
particular file_id (nor, presumably, any file_id that hasn't been
updated recently).
Right. In the case of this schema, that is not true; here revision_files
contains the (revision_id, file_id) pair for every file that is
*present* within the given revision, not just the files that were
touched for each revision.
You might consider a two-column index on (file_id, revision_id) to
make this type of query fast.
Interesting. Adding this index seems to bring the query time down to
around 1s:
svnlog=# SELECT revision_id FROM revision_files WHERE file_id=(SELECT
file_id
FROM files WHERE filepath='/trunk/app/widgets/gimptoolbox-dnd.c' LIMIT 1)
ORDER BY revision_id DESC LIMIT 1;
revision_id
-------------
15011
(1 row)
Time: 935.816 ms
However, some more searching came up with this "ORDER BY x + 0"
variation which seems to consistently perform the fastest for varying
flavours of revision_id by forcing use of the file_id index:
svnlog=# SELECT revision_id FROM revision_files WHERE file_id=(SELECT
file_id
FROM files WHERE filepath='/trunk/app/widgets/gimptoolbox-dnd.c' LIMIT 1)
ORDER BY revision_id + 0 DESC LIMIT 1;
revision_id
-------------
15011
(1 row)
Time: 11.446 ms
Ah well. Even though it seems a bit of a kludge, it seems to keep the
application performing as expected so I'll have to stick with it.
ATB,
Mark.
--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general