On Sep 14, 2006, at 11:15 AM, Russ Brown wrote:
We recently upgraded our trac backend from sqlite to postgres, and I
decided to have a little fun and write some reports that delve into
trac's subversion cache, and got stuck with a query optimisation
problem.
Table revision contains 2800+ rows
Table node_change contains 370000+.
<...>
I've got stuck with this query:
SELECT author, COUNT(DISTINCT r.rev)
FROM revision AS r
LEFT JOIN node_change AS nc
ON r.rev=nc.rev
WHERE r.time >= EXTRACT(epoch FROM (NOW() - interval '30 days'))::integer
Man I really hate when people store time_t in a database...
GROUP BY r.author;
Statistics are set to 20, and I have ANALYZEd both tables.
The report itself isn't important, but I'm using this as an exercise in
PostgreSQL query optimisation and planner tuning, so any help/hints
would be appreciated.
Setting the statistics target higher (100-200), at least for the large
table, will likely help. Also make sure that you've set
effective_cache_size correctly (I generally set it to total memory
minus 1G, assuming the server has at least 4G in it).
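
A rough sketch of what that could look like, assuming node_change.rev is
the column whose estimates are off and a box with 4G of RAM (both are
assumptions, neither is stated in the thread):

```sql
-- Raise the per-column statistics target on the large table's join
-- column (assumed here to be node_change.rev), then refresh the stats.
ALTER TABLE node_change ALTER COLUMN rev SET STATISTICS 200;
ANALYZE node_change;

-- Try effective_cache_size for this session only. A bare integer is
-- read as a count of 8 kB pages: 3G = 3 * 1024 * 1024 kB / 8 = 393216.
-- The 3G figure assumes a 4G machine (total memory minus 1G).
SET effective_cache_size = 393216;

-- Re-run the query from the original post and compare the plan and
-- timings against the earlier run.
EXPLAIN ANALYZE
SELECT author, COUNT(DISTINCT r.rev)
FROM revision AS r
LEFT JOIN node_change AS nc
ON r.rev=nc.rev
WHERE r.time >= EXTRACT(epoch FROM (NOW() - interval '30 days'))::integer
GROUP BY r.author;
```

If the plan improves, the effective_cache_size value can be made
permanent in postgresql.conf; the SET above only lasts for the session.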
--
Jim Nasby jimn@xxxxxxxxxxxxxxxx
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)