On Thu, 2006-09-21 at 23:39 -0400, Jim Nasby wrote:
> On Sep 14, 2006, at 11:15 AM, Russ Brown wrote:
> > We recently upgraded our trac backend from sqlite to postgres, and I
> > decided to have a little fun and write some reports that delve into
> > trac's subversion cache, and got stuck with a query optimisation
> > problem.
> >
> > Table revision contains 2800+ rows.
> > Table node_change contains 370000+.
> <...>
> > I've got stuck with this query:
> >
> > SELECT author, COUNT(DISTINCT r.rev)
> > FROM revision AS r
> > LEFT JOIN node_change AS nc
> >   ON r.rev = nc.rev
> > WHERE r.time >= EXTRACT(epoch FROM (NOW() - interval '30
> > days'))::integer
>
> Man I really hate when people store time_t in a database...
>

I know. Probably something to do with database engine independence. I
don't know if sqlite even has a date type (probably does, but I haven't
checked).

> > GROUP BY r.author;
> >
> > Statistics are set to 20, and I have ANALYZEd both tables.
> >
> > The report itself isn't important, but I'm using this as an
> > exercise in PostgreSQL query optimisation and planner tuning, so
> > any help/hints would be appreciated.
>
> Setting statistics higher (100-200), at least for the large table,
> will likely help. Also make sure that you've set effective_cache_size
> correctly (I generally set it to total memory - 1G, assuming the
> server has at least 4G in it).

Thank you: the problem was effective_cache_size (which I hadn't changed
from the default of 1000). This machine doesn't have loads of RAM, but
I knocked it up to 65536 and now the query uses the index, without
having to change the statistics.

Thanks a lot!

> --
> Jim Nasby                        jimn@xxxxxxxxxxxxxxxx
> EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
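
For the archives, here's what the fix boils down to. This is a minimal
sketch, assuming the stock 8 kB block size, under which
effective_cache_size is counted in disk pages (so 65536 comes out to
512 MB). The statistics bump is Jim's suggestion, which I didn't end up
needing:

    -- Tell the planner how much cache (shared buffers plus OS cache)
    -- it can assume. In 8 kB pages, 65536 = 512 MB. Set it in
    -- postgresql.conf for a permanent change, or per-session:
    SET effective_cache_size = 65536;

    -- Jim's other suggestion: raise the statistics target on the big
    -- table's join column, then re-gather statistics:
    ALTER TABLE node_change ALTER COLUMN rev SET STATISTICS 200;
    ANALYZE node_change;

Note that effective_cache_size doesn't allocate anything; it only tells
the planner how likely index pages are to be cached, which is why the
default of 1000 (8 MB) made the index scan look so expensive.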