* Tom Lane (tgl@xxxxxxxxxxxxx) wrote: > Yeah, it's spending quite a lot of time finding the first matching row > in each child table. I'm curious why that is though; are the child > tables not set up with nonoverlapping firstloadtime ranges? They are set up w/ nonoverlapping firstloadtime ranges, using CHECK constraints such as: "osmoduleloads_2011_09_14_firstloadtime_check" CHECK (firstloadtime >= 129604464000000000::bigint::numeric AND firstloadtime < 129605328000000000::bigint::numeric) The issue here is that the query is saying "Give me the first 150 records with this host_id in this week-long range". PG happily eliminates all the tables that are outside of the week-long range during constraint exclusion. After that, however, it hunts down the earliest records (which matches 'host_id') from each child table. Sure, from each table there's a record in the week-long range with the host_id that matches. What PG doesn't realize is that it can stop after pulling the 150 records from the most recent table (and flipping the direction of the query or the tables doesn't help- PG still pulls a record from each table). > > What would be great is if PG would realize that the CHECK constraints > > prevent earlier records from being in these earlier tables, > > The explain shows that that isn't the case, because it *is* finding at > least one candidate row in each table. It's just running quite far into > the firstloadtime sequence to do it. My point above is that the CHECK constraints ensure an ordering which could be leveraged to use the latest table first and then stop if enough tuples are returned (or immediately go to the next table), without ever considering the other tables. I'm not looking for PG to eliminate those other tables for consideration in all cases- if the limit is large enough, it may get all the way down to them. I'm pretty sure this isn't something which PG does today and I don't expect teaching it to do this to be trivial, but it certainly would be nice as this strikes me as a very common use-case. > If you're stuck with this table arrangement, one thing that would help > is a two-column index on (host_guid, firstloadtime) on each child table. Agreed, I mentioned this to the OP previously and it's on his list of things to try. Thanks, Stephen
Attachment:
signature.asc
Description: Digital signature