On Thu, 3 Jan 2019 at 05:28, Abadie Lana <Lana.Abadie@xxxxxxxx> wrote:
> I would appreciate any hints as this problem looks to me rather strange…I tried to google it but in vain.
>
> select t.name, c.smpl_time,c.nanosecs,c.float_val,c.num_val,c.str_val,c.datatype,c.array_val from sample c, channel t where t.channel_id=c.channel_id and t.name='BUIL-B36-VA-RT-RT1:CL0001-2-ABW' order by c.smpl_time desc limit 5;
>
> takes 20mn to execute because it picks up the wrong index…see explain analyse below. I would expect this query to use the (channel_id,smpl_time) but it uses the smpl_time index.

[...]

> Any ideas, why the planner is not taking the right index?

The planner assumes that the required channel values are evenly distributed through the scan of the index on smpl_time. If your required 5 rows had been found quickly (i.e. a channel with recent sample values), then the plan would have worked out well. It looks like 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW' is probably a channel which has some very old sample values. I can see that from "Rows Removed by Join Filter: 322099471", meaning that while scanning the smpl_time index backwards, that many rows were found not to match the channel you requested.

By default, the planner only has statistics to say how common each channel is in the sample table. In this case, since the planner has no knowledge of which channel_id it will be searching for (that's only determined during execution), I suppose it must be falling back on the n_distinct estimate of the sample.channel_id column. It would be interesting to know how far off that n_distinct estimate is. You can find out with:

select stadistinct from pg_statistic where starelid='sample'::regclass and staattnum = 1;

select count(*) from (select distinct channel_id from sample) s; -- this may take a while to run...

If the stadistinct estimate is far out from reality, then you could consider setting it manually with:

alter table sample alter column channel_id set (n_distinct = <actual value here>);

but keep in mind that, as the table evolves, whatever you set there could become outdated.

Another thing you could try, to coax the planner into doing something different, would be to give it a better index to work with:

create index on channel(name, channel_id);

You didn't show us the details of the channel table, but if there's no index like this, then adding one might reduce the cost of a Merge Join. However, since the rows output from that join would be in channel_id order, a Sort would be required, which would mean joining all matching rows, not just the first 5 matches. How many rows actually match will determine whether that's faster or not.

If you don't have luck with either of the above, then one other thing you could try would be to prevent the planner from using the smpl_time index by changing the order by to "ORDER BY c.smpl_time + INTERVAL '0 sec'". That's a bit of a hack, but we don't have anything we officially call "query hints" in PostgreSQL, so we're often left to solve issues like this with ugly tricks like that.
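Applied to your original query, that would look something like the following (just a sketch, untested; the only change is the ORDER BY expression):

select t.name, c.smpl_time, c.nanosecs, c.float_val, c.num_val, c.str_val, c.datatype, c.array_val
from sample c, channel t
where t.channel_id = c.channel_id
and t.name = 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW'
order by c.smpl_time + interval '0 sec' desc
limit 5;

With the ORDER BY rewritten as an expression, the existing smpl_time index no longer provides the required sort order, so the planner has to pick another plan.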
Also, going by:

> -> Seq Scan on channel t (cost=0.00..915.83 rows=1 width=41) (actual time=4.683..7.885 rows=1 loops=1)

perhaps "name" is unique on the channel table? (I doubt there's an index/constraint to back that up, however, since such an index would likely have been used here instead of the Seq Scan.) If so, and you can add a constraint to back that up, you might be able to rewrite the query as:

select 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW', c.smpl_time, c.nanosecs, c.float_val, c.num_val, c.str_val, c.datatype, c.array_val
from sample c
WHERE c.channel_id = (SELECT channel_id FROM channel WHERE name='BUIL-B36-VA-RT-RT1:CL0001-2-ABW')
order by c.smpl_time desc
limit 5;

If you can do that, then it's highly likely to be *very* fast to execute, since I see there's an index on (channel_id, smpl_time) on each of the inherited tables.

(If our planner were smarter, then in the presence of the correct unique index it could have rewritten the query this way automatically... but it's not / it doesn't. I believe I've mentioned improving this somewhere in the distant past on the -hackers mailing list, although I can't find it right now. I recall naming the idea "scalar value lookup joins", but development didn't get much beyond thinking of that name.)

-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services