Zack Weinberg <zackw@xxxxxxxxx> writes: > For each (experiment_id, url_id) pair for some small subset of the > experiment_ids, I need to query the full_url_id corresponding to the > *largest* value of redirect_num. The query planner does something > reasonable with this SELECT: > => explain (analyze, verbose) > select b.experiment_id, b.url_id, b.full_url_id > from blockpage b, > (select experiment_id, url_id, max(redirect_num) as redirect_num > from blockpage group by experiment_id, url_id) bm > where b.experiment_id = bm.experiment_id > and b.url_id = bm.url_id > and b.redirect_num = bm.redirect_num > and bm.experiment_id in (16, 43); With that query, the "bm.experiment_id in (16, 43)" restriction is pushed into the "bm" sub-select, greatly reducing the amount of work the GROUP BY step has to do. > But if I change the final part of the WHERE to reference > b.experiment_id instead of bm.experiment_id, I get this much more > expensive query plan: > => explain (analyze, verbose) > select b.experiment_id, b.url_id, b.full_url_id > from blockpage b, > (select experiment_id, url_id, max(redirect_num) as redirect_num > from blockpage group by experiment_id, url_id) bm > where b.experiment_id = bm.experiment_id > and b.url_id = bm.url_id > and b.redirect_num = bm.redirect_num > and b.experiment_id in (16, 43); With that query, the GROUP BY is evaluated in full, and it costs you. In principle, given the nearby "where b.experiment_id = bm.experiment_id" clause, we could derive "bm.experiment_id in (16, 43)" from the stated clause. But we don't. The existing machinery around derivation of implied equalities only works for simple equalities, not OR clauses. Extending that would be a bit of a research project, and it's far from clear that the benefits would be worth the additional planning costs. > What is the best way to report this to the developers? Should I file > a bug report? I'm using Postgres 12.2. This is not a bug, and you should not hold your breath waiting for it to change. regards, tom lane