[PostgreSQL 11.5 (Ubuntu 11.5-1.pgdg18.04+1) on x86_64-pc-linux-gnu] I have a table with many columns and many indexes (actually many tables with many columns and many indexes), but for the sake of this posting, we'll consider just three columns, which we unimaginatively call a, b, and c. There are also three indexes: t_a_idx btree (a) WHERE a IS NOT NULL t_b_idx btree (b) WHERE b IS NOT NULL t_a_b_idx btree (a, b) WHERE a IS NOT NULL AND b IS NOT NULL Nowe I have a query select c from t where a='A' and b='B'; This uses t_b_idx, not - as I expected - t_a_b_idx. If I create an index with the columns swapped: t_b_a_idx btree (b, a) WHERE b IS NOT NULL and a IS NOT NULL this index will be used. The distribution of values in columns a and b is quite different: a has 10 different values of similar frequency (and no null values). b has only a single non-null value which with a frequency of about 1 %. So I definitely understand why it would prefer t_b_idx to t_a_idx, but certainly t_a_b_idx should be even better? After all it would have to read only 1/1000 of the rows instead of 1/100. it would also have to scan much less of the index, so the fact the fact that the index is a bit larger shouldn't make a difference. Explain shows that the row estimates are spot on, but the cost for using t_a_b_idx is higher than for t_b_idx (which is in turn higher than for t_b_a_idx). hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@xxxxxx | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment:
signature.asc
Description: PGP signature