> As an example, we're dealing with millions of rows where we often want to find or summarize by a category value. So, maybe 6-10 categories that are used in various queries. It's not realistic for us to anticipate every field combination the category field is going to be involved in to lay down multi-column indexes everywhere.

Apologies if I’ve missed this somewhere else in the thread, but I’ve not seen anyone suggest that bloom indexes[1] be thrown into the mix. Depending on your use case, you might be able to replace many multi-column btree indexes with a single bloom index, optimizing its size vs. performance using the “length” parameter. You could even reduce the number of bits generated for
low-cardinality columns to 1, which should reduce the number of false positives that are later removed by a condition recheck.

Maybe there’s a good reason for their omission, but I’d like to learn if I’m completely off the mark!

Steve.
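
P.S. In case it helps to picture why one signature-based index can serve any combination of columns, here is a toy model in Python. It is only a sketch of the general idea and not how PostgreSQL implements it: the SHA-256 hashing, the `LENGTH` and `BITS` values, the sample rows, and every function name are assumptions made up for illustration. Each row gets one fixed-length signature, a query builds a signature from just the columns it constrains, and any row whose signature covers the query's bits becomes a candidate that is then rechecked against the real values.

```python
import hashlib
from itertools import product

LENGTH = 80        # signature size in bits (illustrative value)
BITS = (2, 2, 1)   # bits set per column; 1 for the low-cardinality column


def column_bits(col_no, value, length, bits):
    """Pick up to `bits` positions in a `length`-bit signature for one value."""
    positions = set()
    for i in range(bits):
        digest = hashlib.sha256(f"{col_no}:{i}:{value}".encode()).digest()
        positions.add(int.from_bytes(digest[:8], "big") % length)
    return positions


def signature(values, length=LENGTH, bits_per_col=BITS):
    """Union the bit positions of every non-None column value."""
    sig = set()
    for col_no, value in enumerate(values):
        if value is not None:
            sig |= column_bits(col_no, value, length, bits_per_col[col_no])
    return sig


def search(rows, sigs, query):
    """Return (candidates, hits); None in `query` means 'column not constrained'."""
    q = signature(query)
    # Signature test: never misses a real match, but may let false positives through.
    candidates = [r for r, s in zip(rows, sigs) if q <= s]
    # Recheck against the actual values to discard the false positives.
    hits = [r for r in candidates
            if all(qv is None or qv == rv for qv, rv in zip(query, r))]
    return candidates, hits


# 50 ids x 3 colours x 2 flags = 300 sample rows (hypothetical data).
rows = list(product(range(50), ("red", "green", "blue"), ("a", "b")))
sigs = [signature(r) for r in rows]

# Any column combination works against the same set of signatures.
candidates, hits = search(rows, sigs, (7, None, "b"))
print(len(candidates), "candidates,", len(hits), "real matches")
```

Changing `LENGTH` and the per-column entries in `BITS` and watching how the candidate count moves relative to the real matches is a cheap way to get a feel for the size vs. false-positive trade-off before trying it on real data.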