Darren, * Darren Lafreniere (dlafreniere@xxxxxxxxxxx) wrote: > Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > > > Gavin Wahl wrote: > > >> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You > > >> just find the page range with the largest/smallest value, and then only > > >> scan that one. Would that be hard to implement? I'm interested in > > working > > >> on it if someone can give me some pointers. > > > > I think this proposal is fairly broken anyway. The page range with the > > largest max-value may once have contained the largest live row, but > > there's no guarantee that it still does. It might even be completely > > empty. You could imagine an algorithm like this: > > > > 1. Find page-range with largest max. Scan it to identify live row with > > largest value. If *no* live values, find page-range with next largest > > max, repeat until no page ranges remain (whereupon return NULL). > > > > 2. For each remaining page-range whose indexed max exceeds the value > > currently in hand, scan that page-range to see if any value exceeds > > the one in hand, replacing the value if so. > > > > This'd probably allow you to omit scanning some of the page-ranges > > in the table, but in a lot of cases you'd end up scanning many of them; > > and you'd need a lot of working state to remember which ranges you'd > > already looked at. It'd certainly always be a lot more expensive than > > answering the same question with a btree index, because in no case do > > you get to avoid scanning the entire contents of the index. [...] > A b-tree index would certainly be faster for ordering. But in scenarios > where you have huge datasets that can't afford the space or update time > required for b-tree, could such a BRIN-accelerated ordering algorithm at > least be faster than ordering with no index? For at least some of the common BRIN use-cases, where the rows are inserted in-order and never/very-rarely modified or deleted, this approach would work very well. Certainly, using this would be much cheaper than a seqscan/top-N sort, for small values of 'N', relative to the number of rows in the table, in those cases. In general, I like the idea of supporting this as BRIN indexes strike me as very good for very large tables which have highly clumped data in them and being able to do a top-N query on those can be very useful at times. Thanks! Stephen
Attachment:
signature.asc
Description: Digital signature