On Fri, Feb 24, 2023 at 06:51:00PM +0100, Mickael van der Beek wrote: > Hello Justin, > > Thanks for the quick response! > > > The table may be dense, but the tuples aren't. You're asking to return > > 1/1000th of the tuples, across the entire table. Suppose there are ~100 > > tuples per page, and you need to read about every 10th page. It makes > > sense that it's slow to read a large amount of data nonsequentially. > > Ah, of course, you're right! > I forgot that the BRIN indexes store ranges that are not fully covered by > the row values and that PostgreSQL has to double-check (bitmap heap scan) > ... > Would you thus advise to only use BRIN indexes for columns who's values are > (1) monotonically increasing but also (2) close to each other? It's not important whether they're "rigidly" monotonic (nor "strictly"). What's important is that a query doesn't need to access a large number of pages. For example, some of the BRIN indexes that I'm familiar with are created on a column called "start time", but the table's data tends to be naturally sorted by "end time" - and that's good enough. If someone queries for data between 12pm and 1pm, there's surely no data for the first 12 hours of the day's table (because it hadn't happened yet) and there's probably no data for the last 9+ hours of the day, either, so it's only got to read data for a 1-2h interval in the middle. This assumes that the column's data is typically correlated. If the tuples aren't clustered/"close to each other" then it probably doesn't work well. I haven't played with brin "multi minmax", though. > > > 2. Since we only select the "idx" column, why does the BRIN index not > > > simply return the searched value if included in one of it's ranges? > > > Hitting the actual row data stored in the table seems to be unnessary no? > > > > Because it's necessary to check if the tuple is visible to the current > > transaction. It might be from an uncommited/aborted transaction. Actually, a better explanation is that all the brin scan returns is the page, and not the tuples. "BRIN indexes can satisfy queries via regular bitmap index scans, and will return all tuples in all pages within each range if the summary info stored by the index is CONSISTENT with the query conditions. The query executor is in charge of rechecking these tuples and discarding those that do not match the query conditions — in other words, these indexes are LOSSY". The index is returning pages where matching tuples *might* be found, after excluding those pages where it's certain that no tuples are found. -- Justin