On Tue, Jun 28, 2022 at 02:29:50PM +0530, Abhradeep Chakraborty wrote:
> Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> > ...exactly my thoughts, too. It's possible that it would be faster to
> > key this search on the object_id "oid" above, and then convert each of
> > the entries in the lookup table from a uint32_t into an object_id by
> > calling nth_bitmap_object_oid() repeatedly.
> >
> > I *think* that what Abhradeep wrote here is going to be faster more
> > often than not, since it makes more efficient use of the page cache
> > rather than switching between reads across different memory-mapped
> > regions at each point in the binary search.
> >
> > But of course that depends on a number of factors. Abhradeep: if you're
> > up for it, I think it would be worth trying it both ways and seeing if
> > one produces a meaningful speed-up or slow-down over the other. Like I
> > said: my guess is that what you have now will be faster, but I don't
> > have a clear sense that that is true without trying it both ways ;-).
>
> Ok. Let me try both ways. In my opinion, my version does less searching
> and less computation, so I want to stick with it. But I would also like
> to try the other one, so that we can pick the better of the two.

Yeah, I agree with your general sense that the version as written is
going to be faster. We're comparing a smaller datatype (IOW, a 4-byte
integer that can be checked for equality in a single instruction,
instead of comparing two 20-byte OIDs), and likely flushing the cache
far less often.

But having two concrete implementations to compare will help us know
for a fact that our intuition is correct. I'll be curious to see what
you find here!

> > I think starting off with a small array and then letting it grow
> > according to alloc_nr() would be fine here, since it will grow more and
> > more each time, so the amount of times we have to reallocate the buffer
> > will tail off over time.
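[Editorial sketch: the growth behavior quoted above can be illustrated
concretely. The alloc_nr macro below is copied from Git's source; the
reallocs_needed() helper is hypothetical, added only to show how rarely
an ALLOC_GROW-style buffer is actually reallocated.]

```c
/*
 * alloc_nr() is copied verbatim from Git's source. Each step grows the
 * current allocation by roughly 1.5x (plus a constant), so
 * reallocations become rarer and rarer as the array grows.
 */
#define alloc_nr(x) (((x)+16)*3/2)

/*
 * Hypothetical helper (not in Git): count how many reallocations an
 * ALLOC_GROW-style loop performs before the array can hold n entries,
 * starting from an initial allocation of 16 slots.
 */
static int reallocs_needed(int n)
{
	int alloc = 16, steps = 0;

	while (alloc < n) {
		alloc = alloc_nr(alloc);
		steps++;
	}
	return steps;
}
```

For example, growing from 16 slots the allocation goes 48, 96, 168,
276, ... so reaching an entry_count of 1,000 takes only seven
reallocations in total.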
>
> What should be the size of that array?

I think some small power of 2 would be a reasonable choice here.

> > If we were really concerned about it, we could treat the buffer as a
> > static pointer and reuse it over time (making sure to clear out the
> > portions of it we're going to reuse, or otherwise ensuring that we
> > don't read old data). But I doubt it matters much either way in
> > practice: the individual records are small (at just 4 bytes each) and
> > entry_count is often less than 1,000, so I think this probably has a
> > vanishingly small impact.
>
> Before submitting it to the mailing list, I did use the ALLOC_GROW
> macro, but my version was worse than yours: on every iteration I was
> reallocating the array to hold `size+1` positions. I later dropped
> that code, since it could be very expensive.

That shouldn't be the case. When you have a chance, take a look at the
alloc_nr macro, which shows how much memory we allocate at each step:

    #define alloc_nr(x) (((x)+16)*3/2)

Suppose we allocated 16 slots initially, so nr (the number of entries
stored in the list) is 0 and alloc (the number of entries allocated)
is 16. Then when we try to add the 17th item, we'll pass 16 to
alloc_nr, which will allocate 48 slots. Then 96, then 168, and so on.

We only have to reallocate and copy the array when nr > alloc, which
should be fairly infrequent, and it happens less and less often the
larger the array grows.

Thanks,
Taylor
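[Editorial sketch: for completeness, here is a rough illustration of
the lookup-table binary search discussed at the top of this message.
This is not Git's actual implementation; the function and parameter
names (lookup_table_search, commit_pos) are made up. The point is that
each probe compares a single 4-byte value within one contiguous,
sorted table.]

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Illustrative sketch (not Git's code): binary search over a sorted
 * table of 4-byte positions, as in the lookup-table version discussed
 * above. Each comparison is a single integer compare, and every read
 * touches the same contiguous region of memory.
 */
static int lookup_table_search(const uint32_t *table, size_t nr,
			       uint32_t commit_pos, size_t *found)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		uint32_t entry = table[mid];

		if (entry == commit_pos) {
			*found = mid;
			return 1;
		} else if (entry < commit_pos) {
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return 0;
}
```

Keying the search on object IDs instead would turn each integer compare
into a 20-byte OID comparison plus a lookup into a different
memory-mapped region per probe, which is the page-cache cost discussed
above.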