On Mon, Jun 27, 2022 at 11:12:09AM -0400, Derrick Stolee wrote: > On 6/26/2022 9:10 AM, Abhradeep Chakraborty via GitGitGadget wrote: > > + table_size = st_add(table_size, > > + st_mult(ntohl(header->entry_count), > > + triplet_sz)); > > Here, we _do_ want to keep the st_mult(). Is the st_add() still necessary? It > seems this is a leftover from the previous version that had the 4-byte flag > data. > > We set table_size to zero above. We could drop that initialization and instead > have this after the "size_t triplet_sz" definition: > > size_t table_size = st_mult(ntohl(header->entry_count), > triplet_sz)); Well put, thank you. > > + if (table_size > index_end - index->map - header_size) > > + return error("corrupted bitmap index file (too short to fit lookup table)"); > > Please add "_(...)" around the error message so it can be translated. I missed this in my own review, but yes: this is a good practice. > > + if (bitmap_git->midx) > > + found = bsearch_midx(oid, bitmap_git->midx, result); > > + else > > + found = bsearch_pack(oid, bitmap_git->pack, result); > > + > > + return found; > > Here, we are doing a binary search on the entire list of packed objects, which could > use quite a few more hops than a binary search on the bitmapped commits. I think this is the best we can do if we make the key to our bsearch through the lookup table be an index into the pack index / MIDX. But... > > +static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_git, > > + struct commit *commit) > ... > > + int found = bsearch_pos(bitmap_git, oid, &commit_pos); > > + > > + if (!found) > > + return NULL; > > + > > + triplet = bsearch(&commit_pos, bitmap_git->table_lookup, bitmap_git->entry_count, > > + triplet_sz, triplet_cmp); > > But I see, you are searching the pack-index for the position in the index, and _then_ > searching the bitmap lookup table based on that position value. > > I expected something different: binary search on the triplets where the comparison is > made by looking up the OID from the [multi-]pack-index and comparing that OID to the > commit OID we are looking for. > > I'm not convinced that the binary search I had in mind is meaningfully faster than > what you've implemented here, so I'm happy to leave it as you have it. We can investigate > if that full search on the pack-index matters at all (it probably doesn't). ...exactly my thoughts, too. It's possible that it would be faster to key this search on the object_id "oid" above, and then convert each of the entries in the lookup table from a uint32_t into an object_id by calling nth_bitmap_object_oid() repeatedly. I *think* that what Abhradeep wrote here is going to be faster more often than not since it makes more efficient use of the page cache rather than switching between reads across different memory mapped regions at each point in the binary search. But of course that depends on a number of factors. Abhradeep: if you're up for it, I think it would be worth trying it both ways and seeing if one produces a meaningful speed-up or slow-down over the other. Like I said: my guess is that what you have now will be faster, but I don't have a clear sense that that is true without trying it both ways ;-). > > + if (!triplet) > > + return NULL; > > + > > + offset = triplet_get_offset(triplet); > > + xor_pos = triplet_get_xor_pos(triplet); > > + > > + if (xor_pos != 0xffffffff) { > > + int xor_flags; > > + uint64_t offset_xor; > > + uint32_t *xor_positions; > > + struct object_id xor_oid; > > + size_t size = 0; > > + > > + ALLOC_ARRAY(xor_positions, bitmap_git->entry_count); > > While there is potential that this is wasteful, it's probably not that huge, > so we can start with the "maximum XOR depth" and then reconsider a smaller > allocation in the future. There is no maximum XOR depth, to my knowledge. We do have a maximum XOR *offset*, which says we cannot XOR-compress a bitmap with an entry more than 160 entries away from the current one. But in theory every commit could be XOR compressed with the one immediately proceeding it, so the maximum depth could be as long as the entry_count itself. I think starting off with a small array and then letting it grow according to alloc_nr() would be fine here, since it will grow more and more each time, so the amount of times we have to reallocate the buffer will tail off over time. If we were really concerned about it, we could treat the buffer as a static pointer and reuse it over time (making sure to clear out the portions of it we're going to reuse, or otherwise ensuring that we don't read old data). But I doubt it matters much either way in practice: the individual records are small (at just 4 bytes each) and entry_count is often less than 1,000, so I think this probably has a vanishingly small impact. Thanks, Taylor