On Tue, Nov 20, 2018 at 05:37:18PM +0100, Duy Nguyen wrote:

> > But in (b), we use the number of stored objects, _not_ the allocated
> > size of the objects array. So we can run into a situation like this:
> >
> >   1. packlist_alloc() needs to store the Nth object, so it grows the
> >      objects array to M, where M > N.
> >
> >   2. oe_set_tree_depth() wants to store a depth, so it allocates an
> >      array of length N. Now we've violated our invariant.
> >
> >   3. packlist_alloc() needs to store the N+1th object. But it
> >      _doesn't_ grow the objects array, since N <= M still holds. We
> >      try to assign to tree_depth[N+1], which is out of bounds.
>
> Do you think this splitting of data into packing_data is too fragile,
> and that we should just scrap the whole thing and move all the data
> back to object_entry[]? We would use more memory, of course, but
> higher memory usage is still better than more bugs (if these are
> likely to show up again).

Certainly that thought crossed my mind while working on these patches. :)
Especially given the difficulties it introduced into the recent
bitmap-reuse topic, and the size fixes we had to deal with in v2.19.

Overall, though, I dunno. This fix, while subtle, turned out not to be
too complicated. And the memory savings are real. I consider 100M
objects to be on the large end of what's feasible for stock Git these
days, and I think we are talking about on the order of 4GB of memory
savings there (roughly 40 bytes per object). You need a big machine to
handle a repository of that size, but 4GB is still appreciable.

So I guess at this point, with all (known) bugs fixed, we should stick
with it for now. If it becomes a problem for development of a future
feature, we can re-evaluate then.
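For reference, here is the buggy pattern from the quoted scenario,
boiled down. This is a sketch, not Git's actual code: the names loosely
follow pack-objects.h, the payload and growth factor are made up, and
error checking is omitted.

  #include <stdlib.h>
  #include <stdint.h>

  struct object_entry { uint32_t hash; };	/* stand-in payload */

  struct packing_data {
  	struct object_entry *objects;
  	uint32_t nr_objects;	/* N: objects actually stored */
  	uint32_t nr_alloc;	/* M: allocated size of objects[] */
  	uint32_t *tree_depth;	/* side array split out of object_entry */
  };

  /* Grow objects[] as needed and hand back the next free slot. */
  static struct object_entry *packlist_alloc(struct packing_data *pdata)
  {
  	if (pdata->nr_objects >= pdata->nr_alloc) {
  		pdata->nr_alloc = (pdata->nr_alloc + 1024) * 3 / 2;
  		pdata->objects = realloc(pdata->objects,
  				pdata->nr_alloc * sizeof(*pdata->objects));
  	}
  	return &pdata->objects[pdata->nr_objects++];
  }

  /*
   * BUGGY: sizes the side array by nr_objects (N), not nr_alloc (M).
   * After step 1 above leaves M > N, this allocates only N slots;
   * step 3 then stores another object without growing objects[], and
   * the write below lands past the end of tree_depth[].
   */
  static void oe_set_tree_depth(struct packing_data *pdata,
  				struct object_entry *e, uint32_t depth)
  {
  	if (!pdata->tree_depth)
  		pdata->tree_depth = calloc(pdata->nr_objects,
  					   sizeof(*pdata->tree_depth));
  	pdata->tree_depth[e - pdata->objects] = depth;
  }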
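And this is roughly the shape of the fix, in the same simplified terms
(the real code uses git's REALLOC_ARRAY/CALLOC_ARRAY helpers and has
another side array to keep in sync). The invariant becomes: each side
array is either NULL or exactly nr_alloc entries long.

  /* Fixed packlist_alloc(): side arrays grow in lock-step with objects[]. */
  static struct object_entry *packlist_alloc(struct packing_data *pdata)
  {
  	if (pdata->nr_objects >= pdata->nr_alloc) {
  		pdata->nr_alloc = (pdata->nr_alloc + 1024) * 3 / 2;
  		pdata->objects = realloc(pdata->objects,
  				pdata->nr_alloc * sizeof(*pdata->objects));
  		if (pdata->tree_depth)
  			pdata->tree_depth = realloc(pdata->tree_depth,
  				pdata->nr_alloc * sizeof(*pdata->tree_depth));
  	}
  	return &pdata->objects[pdata->nr_objects++];
  }

  /* Fixed oe_set_tree_depth(): a late-created side array is sized to M. */
  static void oe_set_tree_depth(struct packing_data *pdata,
  				struct object_entry *e, uint32_t depth)
  {
  	if (!pdata->tree_depth)
  		pdata->tree_depth = calloc(pdata->nr_alloc,
  					   sizeof(*pdata->tree_depth));
  	pdata->tree_depth[e - pdata->objects] = depth;
  }

-Peff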