Re: [PATCH 16/20] builtin/gc.c: guess the size of the revindex

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/11/2021 11:23 AM, Taylor Blau wrote:
> On Mon, Jan 11, 2021 at 06:52:24AM -0500, Derrick Stolee wrote:
>> This is so far the only not-completely-obvious change.
>>
>>> But, this is an approximation anyway, and it does remove a use of the
>>> 'struct revindex_entry' from outside of pack-revindex internals.
>>
>> And this might be enough justification for it, but...
>>
>>> -	heap += sizeof(struct revindex_entry) * nr_objects;
>>> +	heap += (sizeof(off_t) + sizeof(uint32_t)) * nr_objects;
>>
>> ...outside of the estimation change, will this need another change
>> when the rev-index is mmap'd? Should this instead be an API call,
>> such as estimate_rev_index_memory(nr_objects)? That would
>> centralize the estimate to be next to the code that currently
>> interacts with 'struct revindex_entry' and will later interact with
>> the mmap region.
> 
> I definitely did consider this, and it seems that I made a mistake in
> not documenting my consideration (since I assumed that it was so benign
> nobody would notice / care ;-)).
> 
> The reason I didn't pursue it here was that we haven't yet loaded the
> reverse index by this point. So, you'd want a function that at least
> stats the '*.rev' file (and either does or doesn't parse it [1]), or
> aborts early to indicate otherwise.

In this patch, I would expect it to use sizeof(struct revindex_entry).
Later, the method would know if a .rev file exists and do the right
thing instead. (Also, should mmap'd data count towards this estimate?)

> One would hope that 'load_pack_revindex()' would do just that, but it
> falls back to load a reverse index in memory, which involves exactly the
> slow sort that we're trying to avoid. (Of course, we're going to have to
> do it later anyway, but allocating many GB of heap just to provide an
> estimation seems ill-advised to me ;-).)
> 
> So, we'd have to expand the API in some way or another, and to me it
> didn't seem worth it. As I mentioned in the commit message, I'm
> skeptical of the value of being accurate here, since this is (after all)
> an estimation.

Yes, I'm probably just poking somewhere it was easy to poke. This is
probably not worth the time I'm spending asking about it.

Feel free to disregard.

-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux