Re: [PATCH v2 11/18] maintenance: auto-size incremental-repack batch

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/29/2020 6:23 PM, Emily Shaffer wrote:
> On Thu, Jul 23, 2020 at 05:56:33PM +0000, Derrick Stolee via GitGitGadget wrote:
>> diff --git a/builtin/gc.c b/builtin/gc.c
>> index eb4b01c104..889d97afe7 100644
>> --- a/builtin/gc.c
>> +++ b/builtin/gc.c
>> @@ -1021,19 +1021,65 @@ static int multi_pack_index_expire(void)
>>  	return result;
>>  }
>>  
>> +#define TWO_GIGABYTES (2147483647)
> 
> [jonathan tan] This would be easier to understand if it was expressed
> with bitshift, e.g. 1 << 31

This is actually ((1 << 31) - 1) because "unsigned long" is 32-bits
on Windows. But it's better to not use magic numbers and instead use
operations like this.

>> +#define UNSET_BATCH_SIZE ((unsigned long)-1)
> [jonathan tan] This looks like it's never used... and vulnerable to
> cross-platform size changes because it's referring to an implicitly
> sized int, and could behave differently if it was put into a larger
> size, e.g. you wouldn't get 0xFFFF.. if you assigned this into a long
> long.

Thanks for catching this cruft.

>> +	for (p = get_all_packs(r); p; p = p->next) {
>> +		if (p->pack_size > max_size) {
>> +			second_largest_size = max_size;
>> +			max_size = p->pack_size;
>> +		} else if (p->pack_size > second_largest_size)
>> +			second_largest_size = p->pack_size;
>> +	}
>> +
>> +	result_size = second_largest_size + 1;
> [jonathan tan] What happens when there's only one packfile, and when
> there are two packfiles? Can we write tests to illustrate the behavior?
> The edge case here (result_size=1) is hard to understand by reading the
> code.
> 
>> +
>> +	/* But limit ourselves to a batch size of 2g */
> [emily shaffer] nit: can you please capitalize G, GB, whatever :)

I suppose I could (get_unit_factor() performs case-insensitive matches
on the size suffixes), except that would be inconsistent with the
following error message in parse-options.c:

	return error(_("%s expects a non-negative integer value"
		       " with an optional k/m/g suffix"),

or these options in Documentation/git-pack-objects.txt:

--window-memory=<n>::
	This option provides an additional limit on top of `--window`;
	the window size will dynamically scale down so as to not take
	up more than '<n>' bytes in memory.  This is useful in
	repositories with a mix of large and small objects to not run
	out of memory with a large window, but still be able to take
	advantage of the large window for the smaller objects.  The
	size can be suffixed with "k", "m", or "g".
	`--window-memory=0` makes memory usage unlimited.  The default
	is taken from the `pack.windowMemory` configuration variable.

--max-pack-size=<n>::
	In unusual scenarios, you may not be able to create files
	larger than a certain size on your filesystem, and this option
	can be used to tell the command to split the output packfile
	into multiple independent packfiles, each not larger than the
	given size. The size can be suffixed with
	"k", "m", or "g". The minimum size allowed is limited to 1 MiB.
	This option
	prevents the creation of a bitmap index.
	The default is unlimited, unless the config variable
	`pack.packSizeLimit` is set.

Perhaps that's not really a good reason, but upper vs lower case
seems to be arbitrary. Any tie breaker seems valid.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux