On 7/29/2020 6:23 PM, Emily Shaffer wrote: > On Thu, Jul 23, 2020 at 05:56:33PM +0000, Derrick Stolee via GitGitGadget wrote: >> diff --git a/builtin/gc.c b/builtin/gc.c >> index eb4b01c104..889d97afe7 100644 >> --- a/builtin/gc.c >> +++ b/builtin/gc.c >> @@ -1021,19 +1021,65 @@ static int multi_pack_index_expire(void) >> return result; >> } >> >> +#define TWO_GIGABYTES (2147483647) > > [jonathan tan] This would be easier to understand if it was expressed > with bitshift, e.g. 1 << 31 This is actually ((1 << 31) - 1) because "unsigned long" is 32-bits on Windows. But it's better to not use magic numbers and instead use operations like this. >> +#define UNSET_BATCH_SIZE ((unsigned long)-1) > [jonathan tan] This looks like it's never used... and vulnerable to > cross-platform size changes because it's referring to an implicitly > sized int, and could behave differently if it was put into a larger > size, e.g. you wouldn't get 0xFFFF.. if you assigned this into a long > long. Thanks for catching this cruft. >> + for (p = get_all_packs(r); p; p = p->next) { >> + if (p->pack_size > max_size) { >> + second_largest_size = max_size; >> + max_size = p->pack_size; >> + } else if (p->pack_size > second_largest_size) >> + second_largest_size = p->pack_size; >> + } >> + >> + result_size = second_largest_size + 1; > [jonathan tan] What happens when there's only one packfile, and when > there are two packfiles? Can we write tests to illustrate the behavior? > The edge case here (result_size=1) is hard to understand by reading the > code. > >> + >> + /* But limit ourselves to a batch size of 2g */ > [emily shaffer] nit: can you please capitalize G, GB, whatever :) I suppose I could (get_unit_factor() performs case-insensitive matches on the size suffixes), except that would be inconsistent with the following error message in parse-options.c: return error(_("%s expects a non-negative integer value" " with an optional k/m/g suffix"), or these options in Documentation/git-pack-objects.txt: --window-memory=<n>:: This option provides an additional limit on top of `--window`; the window size will dynamically scale down so as to not take up more than '<n>' bytes in memory. This is useful in repositories with a mix of large and small objects to not run out of memory with a large window, but still be able to take advantage of the large window for the smaller objects. The size can be suffixed with "k", "m", or "g". `--window-memory=0` makes memory usage unlimited. The default is taken from the `pack.windowMemory` configuration variable. --max-pack-size=<n>:: In unusual scenarios, you may not be able to create files larger than a certain size on your filesystem, and this option can be used to tell the command to split the output packfile into multiple independent packfiles, each not larger than the given size. The size can be suffixed with "k", "m", or "g". The minimum size allowed is limited to 1 MiB. This option prevents the creation of a bitmap index. The default is unlimited, unless the config variable `pack.packSizeLimit` is set. Perhaps that's not really a good reason, but upper vs lower case seems to be arbitrary. Any tie breaker seems valid. Thanks, -Stolee