On Tue, Nov 30, 2021 at 3:12 AM Jeff King <peff@xxxxxxxx> wrote:
> We set transfer.unpackLimit to "1", so we never run unpack-objects at
> all. We always run index-pack, and every push, no matter how small,
> results in a pack.
>
> We also set GIT_ALLOC_LIMIT to limit any single allocation. We also
> have custom code in index-pack to detect large objects (where our
> definition of "large" is 100MB by default):
>
>   - for large blobs, we do index it as normal, writing the oid out to
>     a file which is then processed by a pre-receive hook (since people
>     often push up large files accidentally, the hook generates a nice
>     error message, including finding the path at which the blob is
>     referenced)
>
>   - for other large objects, we die immediately (with an error
>     message). 100MB commit messages aren't a common user error, and it
>     closes off a whole set of possible integer-overflow parsing
>     attacks (e.g., index-pack in strict-mode will run every tree
>     through fsck_tree(), so there's otherwise nothing stopping you
>     from having a 4GB filename in a tree).

Thank you very much for sharing. The way GitHub handles this reminds me
of what Shawn Pearce introduced in "Scaling up JGit". I suspect the
multi-pack-index and bitmaps play an important role here. I will
seriously consider this solution, thanks a lot.
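
For my own notes (and in case it helps anyone else reading along), the
stock-git part of that setup would look roughly like the sketch below.
The large-object detection in index-pack and the pre-receive hook are
custom patches on GitHub's side and are not shown here, and the values
are illustrative guesses rather than what GitHub actually uses:

  # In the server-side (bare) repository: never fall back to
  # unpack-objects, so every push, however small, is indexed by
  # index-pack and ends up as a pack file.
  git config transfer.unpackLimit 1

  # Strict checking of incoming objects; this is what makes index-pack
  # run every tree through fsck_tree() as it indexes the pack.
  git config receive.fsckObjects true

  # Cap any single allocation made by the git processes handling the
  # push. GIT_ALLOC_LIMIT has to be visible in the environment of the
  # receiving processes (e.g. via the sshd or hook environment), and
  # the accepted value format varies between git versions, so treat
  # this as a placeholder.
  export GIT_ALLOC_LIMIT=512m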