On 05/16/18 15:37, Jeff King wrote:
> Yes, that's pretty close to what we do at GitHub. Before doing any
> repacking in the mother repo, we actually do the equivalent of:
>
>   git fetch --prune ../$id.git +refs/*:refs/remotes/$id/*
>   git repack -Adl
>
> from each child to pick up any new objects to de-duplicate (our "mother"
> repos are not real repos at all, but just big shared-object stores).

Yes, I keep thinking of doing the same, too -- instead of using
torvalds/linux.git for alternates, have an internal repo where objects
from all forks are stored. This conversation may finally give me the
shove I've been needing to poke at this. :)

Is your delta-islands patch heading into upstream, or is that something
that's going to remain external?

> I say "equivalent" because those commands can actually be a bit slow. So
> we do some hacky tricks like directly moving objects in the filesystem.
>
> In theory the fetch means that it's safe to actually prune in the mother
> repo, but in practice there are still races. They don't come up often,
> but if you have enough repositories, they do eventually. :)

I feel like a whitepaper on "how we deal with bajillions of forks at
GitHub" would be nice. :) I was previously told that it's unlikely such
paper could be written due to so many custom-built things at GH, but I
would be very happy if that turned out not to be the case.

Best,
-- 
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
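
For anyone following along, the quoted fetch-then-repack step could be sketched as a small loop like the one below. This is only an illustration under stated assumptions, not what GitHub actually runs (per the quote, they use faster tricks such as moving objects directly in the filesystem). The `sync_forks` name, the two-argument layout, and the choice to repack the shared store once after all fetches are hypothetical; only the `git fetch --prune` refspec and `git repack -Adl` come from the message above.

```shell
#!/bin/sh
# Sketch only: pull every fork's refs and objects into one shared object
# store, then repack that store.
#
# Assumptions (not from the original message):
#   $1  path to the shared "mother" repository
#   $2  directory holding the forks, each named <id>.git
sync_forks() {
    mother=$1
    forks_dir=$2

    for child in "$forks_dir"/*.git; do
        # Skip the unexpanded glob when the directory has no forks.
        [ -d "$child" ] || continue
        id=$(basename "$child" .git)

        # Namespace each fork's refs under refs/remotes/<id>/ so that the
        # objects stay reachable in the shared store.
        git -C "$mother" fetch --quiet --prune "$child" \
            "+refs/*:refs/remotes/$id/*" || return 1
    done

    # -A: keep unreachable objects as loose objects instead of dropping them
    # -d: delete packs made redundant by the new one
    # -l: only pack objects local to this repository
    git -C "$mother" repack -A -d -l -q
}

# Example (hypothetical paths):
#   sync_forks /srv/git/objstore.git /srv/git/forks
```

Note that this does nothing about the prune races mentioned in the quote; it only covers the de-duplication half.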