On Wed, May 16, 2018 at 03:01:13PM -0400, Konstantin Ryabitsev wrote:

> On 05/16/18 14:26, Martin Fick wrote:
> > If you are going to keep the unreferenced objects around
> > forever, it might be better to keep them around in packed
> > form?
>
> I'm undecided about that. On the one hand this does create lots of small
> files and inevitably causes (some) performance degradation. On the other
> hand, I don't want to keep useless objects in the pack, because that
> would also cause performance degradation for people cloning the "mother
> repo." If my assumptions on any of that are incorrect, I'm happy to
> learn more.

I implemented "repack -k", which keeps all objects and just rolls them
into the new pack (along with any currently-loose unreachable objects).
Aside from corner cases (e.g., where somebody accidentally added a 20GB
file to an otherwise 100MB repo and then rolled it back), it usually
doesn't significantly affect the repository size.

And it generally should not cause performance problems for people
cloning, since Git will create a custom pack for each client with only
the reachable objects.

There _is_ an interesting corner case where a reachable object might be
a delta against an unreachable one, which can cause a clone to have to
break that relationship and find a new delta. At GitHub we have some
custom code that tries to avoid these kinds of delta dependencies (not
just to unreachable objects, but to other forks that share object
storage). You can see the patch at:

  https://github.com/peff/git jk/delta-islands

-Peff
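
P.S. For what it's worth, a concrete sketch of the invocation I mean
(per the repack documentation, --keep-unreachable only takes effect when
combined with -a and -d):

  # repack everything into a single new pack; unreachable objects from
  # the old packs are appended to it instead of being dropped, and any
  # loose unreachable objects get packed as well
  git repack -a -d -k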
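
And a rough sketch of the kind of configuration the delta-island code is
driven by (treat the knob names as illustrative rather than final, and
the refs/virtual/<id>/ layout as just one way a site might namespace the
refs of forks that share object storage):

  [pack]
          # each fork's refs form their own delta island; the capture
          # group (the fork id) becomes the island name
          island = refs/virtual/([0-9]+)/heads/
          island = refs/virtual/([0-9]+)/tags/
  [repack]
          # have "git repack" enable island-aware delta selection
          useDeltaIslands = true

The idea is that pack-objects then avoids deltas that cross island
boundaries, so serving a single fork should not need delta bases that
live only in some other fork (or only among unreachable objects).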