On Wed, May 16, 2018 at 01:40:56PM -0600, Martin Fick wrote:

> > In theory the fetch means that it's safe to actually prune
> > in the mother repo, but in practice there are still
> > races. They don't come up often, but if you have enough
> > repositories, they do eventually. :)
>
> Peff,
>
> I would be very curious to hear what you think of this
> approach to mitigating the effect of those races?
>
> https://git.eclipse.org/r/c/122288/2

The crux of the problem is that we have no way to atomically mark an
object as "I am using this -- do not delete" with respect to the actual
deletion.

So if I'm reading your approach correctly, you put objects into a
purgatory rather than delete them, and let some operations rescue them
from purgatory if we had a race. That's certainly a direction we've
considered, but I think there are some open questions, like:

  1. When do you rescue from purgatory? Any time the object is
     referenced? Do you then pull in all of its reachable objects too?

  2. How do you decide when to drop an object from purgatory? And
     specifically, how do you avoid racing with somebody using the
     object as you're pruning purgatory?

  3. How do you know that an operation has been run that will actually
     rescue the object, as opposed to silently having a corrupted state
     on disk?

     E.g., imagine this sequence:

       a. git-prune computes reachability and finds that commit X is
          ready to be pruned

       b. another process sees that commit X exists and builds a commit
          that references it as a parent

       c. git-prune drops the object into purgatory

     Now we have a corrupt state created by the process in (b), since
     we have a reachable object in purgatory. But what if nobody goes
     back and tries to read those commits in the meantime?

I think this might be solvable by using the purgatory as a kind of
"lock", where prune does something like:

  1. compute reachability

  2. move candidate objects into purgatory; nobody can look into
     purgatory except us

  3. compute reachability _again_, making sure that no purgatory
     objects are used (if so, rollback the deletion and try again)

But even that's not quite there, because you need to have some
consistent atomic view of what's "used". Just checking refs isn't
enough, because some other process may be planning to reference a
purgatory object but not yet have updated the ref. So you need some
atomic way of saying "I am interested in using this object".

-Peff
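
To make the three-step "purgatory as a lock" idea concrete, here is a
rough sketch against a toy in-memory repository. Everything in it (the
Repo class, the purgatory dict, the racer callback, max_attempts) is
hypothetical and invented for illustration; it is not git's code or
API, and it only models the case where the racing process updates its
ref before the second reachability pass.

```python
# Toy model only: Repo, purgatory, racer and max_attempts are hypothetical
# names made up for this sketch; none of this is git's actual implementation.

class Repo:
    def __init__(self, objects, refs):
        self.objects = dict(objects)  # oid -> list of parent oids
        self.refs = dict(refs)        # ref name -> oid
        self.purgatory = {}           # oid -> parents; only prune looks here

    def reachable(self):
        """Walk from all ref tips, following parents even into purgatory."""
        seen, stack = set(), list(self.refs.values())
        while stack:
            oid = stack.pop()
            if oid in seen:
                continue
            seen.add(oid)
            parents = self.objects.get(oid)
            if parents is None:
                parents = self.purgatory.get(oid, [])
            stack.extend(parents)
        return seen


def prune(repo, racer=None, max_attempts=3):
    """Run the three steps; roll back and retry if a purgatory object is used."""
    for _ in range(max_attempts):
        # 1. compute reachability
        candidates = set(repo.objects) - repo.reachable()
        # 2. move candidate objects into purgatory
        for oid in candidates:
            repo.purgatory[oid] = repo.objects.pop(oid)
        # Simulate another process acting between steps 2 and 3 (once).
        if racer is not None:
            racer(repo)
            racer = None
        # 3. compute reachability again; any purgatory object in use?
        if repo.reachable() & set(repo.purgatory):
            repo.objects.update(repo.purgatory)  # roll back the deletion
            repo.purgatory.clear()
            continue                             # and try again
        deleted = set(repo.purgatory)
        repo.purgatory.clear()                   # really delete
        return deleted
    raise RuntimeError("prune kept racing; giving up")


def racing_commit(repo):
    # The other process saw X before it was moved, built Y with X as its
    # parent, and has now updated a ref to point at Y.
    repo.objects["Y"] = ["X"]
    repo.refs["topic"] = "Y"


repo = Repo({"A": [], "X": ["A"], "Z": []}, {"main": "A"})
deleted = prune(repo, racer=racing_commit)
```

In this run the second pass notices that X is reachable through the new
ref, rolls back, and the retry deletes only Z. If racing_commit instead
updated its ref after the second pass, the sketch would delete X anyway,
which is exactly the gap described in the last paragraph above.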