On Fri, May 10, 2019 at 01:20:55AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Michael Haggerty and I have (off-list) discussed variations on that, but
> > it opens up a lot of new issues. Moving something into quarantine isn't
> > atomic. So you've still corrupted the repo, but now it's recoverable by
> > reaching into the quarantine. Who notices that the repo is corrupt, and
> > how? When do we expire objects from quarantine?
> >
> > I think the heart of the issue is really the lack of atomicity in the
> > operations. You need some way to mark "I am using this now" in a way
> > that cannot race with "looks like nobody is using this, so I'll delete
> > it".
> >
> > And ideally without traversing large bits of the graph on the writing
> > side, and without requiring any stop-the-world locks during pruning.
>
> I was thinking (but realize now that I didn't articulate) that the "gc
> quarantine" would be another "alternate" implementing a copy-on-write
> "lockless delete-but-be-able-to-rollback scheme" as you put it.
>
> So "gc" would decide (racily) what's unreachable, but instead of
> unlink()-ing it would "mv" the loose object/pack into the
> "unreferenced-objects" quarantine.
>
> Then in your example #1 "wants to reference ABCD. It sees that we have
> it." would race on the "other side". I.e. maybe ABCD was *just* moved to
> the quarantine. But in that case we'd move it back, which would bump the
> mtime and thus make it ineligible for expiry.

I think this is basically the same as the current freshening scheme,
though. In general, you can replace "move it back" with "update its
mtime". Neither is atomic with respect to other operations.

It does seem like the twist is that "gc" is supposed to do the "move it
back" step (and it's also the thing expiring, if we assume that there's
only one gc running at a time). But again, how do we know somebody isn't
referencing it _right now_ while we're deciding whether to move it back?

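To make the non-atomicity concrete, here is a rough sketch of an
mtime-based freshen/expire pair and one bad interleaving. It is a toy
model, not git's actual code: the function names, the grace period, and
the single-file "object" are all assumptions for illustration.

```python
import os
import tempfile
import time

GRACE_SECONDS = 14 * 24 * 3600  # hypothetical grace period


def freshen(path):
    """Writer side: say "I am using this now" by bumping the mtime."""
    if not os.path.exists(path):   # W1: see that we have the object
        return False
    os.utime(path, None)           # W2: set mtime to the current time
    return True


def is_expired(path, now):
    """Pruner side: decide (racily) whether the object looks unused."""
    return now - os.stat(path).st_mtime > GRACE_SECONDS


def prune(path, now):
    """Check-then-delete; nothing makes the two steps atomic."""
    if is_expired(path, now):      # P1: looks like nobody is using this
        os.unlink(path)            # P2: so delete it
        return True
    return False


def demonstrate_race():
    """Replay the interleaving P1, W1, W2, P2 by hand."""
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "abcd")
        open(obj, "w").close()
        old = time.time() - 2 * GRACE_SECONDS
        os.utime(obj, (old, old))              # make the object look stale

        expired = is_expired(obj, time.time())  # P1: pruner decides
        freshened = freshen(obj)                # W1+W2: writer "claims" it
        if expired:
            os.unlink(obj)                      # P2: pruner acts on stale decision
        return expired, freshened, os.path.exists(obj)
```

In that interleaving the writer's freshen succeeds, yet the object is
gone afterwards. Swapping os.utime() for a rename out of a quarantine
directory leaves the same window between the pruner's decision and its
action.
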
I think there are lots of solutions you can come up with if you have
atomicity. But fundamentally it isn't there in the way we handle updates
now.

You could imagine something like a shared/unique lock where anybody
updating a ref takes the "shared" side, and multiple entities can hold
it at once. But somebody pruning takes the "unique" side and excludes
everybody else, stopping ref updates during the prune (which you'd
obviously want to do in a way that you hold the lock for as short as
possible; say, optimistically check reachability without the lock, then
take the lock and check to see if anything has changed).

(By shared/unique I basically mean a reader/writer lock, but I didn't
want to use those terms in the paragraph since both holders are
writing).

It is tricky to find out when to hold the shared lock, though. It's
_not_ just a ref write, for example. When you accept a push, you'd want
to hold the lock while you are checking that you have all of the
necessary objects to write the ref. For something like "git commit" it's
even harder, because we implicitly rely on state created by commands run
over the course of hours or days (e.g., "git add" to put a blob in the
index and maybe create the tree via cache-tree, then a commit to
reference it, and finally the ref write; each step adds state which the
next step relies on).

> Aside from that, I have a hunch that while it's theoretically true that
> you can at any time re-reference some loose blob/tree/commit again, that
> the likelyhood of that in practice goes down as it ages, since a user is
> likely to e.g. re-push or rename some branch they pushed last week, not
> last year.
>
> Hence the mention of creating "unreferenced packs" with some new
> --keep-unreachable mode. Since we'd pack those together they wouldn't
> create the "ref explosion" problem we have with the loose refs, and thus
> you could afford to keep them longer (even though the deltas would be
> shittier).

Yeah, that may make it less likely (and we'd like those unreferenced
packs for other reasons anyway, so it's certainly worth a shot). But the
whole race is kind of unlikely in the first place. If you have enough
repositories, you see it eventually. ;)

-Peff
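
For reference, the shared/unique lock idea from earlier in this message
could look roughly like the sketch below. It uses flock(2) through
Python's fcntl module (so Unix only). The lock file, the repo_lock()
and prune() helpers, and the snapshot() callback (standing in for "has
anything changed?") are hypothetical; nothing like this exists in git
today, and it does not address the hard part, which is knowing when
writers should hold the shared side.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def repo_lock(lock_path, unique=False, blocking=True):
    """Hold the shared (ref updater) or unique (pruner) side of a lock."""
    flags = fcntl.LOCK_EX if unique else fcntl.LOCK_SH
    if not blocking:
        flags |= fcntl.LOCK_NB
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # With LOCK_NB this raises BlockingIOError if a conflicting
        # lock is held through another open of the same file.
        fcntl.flock(fd, flags)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def prune(lock_path, snapshot, find_unreachable, delete, max_attempts=3):
    """Optimistic prune: do the slow walk unlocked, recheck under the lock.

    snapshot() returns a cheap token for the state being protected;
    find_unreachable() returns candidate object names; delete(name)
    removes one. All three are caller-supplied assumptions.
    """
    for _ in range(max_attempts):
        before = snapshot()
        candidates = list(find_unreachable())    # slow part, no lock held
        with repo_lock(lock_path, unique=True):  # writers are excluded here
            if snapshot() == before:             # nothing changed meanwhile
                for name in candidates:
                    delete(name)
                return candidates
        # Something changed during the walk; drop the lock and retry.
    return []
```

A writer would wrap its "check that I have the objects, then update the
ref" sequence in `with repo_lock(path):`, and the pruner only holds the
unique side for the short recheck-and-delete step.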