On Fri, Mar 10, 2017 at 03:43:43PM -0800, Junio C Hamano wrote: > James Melvin <jmelvin@xxxxxxxxxxxxxx> writes: > > > The new --preserve-and-prune option renames old pack files > > instead of deleting them after repacking and prunes previously > > preserved pack files. > > > > This option is designed to prevent stale file handle exceptions > > during git operations which can happen on users of NFS repos when > > repacking is done on them. The strategy is to preserve old pack files > > around until the next repack with the hopes that they will become > > unreferenced by then and not cause any exceptions to running processes > > when they are finally deleted (pruned). > > This certainly is simpler than the previous one, but if you run > > git repack --preserve-and-prune > sleep for N minutes > git repack --preserve-and-prune > > the second "repack" will drop the obsoleted packs that were > preserved by the first one no matter how short the value of N is, > no? > > I suspect that users would be more comfortable with something based > on expiration timestamp that gives them a guaranteed grace period, > e.g. "--preserve-expire=8.hours", like how we expire reflog entries > and loose objects. > > Perhaps builtin/prune.c can be a source of inspiration? I have been a bit slow to read this series, but FWIW I had the exact same thought while reading up to this point. There should be an expiration, and that expiration time corresponds roughly to "how long do you expect a long-running git operation to last". You'd probably want "--preserve-expire" as an option to repack, and then a "gc.preservePackExpire" option so that "git gc" triggers it reliably. I can think of one downside of a time-based solution, though: if you run multiple gc's during the time period, you may end up using a lot of disk space (one repo's worth per gc). But that's a fundamental tension in the problem space; the whole point is to waste disk to keep helping old processes. But you may want a knob that lets you slide between those two things. For instance, if you kept a sliding window of N sets of preserved packs, and ejected from one end of the window (regardless of time), while inserting into the other end. James' existing patch is that same strategy with a hardcoded window of "1". The other variable you can manipulate is whether to gc in the first place. E.g., don't gc if there are N preserved sets (or sets consuming more than N bytes, or whatever). You could do that check outside of git entirely (or in an auto-gc hook, if you're using it). Note that something like a sliding window gets complicated pretty quickly. I'm not really proposing it as a direction; I'm just trying to think of ways the time-based system could go wrong. IMHO it would probably be fine to just ignore the problem and assume that repacking doesn't happen often enough for it to come up in practice. -Peff