On Tue, Oct 25, 2016 at 1:07 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> - splitIndex.sharedIndexExpire
>>
>> To make sure that old sharedindex files are eventually removed
>> when a new one has been created, we "touch" the shared index file
>> every time it is used by a new split index file. Then we can
>> delete shared indexes with an mtime older than one week (by
>> default), when we create a new shared index file. The new
>> "splitIndex.sharedIndexExpire" config option lets people tweak
>> this grace period.
>
> I do not quite understand this justification. Doesn't each of the
> "this hold only changes since the base index file" files have a
> backpointer that names the base index file it is a delta against?

Yes, but the shared file does not have pointers to all the files that
need it, and there could be more than one. We know one of them,
$GIT_DIR/index, and possibly $GIT_DIR/index.lock too. But for index
files that people generate manually and refer to via $GIT_INDEX_FILE,
we can't know where they are.

> Is it safe to remove a base index file when there is still a split
> index file that points at it?
>
> IOW, I do not see why it can be safe for the expiration decision to
> be based on timestamp (I would understand it if it were based on a
> refcnt, though).

The problem is that we can't maintain these ref counts cheaply and
simply. We don't want to update the sharedindex file every time
somebody references it (or stops referencing it), because that defeats
the purpose of splitting it out and not touching it any more. Adding a
separate file for the ref count could work, but it gets complex,
especially when we think about race conditions at update time.

Timestamps allow us to say: ok, this base index file has not been read
by anybody for N+ hours (or better, days), so it's most likely not
referenced by any temporary index files (including
$GIT_DIR/index.lock) anymore, because those files, by the definition
of "temporary", must be gone by now.

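To make the timestamp idea concrete, here is a rough sketch in Python.
It is not what the series implements (I have not checked); the function
names, the "keep" parameter and the "sharedindex.*" glob are all
assumptions made for illustration. It only shows the two halves of the
scheme: touch the shared index whenever it is used, and expire by mtime
when a new one is created.

```python
import os
import time
from pathlib import Path

# Default grace period from the proposal: one week, in seconds.
ONE_WEEK = 7 * 24 * 60 * 60


def touch_shared_index(path, now=None):
    """Refresh the mtime of a shared index to record that it was just used."""
    now = time.time() if now is None else now
    os.utime(path, (now, now))


def expire_shared_indexes(git_dir, keep, grace=ONE_WEEK, now=None):
    """Delete sharedindex.* files in git_dir whose mtime is older than grace.

    `keep` (hypothetical parameter) is the name of the shared index that
    $GIT_DIR/index currently points to; it is never removed, whatever its
    mtime says. Returns the sorted names of the files that were deleted.
    """
    now = time.time() if now is None else now
    removed = []
    for path in Path(git_dir).glob("sharedindex.*"):
        if path.name == keep:
            continue  # still referenced by the main index
        if now - path.stat().st_mtime > grace:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Note that nothing here needs to know who references a shared index,
which is the whole point: an index file reached only through
$GIT_INDEX_FILE keeps its base alive simply by touching it.
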
We should definitely check and make sure that the file $GIT_DIR/index
points to still exists. I have not read the series yet, so I don't
know whether it already does that. Races while $GIT_DIR/index is being
updated will probably be harder to handle, but a sufficiently long
grace period on the timestamps should avoid them.
--
Duy