On Thu, Oct 27, 2016 at 12:25 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Tue, Oct 25, 2016 at 5:43 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>>>  static int write_shared_index(struct index_state *istate,
>>> @@ -2211,8 +2269,11 @@ static int write_shared_index(struct index_state *istate,
>>>  	}
>>>  	ret = rename_tempfile(&temporary_sharedindex,
>>>  			      git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
>>> -	if (!ret)
>>> +	if (!ret) {
>>>  		hashcpy(si->base_sha1, si->base->sha1);
>>> +		clean_shared_index_files(sha1_to_hex(si->base->sha1));
>>
>> This operation is technically garbage collection and should belong in
>> "git gc --auto", which is already called automatically in a few
>> places. Is it not called often enough that we need to do the cleanup
>> right after a new shared index is created?
>
> Christian, if we assume we go with Junio's suggestion to disable
> split-index on temporary files, the only files left we have to take
> care of are index and index.lock. I believe pruning here in this code
> has an advantage over doing it in "git gc --auto": when this code is
> executed, we know we are holding index.lock, so nobody else is
> updating the index, and the cleanup is race-free. All we need to do is
> peek in $GIT_DIR/index to see which shared index file it requires and
> keep that one alive too; the remaining shared index files can be
> deleted safely. We don't even need to fall back to mtime.
>
> git-gc just can't match this, because while it is running somebody
> else may be updating $GIT_DIR/index. Handling the races would be a
> lot harder.

Yeah, I agree that if we disable split-index on temporary files and on
commands like "git read-tree --index-output=<file>", then
write_shared_index() is the right place to do the cleanup.

(In fact, when I wrote the previous RFC series I didn't think about
those special cases, and that was the reason why I did it like this.
So I just need to go back to the implementation that was in the
previous RFC series.)

I am still wondering whether disabling split-index on temporary files
could have a bad performance impact for some use cases, but I guess we
can always come back to that problem if it happens.

Thanks,
Christian
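
PS: Just to make the idea above concrete, here is a rough, standalone
sketch of the kind of pruning Duy describes. This is not the actual
patch: the function name prune_shared_index_files() and the keep[]
parameter are made up for illustration. In the real code the caller
would fill keep[] with the hash of the shared index we have just
written plus the one that $GIT_DIR/index still points at, and would
only call this while holding index.lock.

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Illustration only: delete every "sharedindex.<hex>" file in git_dir
 * whose hex suffix is not listed in keep[].  Because the caller holds
 * index.lock, no other process can switch $GIT_DIR/index to a shared
 * index file that we are about to remove.
 */
static void prune_shared_index_files(const char *git_dir,
				     const char **keep, int nr_keep)
{
	DIR *dir = opendir(git_dir);
	struct dirent *de;
	char path[4096];
	int i;

	if (!dir)
		return;
	while ((de = readdir(dir)) != NULL) {
		const char *hex;
		int keep_it = 0;

		if (strncmp(de->d_name, "sharedindex.", 12))
			continue;
		hex = de->d_name + 12;
		for (i = 0; i < nr_keep; i++)
			if (!strcmp(hex, keep[i]))
				keep_it = 1;
		if (keep_it)
			continue;
		snprintf(path, sizeof(path), "%s/%s", git_dir, de->d_name);
		if (unlink(path))
			fprintf(stderr, "warning: unable to unlink %s\n", path);
	}
	closedir(dir);
}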