Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files

Duy Nguyen <pclouds@xxxxxxxxx> · Thu, 27 Oct 2016 17:25:38 +0700

On Tue, Oct 25, 2016 at 5:43 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>>  static int write_shared_index(struct index_state *istate,
>> @@ -2211,8 +2269,11 @@ static int write_shared_index(struct index_state *istate,
>>         }
>>         ret = rename_tempfile(&temporary_sharedindex,
>>                               git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
>> -       if (!ret)
>> +       if (!ret) {
>>                 hashcpy(si->base_sha1, si->base->sha1);
>> +               clean_shared_index_files(sha1_to_hex(si->base->sha1));
>
> This operation is technically garbage collection and should belong to
> "git gc --auto", which is already called automatically in a few
> places. Is it not called often enough that we need to do the cleaning
> up right after a new shared index is created?

Christian, if we assume to go with Junio's suggestion to disable
split-index on temporary files, the only files left we have to take
care of are index and index.lock. I believe pruning here in this code
will have an advantage over in "git gc --auto" because when this is
executed, we know we're holding index.lock, so nobody else is updating
the index, it's race-free. All we need to do is peek in $GIT_DIR/index
to see what shared index file it requires and keep it alive too, the
remaining of shared index files can be deleted safely. We don't even
need to fall back to mtime.

git-gc just can't match this because while it's running, somebody else
may be updating $GIT_DIR/index. Handling races would be a lot harder.
-- 
Duy