Re: [PATCH v1 00/19] Add configuration options for split-index

On Tue, Oct 25, 2016 at 1:07 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>     - splitIndex.sharedIndexExpire
>>
>>     To make sure that old sharedindex files are eventually removed
>>     when a new one has been created, we "touch" the shared index file
>>     every time it is used by a new split index file. Then we can
>>     delete shared indexes with an mtime older than one week (by
>>     default), when we create a new shared index file. The new
>>     "splitIndex.sharedIndexExpire" config option lets people tweak
>>     this grace period.
>
> I do not quite understand this justification.  Doesn't each of the
> "this hold only changes since the base index file" files have a
> backpointer that names the base index file it is a delta against?

Yes, but the shared file does not have pointers back to all the files
that need it, and there could be more than one. We know one of them,
$GIT_DIR/index, and possibly $GIT_DIR/index.lock too. But for index
files that people generate manually and refer to with $GIT_INDEX_FILE,
we can't know where they are.

> Is it safe to remove a base index file when there is still a split
> index file that points at it?
>
> IOW, I do not see why it can be safe for the expiration decision to
> be based on timestamp (I would understand it if it were based on a
> refcnt, though).

The problem is that we can't maintain these ref counts cheaply and
simply. We don't want to update the sharedindex file every time
somebody starts (or stops) referencing it, because that defeats the
purpose of splitting it out and not touching it any more. Adding a
separate file for the ref count could work, but it gets complex,
especially when we think about race conditions at update time.

Timestamps allow us to say: OK, this base index file has not been read
by anybody for N+ hours (or better, days), so it's most likely not
referenced by any temporary index files (including
$GIT_DIR/index.lock) anymore, because those files, by the definition of
"temporary", must be gone by now. We should definitely check and make
sure the file that $GIT_DIR/index points to still exists. I'm only
going to read the series now, so I don't know whether it already does
that.

It will probably be harder to handle the race condition when updating
$GIT_DIR/index, which could be avoided by a sufficiently long grace
period with timestamps.
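
To make the timestamp idea concrete, here is a rough sketch in Python
(not the series' actual C code; I have not checked how the patches do
it). The helper names, the one-week default and the assumption that
shared index files sit directly under $GIT_DIR as "sharedindex.*" are
mine, for illustration only. The "in_use" set stands for the shared
index that $GIT_DIR/index currently points to, which must survive no
matter how old its mtime is.

```python
import os
import time
from pathlib import Path
from typing import List, Optional, Set

# Hypothetical default grace period: one week, in seconds.
ONE_WEEK = 7 * 24 * 60 * 60


def touch_shared_index(path: Path, now: Optional[float] = None) -> None:
    """Refresh the mtime to record that a split index started using this file."""
    now = time.time() if now is None else now
    os.utime(path, (now, now))


def expire_shared_indexes(git_dir: Path, in_use: Set[str],
                          grace: float = ONE_WEEK,
                          now: Optional[float] = None) -> List[str]:
    """Delete sharedindex.* files whose mtime is older than the grace period.

    Files named in `in_use` are kept regardless of age. Returns the names
    of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(git_dir.glob("sharedindex.*")):
        if path.name in in_use:
            continue  # still referenced by $GIT_DIR/index, never delete
        if now - path.stat().st_mtime > grace:
            path.unlink()
            removed.append(path.name)
    return removed
```

The point of the sketch is that nothing is ever written into the shared
index itself: using it only bumps its mtime, and cleanup only needs a
directory scan at the time a new shared index is created.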
-- 
Duy


