Re: [GSoC] Designing a faster index format

On Fri, Apr 6, 2012 at 10:23, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote:
> On Sat, Apr 7, 2012 at 12:13 AM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>> On Fri, Apr 6, 2012 at 08:44, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote:
>>> On Fri, Apr 6, 2012 at 10:24 PM, Thomas Rast <trast@xxxxxxxxxxxxxxx> wrote:
>>>> But even so: do we make any promises that (say) git-add is atomic in the
>>>> sense that a reader always gets the before-update results or the
>>>> after-update results?  Non-builtins (e.g. git add -p) may make small
>>>> incremental updates to the index, so they wouldn't be atomic anyway.
>>>
>>> Take git-checkout. I'm ok with it writing to worktree all old entries,
>>> or all new ones, but please not a mix.
>>
>> Why, what is the big deal? git-checkout has already written the file
>> to the local working tree. Its now just reflecting the updated stat
>> information in the index. If it does that after each file was touched,
>> and it aborts, you still have a partially updated working tree and the
>> index will show some updated files as stat clean, but staged relative
>> to HEAD. I don't think that is any better or worse than the current
>> situation where the working tree is shown as locally dirty but the
>> index has no staged files. Either way you have an aborted checkout to
>> recover from by retrying, or git reset --hard HEAD.
>>
>> In the retry case, checkout actually has less to do because the files
>> it already cleanly updated match where its going, and thus it doesn't
>> have to touch them again.
>
> OK, what about git-commit? If I read your description correctly, you
> can update entry sha-1 in place  too.

Yes.

> Running cache-tree on half old
> half new index definitely creates a broken commit.

How is that possible? Each tree also has its own SHA-1 field. A
process trying to update a tree's SHA-1 has to snapshot the tree's
contents from the index by copying the data into its own memory
buffer, so it can compute the canonical tree data, write the object
to the repository, and get the tree's SHA-1. It then writes that
tree's SHA-1 back to the index as of that snapshot. If there are
concurrent updates while git commit is running, it's the same race
condition that already exists. You don't know exactly where in the
execution of `git commit` it opens the index file and takes the
snapshot it uses to make the commit. Allowing in-place updates just
means the snapshot window within git commit expands to a larger
portion of its running time.
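
To make that snapshot step concrete, here is a rough C sketch of the
sequence I mean; the entry layout and the format_tree() /
write_tree_object() helpers are made up for the example, not git's
real cache-tree code:

#include <stdlib.h>
#include <string.h>

/* Illustrative entry layout and helpers; not git's real structures. */
struct idx_entry {
	char path[64];
	unsigned int mode;
	unsigned char sha1[20];
};

extern size_t format_tree(const struct idx_entry *entries, size_t n,
			  char *out, size_t outlen);
extern void write_tree_object(const char *buf, size_t len,
			      unsigned char result_sha1[20]);

int update_cache_tree(const struct idx_entry *live, size_t n,
		      unsigned char tree_sha1[20])
{
	/* 1. Snapshot: copy the live entries into private memory so a
	 *    concurrent in-place writer cannot change them under us.  */
	struct idx_entry *snap = malloc(n * sizeof(*snap));
	char buf[65536];
	size_t len;

	if (!snap)
		return -1;
	memcpy(snap, live, n * sizeof(*snap));

	/* 2. Build the canonical tree data from the snapshot only.    */
	len = format_tree(snap, n, buf, sizeof(buf));

	/* 3. Write the object; the resulting SHA-1 describes that
	 *    snapshot, and only it goes back into the index field.    */
	write_tree_object(buf, len, tree_sha1);

	free(snap);
	return 0;
}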

Basically I would argue it is already not safe to modify the index
while git commit is running. You don't know if git commit has
already opened the index file, or will open it after your edit. The
only way to be sure right now is to make your own copy of the index
and use the GIT_INDEX_FILE environment variable to make sure git
commit uses the exact index you want.
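
As a rough example of that private-copy approach (the snapshot path
and commit message here are just illustrative, and error handling is
minimal):

#include <stdio.h>
#include <stdlib.h>

/* Copy the live index to a private snapshot, then let git commit
 * read only that snapshot via GIT_INDEX_FILE. */
int main(void)
{
	FILE *src = fopen(".git/index", "rb");
	FILE *dst = fopen(".git/index.snapshot", "wb");
	char buf[8192];
	size_t n;

	if (!src || !dst)
		return 1;
	while ((n = fread(buf, 1, sizeof(buf), src)) > 0)
		fwrite(buf, 1, n, dst);
	fclose(src);
	fclose(dst);

	/* From here on, git commit sees only the snapshot, never the
	 * live index that other processes may still be updating.     */
	setenv("GIT_INDEX_FILE", ".git/index.snapshot", 1);
	return system("git commit -m \"commit from snapshot\"");
}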

> A command can also read (which does not require lock), update its
> internal index, then lock and write. At that time, it may accidentally
> overwrite whatever another command wrote while it was still preparing
> the index in memory.

This hypothetical command already has the bug you mention. It should
be fixed no matter what we do with the index format.

The *only* safe way to update the index and avoid losing
modifications made by another process is to lock the index *then* read
it, update, and write it back out. If you read before you take the
write lock, you can clobber edits made by another process. This is
precisely the reason why the JGit library always opens, reads, then
closes the index anytime the process wants to access an entry: we need
to make sure we are viewing the correct current version. It's even more
critical when the process wants to update the index; it *must* discard
any in-memory cached data it has and re-read the index *after* the
write lock has been successfully acquired.
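
A minimal sketch of that ordering, using the usual .git/index.lock
dotlock convention; the read/apply/write helpers are placeholders
standing in for the real parse/modify/serialize steps, not git's
actual functions:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

extern int read_index_from(const char *path);
extern int apply_changes(void);
extern int write_index_to(int fd);

int update_index_locked(void)
{
	/* Take the dotlock *first*: O_CREAT|O_EXCL fails if another
	 * writer already holds .git/index.lock.                      */
	int lock_fd = open(".git/index.lock",
			   O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (lock_fd < 0)
		return errno == EEXIST ? -1 : -2; /* busy vs. real error */

	/* Only now read the index: every edit committed before we
	 * locked is visible, and no new writer can start meanwhile.  */
	if (read_index_from(".git/index") < 0 ||
	    apply_changes() < 0 ||
	    write_index_to(lock_fd) < 0) {
		close(lock_fd);
		unlink(".git/index.lock");
		return -2;
	}

	close(lock_fd);
	/* Atomically publish the new index over the old one. */
	return rename(".git/index.lock", ".git/index");
}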


IMHO the update in place approach carries a few risks, but none of
them is really a problem:

*  Readers *must* use the retry algorithm when looking at a record
anytime the CRC-32 on an individual entry doesn't match. Retry
requires some form of backoff, because the concurrent writer needs
time to finish its writes to the storage file. If a reader doesn't
correctly implement the retry, it could see corruption. (A sketch of
the reader-side retry and CRC check follows after this list.)

*  Readers *must* check the CRC-32 of any entry. In fact the best way
to read an entry is to memcpy() the entry's stat/SHA-1/CRC-32 from the
index into another memory buffer, compute the checksum there, and
compare. This way the reader can be certain the entry wasn't mutated
after it checked the CRC-32 but before it examined a particular stat
field. Again, a buggy implementation reading from the index might skip
this strategy and either complain about corruption or silently process
corrupt data.

*  A partial write will leave a corrupted index. E.g. a process
writing a record is killed before it has a chance to fully write out
the record's data. Nobody can read that record until it is repaired.
Repair should be possible with a combination of git reset --soft to
copy the SHA-1 from HEAD and recomputing the working tree's SHA-1 to
see if the file is really clean or not. It probably isn't, and the
stat data will reflect it as dirty after the repair. We may have to
put this sort of repair logic into `git status` and `git diff` as part
of the normal "fix clean stat" pass.

*  Appending conflicting stage information to the end of the file
during a merge can be risky. The append might be partial. The user can
recover with `git reset --hard HEAD` to abort the merge. A partial
append is only likely when git merge itself aborted, in which case it
hasn't really left you with a sane state to resolve conflicts in
anyway.

*  Truncating away the conflicting stage information at the end of the
file can be risky if the file system doesn't truncate back correctly.
But I think we can detect this and repair it. If every record has its
"conflict" bit set to 0 and all records' CRC-32s are valid, and we hold
the write lock, we know any conflict data at the end is bogus and
should be truncated away, so we truncate again. If truncation isn't
working correctly on this filesystem, we rewrite the entire index
file.
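
To illustrate the first two points above (retry with backoff, and
checking the CRC-32 on a private copy rather than on the shared
mapping), here is a rough reader-side sketch; the record layout and
the crc32_of() helper are assumptions, not the actual on-disk format:

#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Assumed record layout; the CRC-32 covers every field above it. */
struct idx_record {
	unsigned char sha1[20];
	uint32_t mtime_sec;
	uint32_t size;
	uint32_t crc32;
};
extern uint32_t crc32_of(const void *buf, size_t len);

/*
 * Copy the shared record into private memory, verify the CRC-32 on
 * the private copy, and retry with exponential backoff if a
 * concurrent writer was caught mid-update. Returns 0 on a verified
 * copy, -1 if the record still looks corrupt after all retries
 * (possible partial write that needs repair).
 */
int read_record(const volatile struct idx_record *shared,
		struct idx_record *out)
{
	for (int attempt = 0; attempt < 5; attempt++) {
		memcpy(out, (const void *)shared, sizeof(*out));

		if (crc32_of(out, offsetof(struct idx_record, crc32)) ==
		    out->crc32)
			return 0;       /* stable, verified snapshot */

		/* The writer may still be mid-store; back off, retry. */
		usleep(1000u << attempt);
	}
	return -1;
}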