Re: git bugs

"Ben Lynn" <benlynn@xxxxxxxxx> · Wed, 11 Jun 2008 11:48:10 -0700

> So we don't want to smudge it, but if the stat information says it might
> match even though it doesn't, we have to. But if the stat information says
> it matches, and the data actually _does_ match, then we shouldn't smudge
> it, we should be happy - and all subsequent users of the index will then
> know that they don't even need to look at the file contents.

I understand. In that case, what about an unsmudging routine so we can
have the best of both worlds? We unconditionally smudge the file as
soon as timestamp == mtime is detected. We never do index-wide smudging
on writes; instead, on an index read of a particular file, we do this:

  if (stats_differ()) {
    if (hash_matches()) {
      // Aha! An unconditionally smudged file that we might be able to
      // unsmudge, so future reads can avoid this check.
      if (mtime < timestamp) {
        fix_stats();  // Involves writing to index.
      } else {
        // D'oh! This file could still be racy, leave it smudged.
      }
    } else {
      // The stats were right, the hash does differ.
      ...
    }
  }

This minimizes the number of ce_check_modified_fs() calls for any given
sequence of index operations. Instead of doing index-wide checks on
every write, we only check when we have no choice, i.e. the first time
a particular file is looked up in the index. We fix its stats if
possible so future reads can avoid the painful check. The only
drawback is that I'm not sure how acceptable it is to write to the
index on a read operation. Is this a big deal? (If it were a separate
flag, I'm sure no one would mind!)
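
To make the idea concrete, here is a rough, self-contained C sketch of
the scheme above. It is not git's code: struct entry, struct file_state,
record(), lookup() and the result names are all made up for
illustration, the hash is a plain integer stand-in, and the smudge is
kept in a separate flag (the variant mentioned in the parenthetical)
rather than encoded in the stat data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified index entry; NOT git's struct cache_entry. */
struct entry {
    time_t mtime;     /* file mtime recorded in the index */
    size_t size;      /* file size recorded in the index */
    unsigned hash;    /* stand-in for the recorded content hash */
    bool smudged;     /* assumption: smudge kept as a separate flag */
};

/* Hypothetical snapshot of the working-tree file. */
struct file_state {
    time_t mtime;
    size_t size;
    unsigned hash;    /* stand-in for hashing the file contents */
};

enum lookup_result { CLEAN, CLEAN_UNSMUDGED, CLEAN_STILL_RACY, MODIFIED };

/* Write side: smudge a single entry as soon as timestamp == mtime is
 * detected. No index-wide pass is involved. */
void record(struct entry *e, const struct file_state *f,
            time_t index_timestamp)
{
    assert(e && f);
    e->mtime = f->mtime;
    e->size = f->size;
    e->hash = f->hash;
    e->smudged = (f->mtime == index_timestamp);
}

/* Read side: the per-file check from the pseudocode. Sets *index_dirty
 * when the entry was unsmudged and the index needs to be rewritten. */
enum lookup_result lookup(struct entry *e, const struct file_state *f,
                          time_t index_timestamp, bool *index_dirty)
{
    assert(e && f && index_dirty);

    bool stats_differ = e->smudged ||
                        e->mtime != f->mtime || e->size != f->size;
    if (!stats_differ)
        return CLEAN;               /* trust the stats, no hashing */

    if (e->hash != f->hash)
        return MODIFIED;            /* the stats were right */

    if (f->mtime < index_timestamp) {
        /* fix_stats(): safe to unsmudge, involves writing the index. */
        e->mtime = f->mtime;
        e->size = f->size;
        e->smudged = false;
        *index_dirty = true;
        return CLEAN_UNSMUDGED;
    }
    return CLEAN_STILL_RACY;        /* leave it smudged */
}
```

In this sketch the expensive comparison (the hash) only runs when the
stats differ or the entry is smudged, and a successful unsmudge means
the next lookup of the same file returns on the cheap stat path.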

-Ben