[PATCH v4 0/6] Fix the racy split index problem

SZEDER Gábor <szeder.dev@xxxxxxxxx> · Thu, 11 Oct 2018 11:43:03 +0200

Fourth and hopefully final round of fixing occasional test failures when
run with 'GIT_TEST_SPLIT_INDEX=yes'.  The only code change is the
extraction of a helper function to compare two cache entries' content,
and then a couple of minor log message clarifications.  The range-diff
below is rather clear on that.

I will send a 7/6 follow-up patch shortly as well.

SZEDER Gábor (6):
  t1700-split-index: document why FSMONITOR is disabled in this test
    script
  split-index: add tests to demonstrate the racy split index problem
  t1700-split-index: date back files to avoid racy situations
  split-index: count the number of deleted entries
  split-index: don't compare cached data of entries already marked for
    split index
  split-index: smudge and add racily clean cache entries to split index

 cache.h                     |   2 +
 read-cache.c                |   2 +-
 split-index.c               | 131 +++++++++++++++++++---
 t/t1700-split-index.sh      |  52 +++++----
 t/t1701-racy-split-index.sh | 214 ++++++++++++++++++++++++++++++++++++
 5 files changed, 361 insertions(+), 40 deletions(-)
 create mode 100755 t/t1701-racy-split-index.sh

Range-diff:
1:  ba2b1bdf16 = 1:  ba2b1bdf16 t1700-split-index: document why FSMONITOR is disabled in this test script
2:  bf1b038f10 ! 2:  c7cb9d9115 split-index: add tests to demonstrate the racy split index problem
    @@ -136,13 +136,20 @@
              git commands will then erroneously consider the file clean.

         Note that in the last two 'test_expect_failure' cases I omitted the
    -    '#' (as in nr. of trial) from the tests' name on purpose for now, as
    -    it confuses 'prove' into thinking that those tests failed
    -    unexpectedly.
    +    '#' (as in nr. of trial) from the tests' description on purpose for
    +    now, as it breakes the TAP output [2]; it will be added at the end of
    +    the series, when those two tests will be flipped to
    +    'test_expect_success'.

         [1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge
             branch 'nd/split-index', 2014-07-16).

    +    [2] In the TAP output a '#' should separate the test's description
    +        from the TODO directive emitted by 'test_expect_failure'.  The
    +        additional '#' in "#$trial" interferes with this, the test harness
    +        won't recognize the TODO directive, and will report that those
    +        tests failed unexpectedly.
    +
         Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>

      diff --git a/t/t1701-racy-split-index.sh b/t/t1701-racy-split-index.sh
3:  e7f7fb6e2d = 3:  ce803d8064 t1700-split-index: date back files to avoid racy situations
4:  6dc0b01ad3 = 4:  1d12d718d1 split-index: count the number of deleted entries
5:  9c420f9c66 ! 5:  0dd448c707 split-index: don't compare stat data of entries already marked for split index
    @@ -1,6 +1,6 @@
     Author: SZEDER Gábor <szeder.dev@xxxxxxxxx>

    -    split-index: don't compare stat data of entries already marked for split index
    +    split-index: don't compare cached data of entries already marked for split index

         When unpack_trees() constructs a new index, it copies cache entries
         from the original index [1].  prepare_to_write_split_index() has to
    @@ -20,7 +20,9 @@

         So modify prepare_to_write_split_index() to check the copied cache
         entries' CE_UPDATE_IN_BASE flag first, and skip the thorough
    -    comparison of cached data if the flag is already set.
    +    comparison of cached data if the flag is already set.  Those couple of
    +    lines comparing the cached data would then have too many levels of
    +    indentation, so extract them into a helper function.

         Note that comparing the cached data in copied and original entries in
         the shared index might actually be entirely unnecessary.  In theory
    @@ -62,6 +64,37 @@
      diff --git a/split-index.c b/split-index.c
      --- a/split-index.c
      +++ b/split-index.c
    +@@
    + 	si->saved_cache_nr = 0;
    + }
    + 
    ++/*
    ++ * Compare most of the fields in two cache entries, i.e. all except the
    ++ * hashmap_entry and the name.
    ++ */
    ++static int compare_ce_content(struct cache_entry *a, struct cache_entry *b)
    ++{
    ++	const unsigned int ondisk_flags = CE_STAGEMASK | CE_VALID |
    ++					  CE_EXTENDED_FLAGS;
    ++	unsigned int ce_flags = a->ce_flags;
    ++	unsigned int base_flags = b->ce_flags;
    ++	int ret;
    ++
    ++	/* only on-disk flags matter */
    ++	a->ce_flags &= ondisk_flags;
    ++	b->ce_flags &= ondisk_flags;
    ++	ret = memcmp(&a->ce_stat_data, &b->ce_stat_data,
    ++		     offsetof(struct cache_entry, name) -
    ++		     offsetof(struct cache_entry, ce_stat_data));
    ++	a->ce_flags = ce_flags;
    ++	b->ce_flags = base_flags;
    ++
    ++	return ret;
    ++}
    ++
    + void prepare_to_write_split_index(struct index_state *istate)
    + {
    + 	struct split_index *si = init_split_index(istate);
     @@
      		 */
      		for (i = 0; i < istate->cache_nr; i++) {
    @@ -137,21 +170,7 @@
     +				 * code paths modifying the cached data do
     +				 * set CE_UPDATE_IN_BASE as well.
     +				 */
    -+				const unsigned int ondisk_flags =
    -+					CE_STAGEMASK | CE_VALID |
    -+					CE_EXTENDED_FLAGS;
    -+				unsigned int ce_flags, base_flags, ret;
    -+				ce_flags = ce->ce_flags;
    -+				base_flags = base->ce_flags;
    -+				/* only on-disk flags matter */
    -+				ce->ce_flags   &= ondisk_flags;
    -+				base->ce_flags &= ondisk_flags;
    -+				ret = memcmp(&ce->ce_stat_data, &base->ce_stat_data,
    -+					     offsetof(struct cache_entry, name) -
    -+					     offsetof(struct cache_entry, ce_stat_data));
    -+				ce->ce_flags = ce_flags;
    -+				base->ce_flags = base_flags;
    -+				if (ret)
    ++				if (compare_ce_content(ce, base))
     +					ce->ce_flags |= CE_UPDATE_IN_BASE;
     +			}
      			discard_cache_entry(base);
6:  52c755f210 ! 6:  384b440345 split-index: smudge and add racily clean cache entries to split index
    @@ -46,6 +46,11 @@
         racily clean cache entries as well, and will then write them with
         smudged stat data to the new split index.

    +    This change makes all tests in 't1701-racy-split-index.sh' pass, so
    +    flip the two 'test_expect_failure' tests to success.  Also add the '#'
    +    (as in nr. of trial) to those tests' description that were omitted
    +    when the tests expected failure.
    +
         Note that after this change if the index is split when it contains a
         racily clean cache entry, then a smudged cache entry will be written
         both to the new shared and to the new split indexes.  This doesn't
-- 
2.19.1.465.gaff195083f