Re: [PATCH v7 03/12] update-index: add a new --force-write-index option

On 9/20/2017 1:47 AM, Junio C Hamano wrote:
> Ben Peart <benpeart@xxxxxxxxxxxxx> writes:
>
>> +		OPT_SET_INT(0, "force-write-index", &force_write,
>> +			N_("write out the index even if is not flagged as changed"), 1),
>
> Hmph.  The only time this makes difference is when the code forgets
> to mark active_cache_changed even when it actually made a change to
> the index, no?  I do understand the wish to be able to observe what
> _would_ be written if such a bug did not exist in order to debug the
> other aspects of the change in this series, but at the same time I
> fear that we may end up sweeping the problem under the rug by
> running the tests with this option.


This is to enable a performance optimization I discovered while perf testing the patch series. It lets us do a lazy index write for fsmonitor-detected changes while still always generating correct results.

Let's see how my ASCII art skills do at describing this:

1) Index marked dirty on every fsmonitor change:
A---x---B---y---C

2) Index *not* marked dirty on fsmonitor changes:
A---x---B---x,y---C

Assume the index is written and up-to-date at point A.

In scenario #1 above, the index is marked fsmonitor dirty every time the fsmonitor hook reports a modified file. At point B, the fsmonitor integration script returns that file 'x' has been modified since A; the index is marked dirty and then written to disk with a last_update time of B. At point C, the script returns 'y' as the change since point B; the index is marked dirty and written to disk again.

In scenario #2, the index is *not* marked fsmonitor dirty when changes are detected. At point B, the script returns 'x', but the index is not flagged dirty nor written to disk. At point C, the script returns both 'x' and 'y' (since both have been changed since time 'A'), and again the index is not marked dirty nor written to disk.

Correct results are generated in both scenarios, but scenario #2 needs two fewer index writes. In short, the changed files simply accumulate: the cost of processing two files at point C (vs one) makes no measurable difference in perf, but avoiding two unnecessary index writes is a significant savings (especially when the index gets large).

There is no real concern about accumulating too many changes, as 1) the processing cost for additional modified files is fairly trivial and 2) the index ends up getting written out pretty frequently anyway as files are added/removed/staged/etc., which updates the fsmonitor_last_update time.

The challenge came when it was time to test that the changes to the index were correct. Since they are lazily written by default, I needed a way to force the write so that I could verify the index on disk was correct. Hence, this patch.


  		OPT_END()
  	};
@@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
  		die("BUG: bad untracked_cache value: %d", untracked_cache);
  	}
-	if (active_cache_changed) {
+	if (active_cache_changed || force_write) {
  		if (newfd < 0) {
  			if (refresh_args.flags & REFRESH_QUIET)
  				exit(128);


