[RFC PATCH] diff: only prefetch for certain output formats

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Running "git status" on a partial clone that was cloned with
"--no-checkout", like this:

  git clone --no-checkout --filter=blob:none <url> foo
  git -C foo status

results in an unnecessary lazy fetch. This is because of commit
7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08), which
optimized "diff" by prefetching, but inadvertently affected at least one
command ("status") that does not need prefetching. These 2 cases can be
distinguished by looking at output_format; the former uses
DIFF_FORMAT_PATCH and the latter uses DIFF_FORMAT_CALLBACK.

Therefore, restrict prefetching only to output_format values like the
one used by "diff".

(Note that other callers that use DIFF_FORMAT_CALLBACK will also lose
prefetching. I haven't investigated enough to see if this is a net
benefit or drawback, but I think this will need to be done on a
caller-by-caller basis and can be done in the future.)

Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
---
Points of discussion I can think of:

 - Is the whitelist of output_format constants the best?
 - Should we just have the callers pass a prefetch boolean arg instead
   of trying to guess it ourselves? I'm leaning towards no since I think
   we should avoid options unless they are necessary.

Perhaps there are others.
---
 diff.c                   | 10 +++++++++-
 t/t0410-partial-clone.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 8e2914c031..8263491783 100644
--- a/diff.c
+++ b/diff.c
@@ -6504,7 +6504,15 @@ static void add_if_missing(struct repository *r,
 
 void diffcore_std(struct diff_options *options)
 {
-	if (options->repo == the_repository && has_promisor_remote()) {
+	int output_formats_to_prefetch = DIFF_FORMAT_RAW |
+		DIFF_FORMAT_DIFFSTAT |
+		DIFF_FORMAT_NUMSTAT |
+		DIFF_FORMAT_SUMMARY |
+		DIFF_FORMAT_PATCH |
+		DIFF_FORMAT_SHORTSTAT |
+		DIFF_FORMAT_DIRSTAT;
+	if ((options->output_format & output_formats_to_prefetch) &&
+	    options->repo == the_repository && has_promisor_remote()) {
 		/*
 		 * Prefetch the diff pairs that are about to be flushed.
 		 */
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index a3988bd4b8..d72631a3a0 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -567,6 +567,19 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
 	git -C repo cherry-pick side1
 '
 
+test_expect_success 'do not fetch when running status against --no-checkout partial clone' '
+	rm -rf server client trace &&
+	test_create_repo server &&
+	test_commit -C server one &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	git clone --no-checkout --filter=blob:none "file://$(pwd)/server" client &&
+
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client status &&
+	! test_path_exists trace
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.25.0.341.g760bfbb309-goog




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux