Re: [PATCH v2 5/5] sha1_file: support loading lazy objects

On 7/31/2017 5:02 PM, Jonathan Tan wrote:
Teach sha1_file to invoke the command configured in
extensions.lazyObject whenever an object is requested and unavailable.

The usage of the hook can be suppressed through a flag when invoking
has_object_file_with_flags() and other similar functions.

This is meant as a temporary measure to ensure that all Git commands
work in such a situation. Future patches will update some commands to
either tolerate missing objects (without invoking the command) or be
more efficient in invoking this command.

To prevent fetch from downloading all missing objects, you will also need to add logic to check_connected(). The simplest approach is to return 0 if repository_format_lazy_object is set.

/*
 * When running with lazy_objects, there will be objects that are
 * missing locally and we don't want to download a bunch of
 * commits, trees, and blobs just to make sure everything is
 * reachable locally, so this option will skip the reachability
 * checks below that use rev-list.  This will stop the check
 * before uploadpack runs to determine if there is anything to
 * fetch.  Returning zero for the first check will also prevent the
 * uploadpack from happening.  It will also skip the check after
 * the fetch is finished to make sure all the objects were
 * downloaded in the pack file.  This will allow the fetch to
 * run and get all the latest tip commit ids for all the branches
 * in the fetch but not pull down commits, trees, or blobs via
 * upload pack.
 */
if (repository_format_lazy_object)
	return 0;

[...]
diff --git a/sha1_file.c b/sha1_file.c
index b60ae15f7..1785c61d8 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -28,6 +28,11 @@
  #include "list.h"
  #include "mergesort.h"
  #include "quote.h"
+#include "iterator.h"
+#include "dir-iterator.h"
+#include "sha1-lookup.h"
+#include "lazy-object.h"
+#include "sha1-array.h"
  #define SZ_FMT PRIuMAX
  static inline uintmax_t sz_fmt(size_t s) { return s; }
@@ -2984,6 +2989,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
  	const unsigned char *real = (flags & OBJECT_INFO_LOOKUP_REPLACE) ?
  				    lookup_replace_object(sha1) :
  				    sha1;
+	int already_retried = 0;
  	if (!oi)
  		oi = &blank_oi;
@@ -3008,30 +3014,38 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
  		}
  	}
-	if (!find_pack_entry(real, &e)) {
-		/* Most likely it's a loose object. */
-		if (!sha1_loose_object_info(real, oi, flags)) {
-			oi->whence = OI_LOOSE;
-			return 0;
-		}
+retry:
+	if (find_pack_entry(real, &e))
+		goto found_packed;
-		/* Not a loose object; someone else may have just packed it. */
-		if (flags & OBJECT_INFO_QUICK) {
-			return -1;
-		} else {
-			reprepare_packed_git();
-			if (!find_pack_entry(real, &e))
-				return -1;
-		}
+	/* Most likely it's a loose object. */
+	if (!sha1_loose_object_info(real, oi, flags)) {
+		oi->whence = OI_LOOSE;
+		return 0;
  	}
+	/* Not a loose object; someone else may have just packed it. */
+	reprepare_packed_git();
+	if (find_pack_entry(real, &e))
+		goto found_packed;

Same feedback as before. I like to avoid using gotos for flow control other than in error handling.

Also, this patch loses the OBJECT_INFO_QUICK logic, which could be restored.
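
To make both points concrete, here is a rough, untested sketch of a goto-free shape for the lookup that also keeps the quick path. It is not the real sha1_file.c code: struct fake_store, lookup_object(), LOOKUP_QUICK and the helper names are hypothetical stand-ins for find_pack_entry(), sha1_loose_object_info(), reprepare_packed_git() and the lazy-object command. It also assumes that a quick lookup skips only the pack re-scan and still tries the lazy command once; whether it should skip that as well is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for OBJECT_INFO_QUICK (hypothetical name). */
#define LOOKUP_QUICK 1u

enum whence { WHENCE_NONE, WHENCE_PACKED, WHENCE_LOOSE };

/* Hypothetical in-memory model of the object store, for illustration only. */
struct fake_store {
	bool packed;                 /* visible in the packs already loaded */
	bool loose;                  /* present as a loose object */
	bool packed_after_reprepare; /* someone else packed it in the meantime */
	bool lazy_available;         /* the lazy-object command can supply it */
	int reprepare_calls;
	int lazy_calls;
};

static bool find_packed(struct fake_store *s) { return s->packed; }
static bool find_loose(struct fake_store *s) { return s->loose; }

static void reprepare_packs(struct fake_store *s)
{
	s->reprepare_calls++;
	if (s->packed_after_reprepare)
		s->packed = true;
}

static void fetch_lazy(struct fake_store *s)
{
	s->lazy_calls++;
	if (s->lazy_available)
		s->loose = true;
}

/* At most two passes: the second one runs only after the lazy fetch. */
enum whence lookup_object(struct fake_store *s, unsigned flags)
{
	int attempt;

	assert(s != NULL);
	for (attempt = 0; attempt < 2; attempt++) {
		if (find_packed(s))
			return WHENCE_PACKED;
		/* Most likely it's a loose object. */
		if (find_loose(s))
			return WHENCE_LOOSE;
		/* Not loose; someone else may have just packed it. */
		if (!(flags & LOOKUP_QUICK)) {
			reprepare_packs(s);
			if (find_packed(s))
				return WHENCE_PACKED;
		}
		/* Ask the lazy-object command once, then look again. */
		if (attempt == 0)
			fetch_lazy(s);
	}
	return WHENCE_NONE;
}
```

The bounded loop plays the role of the already_retried flag, and the flag test in front of the re-scan is where the OBJECT_INFO_QUICK check would go.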

[...]

diff --git a/t/t0410/lazy-object b/t/t0410/lazy-object
new file mode 100755
index 000000000..4f4a9c38a
--- /dev/null
+++ b/t/t0410/lazy-object
@@ -0,0 +1,102 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git lazyObject protocol version 1. See
+# the documentation for extensions.lazyObject in
+# Documentation/technical/repository-version.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# Please note, this sample is a minimal skeleton. No proper error handling
+# was implemented.
+
+use strict;
+use warnings;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = $ARGV[0];
+

At some point, this should be based on the refactored pkt_* functions currently contained in the ObjectDB patch series.
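
For reference, the framing that the Perl helpers in this script implement is small: four lowercase hex digits giving the total length (header included), then the payload, with "0000" acting as a flush packet. A minimal C sketch of that framing follows. The names pkt_encode(), pkt_decode(), PKT_FLUSH and PKT_ERROR are made up for illustration and are not the pkt_* functions from the ObjectDB series. The 0xffff cap is just what four hex digits can express, and rejecting empty payloads is a choice made here so that encode and decode stay symmetric.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PKT_HEADER_LEN 4
#define PKT_FLUSH (-1) /* "0000" was read */
#define PKT_ERROR (-2) /* malformed or truncated packet */

/*
 * Write one packet into out and NUL-terminate it. Returns the packet
 * length (header included), or -1 if the payload is empty, too large
 * for a four-digit header, or does not fit in out.
 */
int pkt_encode(char *out, size_t out_size, const char *payload, size_t len)
{
	size_t total = len + PKT_HEADER_LEN;

	assert(out != NULL && payload != NULL);
	if (len == 0 || total > 0xffff || total + 1 > out_size)
		return -1;
	snprintf(out, PKT_HEADER_LEN + 1, "%04zx", total);
	memcpy(out + PKT_HEADER_LEN, payload, len);
	out[total] = '\0';
	return (int)total;
}

/*
 * Parse one packet from in. On success, point *payload at the data and
 * return its length; return PKT_FLUSH for "0000" and PKT_ERROR otherwise.
 */
int pkt_decode(const char *in, size_t in_len, const char **payload)
{
	char hdr[PKT_HEADER_LEN + 1];
	unsigned long total;
	int i;

	assert(in != NULL && payload != NULL);
	if (in_len < PKT_HEADER_LEN)
		return PKT_ERROR;
	for (i = 0; i < PKT_HEADER_LEN; i++) {
		if (!isxdigit((unsigned char)in[i]))
			return PKT_ERROR;
		hdr[i] = in[i];
	}
	hdr[PKT_HEADER_LEN] = '\0';
	total = strtoul(hdr, NULL, 16);
	if (total == 0)
		return PKT_FLUSH;
	/* Lengths 1..4 cannot carry a payload; longer than in_len is truncated. */
	if (total <= PKT_HEADER_LEN || total > in_len)
		return PKT_ERROR;
	*payload = in + PKT_HEADER_LEN;
	return (int)(total - PKT_HEADER_LEN);
}
```

With these, the "version=1" line from the handshake is framed as "000eversion=1\n": ten payload bytes plus the four-byte header.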

+sub packet_bin_read {
+	my $buffer;
+	my $bytes_read = read STDIN, $buffer, 4;
+	if ( $bytes_read == 0 ) {
+
+		# EOF - Git stopped talking to us!
+		exit();
+	}
+	elsif ( $bytes_read != 4 ) {
+		die "invalid packet: '$buffer'";
+	}
+	my $pkt_size = hex($buffer);
+	if ( $pkt_size == 0 ) {
+		return ( 1, "" );
+	}
+	elsif ( $pkt_size > 4 ) {
+		my $content_size = $pkt_size - 4;
+		$bytes_read = read STDIN, $buffer, $content_size;
+		if ( $bytes_read != $content_size ) {
+			die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
+		}
+		return ( 0, $buffer );
+	}
+	else {
+		die "invalid packet size: $pkt_size";
+	}
+}
+
+sub packet_txt_read {
+	my ( $res, $buf ) = packet_bin_read();
+	unless ( $buf =~ s/\n$// ) {
+		die "A non-binary line MUST be terminated by an LF.";
+	}
+	return ( $res, $buf );
+}
+
+sub packet_bin_write {
+	my $buf = shift;
+	print STDOUT sprintf( "%04x", length($buf) + 4 );
+	print STDOUT $buf;
+	STDOUT->flush();
+}
+
+sub packet_txt_write {
+	packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+	print STDOUT sprintf( "%04x", 0 );
+	STDOUT->flush();
+}
+
+( packet_txt_read() eq ( 0, "git-lazy-object-client" ) ) || die "bad initialize";
+( packet_txt_read() eq ( 0, "version=1" ) )		 || die "bad version";
+( packet_bin_read() eq ( 1, "" ) )                       || die "bad version end";
+
+packet_txt_write("git-lazy-object-server");
+packet_txt_write("version=1");
+packet_flush();
+
+( packet_txt_read() eq ( 0, "capability=get" ) )    || die "bad capability";
+( packet_bin_read() eq ( 1, "" ) )                  || die "bad capability end";
+
+packet_txt_write("capability=get");
+packet_flush();
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	if ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | git -c extensions.lazyobject=false hash-object -w --stdin >/dev/null 2>&1');
+		packet_txt_write(($?) ? "status=error" : "status=success");
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}



