Re: [PATCH 3/3] git-fetch: avoid local fetching from alternate (again)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Shawn O. Pearce" <spearce@xxxxxxxxxxx> wrote:
> Junio C Hamano <gitster@xxxxxxxxx> wrote:
> > "Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:
> > 
> > > I'm starting to suspect heap corruption again in builtin-fetch.
> > > This patch alters the malloc() calls we are doing and may be shifting
> > > something around just enough in memory to cause a data overwrite or
> > > something and that's why this tag just drops out of the linked list?
> > > But then why does that happen in the test suite but not outside.
> > > Maybe because the test suite is setting environment variables that
> > > I'm not and the impact of those combined with these additional
> > > mallocs is what is breaking it?  *sigh*
> 
> Found it.  This ain't pretty.  Remember, what's failing in the test
> suite is we aren't getting "tag-three-file" automatically followed
> during fetch.  This is an annotated tag referring to a blob.

Here's one possible way of fixing this.  I think what we want is to
include "--not --all" when we run the object listing immediately
after the fetch so we mark *only* those objects we just fetched
and ignore the ones we already had reachable through existing refs.
This makes the walking cost proportional to the size of the fetch
and not the size of the object database.

Unfortunately when you insert "--not" in front of "--all" in the args
array below the test vectors all fail again.  Apparently we already
have the blob that "tag-three-file" refers to so its not something
we walk over during this process and the tag isn't followed.
That happens because the test repository we are fetching into
already has a number of objects through its refs/remotes/origin/*
namespace created by git-clone and the blob is already considered
to be reachable.

I'm not formatting this as a real patch because I'm not yet sure this
is the right way to fix the automatic tag following code.  Or the
right way to abuse the revision walking machinary within git-fetch
to implement such a feature.  Comments would be most appreciated.  :)


--8>--
diff --git a/builtin-fetch.c b/builtin-fetch.c
index 847db73..77f1901 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -8,6 +8,9 @@
 #include "path-list.h"
 #include "remote.h"
 #include "transport.h"
+#include "diff.h"
+#include "list-objects.h"
+#include "revision.h"
 
 static const char fetch_usage[] = "git-fetch [-a | --append] [--upload-pack <upload-pack>] [-f | --force] [--no-tags] [-t | --tags] [-k | --keep] [-u | --update-head-ok] [--depth <depth>] [-v | --verbose] [<repository> <refspec>...]";
 
@@ -335,11 +338,43 @@ static void store_updated_refs(const char *url, struct ref *ref_map)
 	fclose(fp);
 }
 
+static void mark_just_fetched_commit(struct commit *commit)
+{
+}
+
+static void mark_just_fetched_object(struct object_array_entry *p)
+{
+	if (p->item->type == OBJ_BLOB && !has_sha1_file(p->item->sha1))
+		die("missing blob object '%s'", sha1_to_hex(p->item->sha1));
+}
+
 static int fetch_refs(struct transport *transport, struct ref *ref_map)
 {
 	int ret = transport_fetch_refs(transport, ref_map);
-	if (!ret)
+	if (!ret) {
+		const char *args[] = { "rev-list", "--objects", "--all", NULL};
+		struct rev_info revs;
+		struct ref *ref;
+
+		init_revisions(&revs, NULL);
+		setup_revisions(ARRAY_SIZE(args) - 1, args, &revs, NULL);
+		save_commit_buffer = 0;
+		track_object_refs = 0;
+
+		for (ref = ref_map; ref; ref = ref->next) {
+			struct object *o = parse_object(ref->old_sha1);
+			if (o)
+				add_pending_object(&revs, o, "(just-fetched)");
+		}
+
+		prepare_revision_walk(&revs);
+		mark_edges_uninteresting(revs.commits, &revs, NULL);
+		traverse_commit_list(&revs,
+			mark_just_fetched_commit,
+			mark_just_fetched_object);
+
 		store_updated_refs(transport->url, ref_map);
+	}
 	transport_unlock_pack(transport);
 	return ret;
 }
@@ -365,6 +400,7 @@ static struct ref *find_non_local_tags(struct transport *transport,
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
 	const struct ref *ref;
+	struct object *obj;
 
 	for_each_ref(add_existing, &existing_refs);
 	for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) {
@@ -389,7 +425,8 @@ static struct ref *find_non_local_tags(struct transport *transport,
 
 		if (!path_list_has_path(&existing_refs, ref_name) &&
 		    !path_list_has_path(&new_refs, ref_name) &&
-		    lookup_object(ref->old_sha1)) {
+		    (obj = lookup_object(ref->old_sha1)) &&
+		    obj->flags & SEEN) {
 			path_list_insert(ref_name, &new_refs);
 
 			rm = alloc_ref(strlen(ref_name) + 1);
diff --git a/list-objects.c b/list-objects.c
index e5c88c2..78bf04c 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -170,4 +170,5 @@ void traverse_commit_list(struct rev_info *revs,
 	}
 	for (i = 0; i < objects.nr; i++)
 		show_object(&objects.objects[i]);
+	free(objects.objects);
 }

-- 
Shawn.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux