On Thu, Jun 21, 2012 at 9:57 AM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote: > Hi Dmitry, > > Dmitry Ivankov wrote: > >> Empty branches (either new or reset-ed) have null_sha1 in fast-import >> internals. These null_sha1 heads can slip to the real commit objects. > [... nice explanation snipped ...] > > Very nice, thanks much for this. > > Would it be possible to split this into multiple independent fixes? > See [*] below for one way. > >> --- a/fast-import.c >> +++ b/fast-import.c >> @@ -2540,7 +2540,8 @@ static void file_change_deleteall(struct branch *b) >> b->num_notes = 0; >> } >> >> -static void parse_from_commit(struct branch *b, char *buf, unsigned long size) >> +static void parse_from_commit(struct branch *b, unsigned char *sha1, >> + char *buf, unsigned long size) >> { > > What is happening here? The new argument doesn't seem to be used. Ow, sorry. In fact it is here for die() messages. A bit like spaghetti code, maybe die()-s can be moved to the caller. > > [...] >> @@ -2551,29 +2552,31 @@ static void parse_from_commit(struct branch *b, char *buf, unsigned long size) >> b->branch_tree.versions[1].sha1); >> } >> >> -static void parse_from_existing(struct branch *b) >> +static void parse_from_existing(struct branch *b, unsigned char *sha1) >> { >> - if (is_null_sha1(b->sha1)) { >> + if (is_null_sha1(sha1)) { >> hashclr(b->branch_tree.versions[0].sha1); >> hashclr(b->branch_tree.versions[1].sha1); >> } else { >> unsigned long size; >> char *buf; >> >> - buf = read_object_with_reference(b->sha1, >> - commit_type, &size, b->sha1); >> + buf = read_object_with_reference(sha1, >> + commit_type, &size, sha1); > > This seems to be about delaying the effect of "from" so it doesn't > interfere with a "merge" command referring to the same commit. Kind of, parse_from_*() arrange b->sha1 and b->branch_tree to correspond the first parent of a new tip. With this patch b->sha1 is left alone to avoid the interference. Oh, yes, luckily parse_merge() doesn't need b->branch_tree-s. > > [...] >> -static int parse_from(struct branch *b) >> +static int parse_from(struct branch *b, unsigned char *sha1out) >> { >> const char *from; >> struct branch *s; >> >> - if (prefixcmp(command_buf.buf, "from ")) >> + if (prefixcmp(command_buf.buf, "from ")) { >> + hashclr(sha1out); >> return 0; >> + } > > This code path handles the case where there is no "from" after a > "reset" or "commit" command. We clear sha1out to make the calling > convention simple --- sha1out is always written to, and the caller > does not have to worry about initializing it in advance. > > I guess this is part of the change that delays the effect of "from"? > > [...] >> @@ -2586,7 +2589,10 @@ static int parse_from(struct branch *b) >> die("Can't create a branch from itself: %s", b->name); >> else if (s) { >> unsigned char *t = s->branch_tree.versions[1].sha1; >> + if (is_null_sha1(s->sha1)) >> + die("Can't create a branch from an empty branch:" >> + " %s from %s", b->name, s->name); > > This seems to be about protecting against "from" with an unborn > branch. > >> - hashcpy(b->sha1, s->sha1); >> + hashcpy(sha1out, s->sha1); > > Delaying the effect of "from", maybe. > > [...] >> @@ -2594,16 +2600,16 @@ static int parse_from(struct branch *b) >> struct object_entry *oe = find_mark(idnum); >> if (oe->type != OBJ_COMMIT) >> die("Mark :%" PRIuMAX " not a commit", idnum); >> - hashcpy(b->sha1, oe->idx.sha1); >> + hashcpy(sha1out, oe->idx.sha1); > > Delaying "from" effect? > >> if (oe->pack_id != MAX_PACK_ID) { >> unsigned long size; >> char *buf = gfi_unpack_entry(oe, &size); >> - parse_from_commit(b, buf, size); >> + parse_from_commit(b, sha1out, buf, size); > > Likewise? > >> free(buf); >> } else >> - parse_from_existing(b); >> + parse_from_existing(b, sha1out); >> - } else if (!get_sha1(from, b->sha1)) >> + } else if (!get_sha1(from, sha1out)) >> - parse_from_existing(b); >> + parse_from_existing(b, sha1out); > > Likewise? > > [...] >> else >> die("Invalid ref name or SHA1 expression: %s", from); >> >> @@ -2622,9 +2628,11 @@ static struct hash_list *parse_merge(unsigned int *count) >> from = strchr(command_buf.buf, ' ') + 1; >> n = xmalloc(sizeof(*n)); >> s = lookup_branch(from); >> - if (s) >> + if (s) { >> + if (is_null_sha1(s->sha1)) >> + die("Can't merge empty branch: %s", s->name); >> hashcpy(n->sha1, s->sha1); >> - else if (*from == ':') { >> + } else if (*from == ':') { > > Protecting against "merge" with unknown branch. > > [...] >> @@ -2656,6 +2664,7 @@ static void parse_new_commit(void) >> { >> static struct strbuf msg = STRBUF_INIT; >> struct branch *b; >> + unsigned char sha1[20]; >> char *sp; >> char *author = NULL; >> char *committer = NULL; >> @@ -2683,7 +2692,7 @@ static void parse_new_commit(void) >> die("Expected committer but didn't get one"); >> parse_data(&msg, 0, NULL); >> read_next_command(); >> - parse_from(b); >> + parse_from(b, sha1); > > Delayed "from" effect? > > [...] >> @@ -2730,7 +2739,9 @@ static void parse_new_commit(void) >> strbuf_reset(&new_data); >> strbuf_addf(&new_data, "tree %s\n", >> sha1_to_hex(b->branch_tree.versions[1].sha1)); >> - if (!is_null_sha1(b->sha1)) >> + if (!is_null_sha1(sha1)) >> + strbuf_addf(&new_data, "parent %s\n", sha1_to_hex(sha1)); >> + else if (!is_null_sha1(b->sha1)) >> strbuf_addf(&new_data, "parent %s\n", sha1_to_hex(b->sha1)); > > Delaying "from"? > > [...] >> @@ -2855,7 +2866,7 @@ static void parse_reset_branch(void) >> else >> b = new_branch(sp); >> read_next_command(); >> - parse_from(b); >> + parse_from(b, b->sha1); > > Likewise? > > [*] > So it looks like this would be easier to read as three patches: > > 1. protecting against "from" of unborn branch > 2. protecting against "merge" of unborn branch > 3. delaying the effect of "from" to avoid it confusingly changing > the effect of a "merge" in the same commit Thanks, will look into it. Without (3) we'll still have inputs asking for (2) accepted, like: commit new_tip ... from old_tip merge new_tip new_tip won't end up with null_sha1 parent written, so should be ok to postpone this case to (3). > > (1) and (2) would be no-brainers, while (3) seems more subtle --- > maybe it should be documented to help importers for other version > control systems to know to make the same change? Will add a note on this. > > [...] >> --- a/t/t9300-fast-import.sh >> +++ b/t/t9300-fast-import.sh >> @@ -2895,6 +2895,54 @@ test_expect_success 'S: merge with garbage after mark must fail' ' >> test_i18ngrep "after mark" err >> ' >> >> +test_expect_success 'S: empty branch as merge parent must fail' ' >> + test_must_fail git fast-import <<-EOF 2>err && >> + commit refs/heads/chicken >> + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE >> + data <<COMMIT >> + I am the chicken. >> + COMMIT >> + merge refs/heads/chicken >> + EOF >> + cat err && >> + test_must_fail git rev-parse --verify refs/heads/chicken^ > > There would be no "chicken" branch after this import at all, right? Oh, yes, these rev-parse are all to illustrate the reason why import must fail. With this one we can add a positive test where chicken is a born branch and import succeeds. > > [...] >> +test_expect_success 'S: empty branch as merge parent must fail (2)' ' >> + test_must_fail git fast-import <<-EOF 2>err && >> + commit refs/heads/egg1 >> + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE >> + data <<COMMIT >> + I am the egg N1. >> + COMMIT >> + >> + commit refs/heads/egg2 >> + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE >> + data <<COMMIT >> + I am the egg N2. >> + COMMIT >> + from refs/heads/egg1 >> + merge refs/heads/egg2 >> + EOF >> + cat err && >> + test_must_fail git rev-parse --verify refs/heads/egg2^2 > > Likewise for egg2. Adding a positive one here too looks a good idea. > > [...] >> +test_expect_success 'S: empty branch as a parent must fail ' ' >> + test_must_fail git fast-import <<-EOF 2>err && >> + reset refs/heads/egg3 >> + >> + commit refs/heads/egg4 >> + committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE >> + data <<COMMIT >> + I am the egg N4. >> + COMMIT >> + from refs/heads/egg3 >> + EOF >> + cat err && >> + test_must_fail git rev-parse --verify refs/heads/egg4^2 > > Likewise for egg4. There is a copy-paste typo "egg4^2" instead of "egg4". Will just drop rev-parse here as there should be a positive merge test somewhere already. > > If this were split up as described above, I imagine it would be much > easier to read and most of it would move into the "obviously good" > category (I'm still uncertain about some details of the "delayed from" > implementation and haven't checked carefully whether it misses any > spots). What do you think? Thanks, good idea indeed. Will rearrange and resend. > > Thanks for a pleasant read, and hope that helps, > Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html