"Michael S. Tsirkin" <mst@xxxxxxxxxx> writes: > +static void flush_one_hunk(unsigned char *result, git_SHA_CTX *ctx) > { > - int patchlen = 0, found_next = 0; > + unsigned char hash[20]; > + unsigned short carry = 0; > + int i; > + > + git_SHA1_Final(hash, ctx); > + git_SHA1_Init(ctx); > + /* 20-byte sum, with carry */ > + for (i = 0; i < 20; ++i) { > + carry += result[i] + hash[i]; > + result[i] = carry; > + carry >>= 8; > + } Was there a reason why bitwise xor is not sufficient for mixing these two indenendent hashes? If the 20-byte sums do not offer benefit over that, the code for bitwise xor would be certainly be simpler, I would imagine? > +} > +static int get_one_patchid(unsigned char *next_sha1, unsigned char *result, > + struct strbuf *line_buf, int stable) > +{ > + int patchlen = 0, found_next = 0, hunks = 0; > int before = -1, after = -1; > + git_SHA_CTX ctx, header_ctx; > + > + git_SHA1_Init(&ctx); > + hashclr(result); > > while (strbuf_getwholeline(line_buf, stdin, '\n') != EOF) { > char *line = line_buf->buf; > @@ -99,6 +116,18 @@ static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx, struct st > if (!memcmp(line, "@@ -", 4)) { > /* Parse next hunk, but ignore line numbers. */ > scan_hunk_header(line, &before, &after); > + if (stable) { > + if (hunks) { > + flush_one_hunk(result, &ctx); > + memcpy(&ctx, &header_ctx, > + sizeof ctx); > + } else { > + /* Save ctx for next hunk. */ > + memcpy(&header_ctx, &ctx, > + sizeof ctx); > + } > + } > + hunks++; > continue; > } > > @@ -107,7 +136,10 @@ static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx, struct st > break; > > /* Else we're parsing another header. */ > + if (stable && hunks) > + flush_one_hunk(result, &ctx); > before = after = -1; > + hunks = 0; > } > > /* If we get here, we're inside a hunk. 
*/ > @@ -119,39 +151,46 @@ static int get_one_patchid(unsigned char *next_sha1, git_SHA_CTX *ctx, struct st > /* Compute the sha without whitespace */ > len = remove_space(line); > patchlen += len; > - git_SHA1_Update(ctx, line, len); > + git_SHA1_Update(&ctx, line, len); > } > > if (!found_next) > hashclr(next_sha1); > > + flush_one_hunk(result, &ctx); What I read from these changes is that you do not do anything special about the per-file header, so two no overlapping patches with a single hunk each that touches the same path concatenated together would not result in the same patch-id as a single-patch that has the same two hunks. Which would break your earlier 'Yes, reordering only the hunks will not make sense, but before each hunk you could insert the same "diff --git a/... b/..." to make them a concatenation of patches that touch the same file', I would think. Is that what we want to happen? Or is my reading mistaken? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
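
To make the xor question concrete, here is a minimal standalone
sketch of the two mixing options.  mix_sum() repeats the arithmetic
of flush_one_hunk() quoted above; mix_xor(), order_independent() and
HASH_LEN are names made up for this illustration and are not part of
the patch.  It only compares the mixing step and does not touch SHA-1
or the patch-id parser.

```c
#include <assert.h>
#include <string.h>

#define HASH_LEN 20	/* size of a SHA-1 digest in bytes */

/*
 * Same arithmetic as the quoted flush_one_hunk(): add the two
 * 20-byte values byte by byte, least significant byte first,
 * propagating the carry.  The final carry is dropped.
 */
void mix_sum(unsigned char *result, const unsigned char *hash)
{
	unsigned short carry = 0;
	int i;

	for (i = 0; i < HASH_LEN; ++i) {
		carry += result[i] + hash[i];
		result[i] = (unsigned char)carry;	/* low 8 bits */
		carry >>= 8;
	}
}

/* Hypothetical alternative: plain bitwise xor of the two values. */
void mix_xor(unsigned char *result, const unsigned char *hash)
{
	int i;

	for (i = 0; i < HASH_LEN; ++i)
		result[i] ^= hash[i];
}

/*
 * Return 1 if mixing a then b into a zeroed accumulator gives the
 * same bytes as mixing b then a.
 */
int order_independent(void (*mix)(unsigned char *, const unsigned char *),
		      const unsigned char *a, const unsigned char *b)
{
	unsigned char ab[HASH_LEN] = {0}, ba[HASH_LEN] = {0};

	assert(mix && a && b);
	mix(ab, a);
	mix(ab, b);
	mix(ba, b);
	mix(ba, a);
	return !memcmp(ab, ba, HASH_LEN);
}
```

Both functions are commutative, so either would make the result
independent of hunk order.  One observable difference is that mixing
the same value twice with xor brings the accumulator back to all
zeros, while the sum does not; whether that matters for patch-id is a
separate question.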