On Mon, Sep 23, 2013 at 02:37:29PM -0700, Jonathan Nieder wrote:
> Hi,
>
> Michael S. Tsirkin wrote:
> >> On Tue, Sep 17, 2013 at 04:56:16PM -0400, Jeff King wrote:
>
> >>>>> A problem with both schemes, though, is that they are not
> >>>>> backwards-compatible with existing git-patch-id implementations.
> [...]
> >>> It may be esoteric enough not to worry about, though.
>
> Yeah, I think it would be okay.  Details of the diff generation
> algorithm have changed from time to time anyway (and broken things,
> as you mentioned) and we make no guarantee about this.
>
> [...]
> >> patch-id: make it more stable
> >>
> >> Add a new patch-id algorithm making it stable against
> >> hunk reordering:
> >>   - prepend header to each hunk (if not there)
> >>   - calculate SHA1 hash for each hunk separately
> >>   - sum all hashes to get patch id
> >>
> >> Add --order-sensitive to get historical unstable behaviour.
>
> The --order-sensitive option seems confusing.  How do I use it to
> replicate a historical patch-id?

You supply a historical diff to it :)

> If I record all options that might
> have influenced ordering (which are those?) then am I guaranteed to
> get a reproducible result?

Maybe not. But if you have a patch on disk, you will get the old hash
from it with --order-sensitive.

> So I would prefer either of the following over the above:
>
>  a) When asked to compute the patch-id of a seekable file, use the
>     current streaming implementation until you notice a filename that
>     is out of order.  Then start over with sorted hunks (for example
>     building a table of offsets within the patch for each hunk to
>     support this).
>
>     When asked to compute the patch-id of an unseekable file, stream
>     to a temporary file under $GIT_DIR to get a seekable file.

This can be computed in one pass: just keep two checksums around.
But the result won't be stable: if you get the same patch from two
people, one ordered and the other not, you get two different
checksums.

What are we trying to achieve here?

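To make the one-pass point concrete, here is a rough sketch of one
reading of "two checksums": a streaming, order-sensitive SHA1 alongside
an order-insensitive sum of per-hunk SHA1s. It is an illustration only,
not git's patch-id code. The (filename, hunk text) input, the way the
header is prepended, and the names patch_ids and one_pass_id are all
assumptions made up for the example.

```python
import hashlib

MASK = (1 << 160) - 1  # keep the running sum within SHA1's 160 bits


def patch_ids(hunks):
    """Return (streaming_id, summed_id, in_order) from a single pass.

    `hunks` is an iterable of (filename, hunk_text) pairs. Parsing a
    real diff into such pairs is out of scope for this sketch.
    """
    streaming = hashlib.sha1()  # order-sensitive: one hash over the stream
    total = 0                   # order-insensitive: sum of per-hunk hashes
    in_order = True
    prev = None
    for filename, text in hunks:
        # Hypothetical stand-in for "prepend header to each hunk".
        hunk = f"{filename}\n{text}".encode()
        streaming.update(hunk)
        digest = int.from_bytes(hashlib.sha1(hunk).digest(), "big")
        total = (total + digest) & MASK
        if prev is not None and filename < prev:
            in_order = False  # noticed a filename that is out of order
        prev = filename
    return streaming.hexdigest(), f"{total:040x}", in_order


def one_pass_id(hunks):
    """Streaming id while filenames stay sorted, summed id otherwise."""
    streaming_id, summed_id, in_order = patch_ids(hunks)
    return streaming_id if in_order else summed_id
```

Under these assumptions, feeding the same hunks in sorted and in
reversed order gives the same summed id but different one_pass_id
results, which is the instability described above.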
>  b) Unconditionally use the new patch-id definition that is stable
>     under permutation of hunks.  If and when someone complains that
>     this invalidates their old patch-ids, they can work on adding a
>     nice interface for getting the old-style patch-ids.  I suspect it
>     just wouldn't come up.

That's certainly easy to implement.

> Of course I can easily be wrong.  Thanks for a clear patch that makes
> the choices easy to reason about.
>
> Thoughts?
> Jonathan