On Tue, Sep 17, 2013 at 11:16:04PM +0300, Michael S. Tsirkin wrote:

> > Thinking about it some more, it's a best effort thing anyway,
> > correct?
> >
> > So how about, instead of doing a hash over the whole input,
> > we hash each chunk and XOR them together?
> >
> > This way it will be stable against chunk reordering, and
> > no need to keep the patch in memory.
> >
> > Hmm?
>
> ENOCOFFEE
>
> That was a silly suggestion; two identical chunks aren't that unlikely :)

In a single patch they should not be, as we should be taking the
filenames into account, no?

You could also do it hierarchically: hash each chunk, store only the
hashes, then sort them and hash the result. That still needs O(chunks)
storage, but it is only one hash per chunk, not the whole chunk data.

A problem with both schemes, though, is that they are not
backwards-compatible with existing git-patch-id implementations,
whereas sorting the data itself is (kind of, at least with respect to
people who are not using orderfile).

-Peff
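[Editor's note: to make the hierarchical scheme above concrete, here is a
minimal Python sketch. It is illustrative only: the function name
`hierarchical_patch_id` is hypothetical, and it assumes chunks arrive as
already-normalized byte strings (one per per-file diff, filename header
included). It is not what git-patch-id actually computes.]

    import hashlib

    def hierarchical_patch_id(chunks):
        """Order-independent ID: sort per-chunk SHA-1 digests, hash the result.

        `chunks` is assumed to be an iterable of byte strings, e.g. one
        entry per per-file diff (including its filename header), already
        normalized the way git-patch-id strips hunk offsets.
        """
        digests = sorted(hashlib.sha1(chunk).digest() for chunk in chunks)
        # O(chunks) storage: one 20-byte digest per chunk, not the chunk data.
        return hashlib.sha1(b"".join(digests)).hexdigest()

    # The same patch with its per-file diffs reordered yields the same ID,
    # which is the reordering stability discussed above.
    a = [b"diff --git a/foo\n+one\n", b"diff --git a/bar\n+two\n"]
    b = list(reversed(a))
    assert hierarchical_patch_id(a) == hierarchical_patch_id(b)

Note how the sort step is what buys reorder stability while keeping only
one fixed-size digest per chunk in memory; the trade-off, as noted above,
is that the resulting IDs would not match those from existing
git-patch-id implementations.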