Re: [PATCH] diff: add a config option to control orderfile

"Michael S. Tsirkin" <mst@xxxxxxxxxx> · Tue, 17 Sep 2013 23:31:31 +0300

On Tue, Sep 17, 2013 at 11:16:04PM +0300, Michael S. Tsirkin wrote:
> On Tue, Sep 17, 2013 at 11:14:01PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Sep 17, 2013 at 11:06:07AM -0700, Junio C Hamano wrote:
> > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> writes:
> > > 
> > > > On Tue, Sep 17, 2013 at 10:24:19AM -0700, Junio C Hamano wrote:
> > > >> "Michael S. Tsirkin" <mst@xxxxxxxxxx> writes:
> > > >> 
> > > >> > So might it not be useful to tweak patch id to
> > > >> > sort the diff, making it a bit more stable?
> > > >> 
> > > >> That is one thing that needs to be done, I think.  But it would be
> > > >> unfortunate if we have to do that unconditionally, though, as we may
> > > >> be "buffering" many hundred kilobytes of patch text in core.  If we
> > > >> can do so without regressing the streaming performance for the most
> > > >> common case of not using the orderfile on the generating side (hence
> > > >> not having to sort on the receiving end), it would be ideal.  I am
> > > >> not sure offhand how much code damage we are talking about, though.
> > > >
> > > > So make it conditional on the presence of the orderfile option?
> > > 
> > > That would mean that those who set orderfile from configuration in
> > > the future will have to always suffer, I would think.  Is that
> > > acceptable?  I dunno.
> > > 
> > > Also, if the sender used a non-standard order, the recipient does
> > > not know what order the patch was generated, and the recipient does
> > > not use a custom orderfile, what should happen?  I thought your idea
> > > was to normalize by using some canonical order that is not affected
> > > by the orderfile to make sure patch-id stays stable, so I would
> > > imagine that such a recipient who does not have orderfile specified
> > > still needs to sort before hashing, no?
> > 
> > Thinking about it some more, it's a best effort thing anyway,
> > correct?
> > 
> > So how about, instead of doing a hash over the whole input,
> > we hash each chunk and XOR them together?
> > 
> > This way it will be stable against chunk reordering, and
> > no need to keep patch in memory.
> > 
> > Hmm?
> 
> ENOCOFFEE
> 
> That was a silly suggestion, two identical chunks aren't that unlikely :)

OTOH we can detect such patches (ones that contain two identical
chunks) just by keeping the chunk hashes in memory...
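
To make that concrete, here is a rough sketch of what I have in mind,
in Python rather than C just to keep it short. It is only an
illustration under a few assumptions: "chunk" is taken to mean
everything from one "diff --git" line up to the next, SHA-1 is used
per chunk, and the names chunk_digests and order_independent_id are
made up for this mail. It does not do the whitespace and line-number
stripping that a real patch-id does.

```python
import hashlib


def chunk_digests(lines):
    """Yield one SHA-1 digest per file-level chunk, reading lines as a stream.

    Assumption: a chunk starts at each "diff --git " line. Anything before
    the first such line (e.g. the commit message) is ignored.
    """
    hasher = None
    for line in lines:
        if line.startswith("diff --git "):
            if hasher is not None:
                yield hasher.digest()
            hasher = hashlib.sha1()
        if hasher is not None:
            hasher.update(line.encode("utf-8"))
    if hasher is not None:
        yield hasher.digest()


def order_independent_id(lines):
    """XOR the per-chunk digests so that chunk order does not matter.

    Only the digests are kept in memory (20 bytes per chunk), not the
    patch text. A repeated chunk would cancel itself out under XOR, so
    it is reported as an error instead.
    """
    acc = 0
    seen = set()
    for digest in chunk_digests(lines):
        if digest in seen:
            raise ValueError("duplicate chunk: XOR would cancel it out")
        seen.add(digest)
        acc ^= int.from_bytes(digest, "big")
    return acc.to_bytes(hashlib.sha1().digest_size, "big").hex()
```

Feeding it an open file (or sys.stdin) works since it only iterates
over lines once. Swapping two chunks gives the same id, and a patch
that repeats a chunk raises ValueError instead of silently hashing to
the wrong thing.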

> > -- 
> > MST