On 2020-08-11 01:22:26-0400, Jeff King <peff@xxxxxxxx> wrote:
> On Tue, Aug 11, 2020 at 07:33:59AM +0700, Đoàn Trần Công Danh wrote:
>
> > > Yeah, that's what I was getting at: if you care about robust
> > > machine-readability, then the full index is the best solution. Reading
> > > between the lines, I think the argument may be "using --full-index is
> > > too long and therefore ugly, so people like the short-ish names but with
> > > a bit of extra safety".
> >
> > My argument was people can either easily fetch the patch via HTTP like:
> >
> >     curl -LO https://github.com/git/git/commit/eb12adc74cf22add318f884072be2071d181abaa.patch
> >
> > or take it from a mailing list archive, bugzilla, instead of
> > cloning a full repository. With those options, we can't say,
> > "we prefer full-index, please send us the patch with full-index
> > instead".
>
> OK. But then how would they use "--abbrev" in that case? I.e., isn't it
> too late at that point (especially in the mailing list archive case) to
> do change anything in the formatting of the patch?
>
> Maybe I'm confused...
>
> > > There's an extra challenge here, which is that you have to convince the
> > > sender to use the extra --abbrev option, even though they themselves
> > > won't be the ones running into the problem when applying.
> >
> > Not really, since the sender tree is usually larger than the archived
> > tree, their abbrev is usually long enough, and the receiver will use
> > --abbrev to lengthen their abbrev to reduce the noise instead.
>
> Now I'm doubly confused. If the sender has the larger tree then they'll
> have the larger abbrev. So what's the problem?
>
> Going back to re-read your earlier responses...So...this _isn't_ a
> problem within Git itself?

Correct. It's NOT Git's problem by any means.

> It's only about people trying to compare
> textual patches byte-for-byte and seeing different index lines?

Yeah, it's about people trying to backport a patch to an old tree.
They fix the conflicts, then compare the result with the old patch to
see whether they have made any errors, because resolving conflicts is
complicated.

>
> If that's the case, then it seems to me that the byte comparison is the
> problem here. If I have:
>
>     index 1234abcd..5678bcde
>
> and
>
>     index 1234abcd87..5678bcde65
>
> those should be considered equivalent to see if two patches are
> plausibly the same. And I think tools like git-cherry, etc, would do
> that (and we provide git-patch-id for that purpose, too).

Yes, git-patch-id is a very useful tool. But there are times when half
of the patch applies cleanly with the exact object names, while the
other half needs to be fixed heavily (or maybe removed). git-cherry and
git-patch-id can't cope well in those situations. That is the case when
there has been a major change to the source tree.

> > > Yeah, I certainly don't mind the extra flexibility between "full" and
> > > "default" for "index" lines. I do wonder if people want to configure the
> > > abbreviations for those lines separately from other parts. I don't know
> > > that I've ever particularly cared about that flexibility, but the fact
> > > that they were set up separately all those years ago makes me think
> > > somebody might.
> >
> > I don't think people particularly care about the index line (and to
> > the extent, its length) that much, since the default is number is
> > actually a minimum number, if Git can't differentiate object with that
> > number of characters, Git will show a longer object names anyway.
> >
> > I think most people scripts will put a regex for:
> >
> >     /index [a-z0-9]{7,}\.\.[a-z0-9]{7,} [0-7]{6}/
> >
> > Or even:
> >
> >     /index [a-z0-9]+\.\.[a-z0-9]+ [0-7]+/
> >
> > For the former case, we could change the code in 2/2 to set the minimum
> > default to DEFAULT_ABBREV instead of MINIMUM_ABBREV?
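
For what it's worth, a receiver who only wants to know whether two
"index" lines plausibly name the same objects can compare them by
prefix instead of byte-for-byte. Below is a minimal Python sketch of
that idea, not anything Git ships: the function names are hypothetical,
and I've narrowed the character class of the looser regex above to hex
digits and made the mode optional. A matching prefix only suggests the
same object; it doesn't prove it.

```python
import re

# Looser pattern from the thread, adapted (assumption): object names are
# restricted to hex digits and the trailing file mode is optional.
INDEX_RE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: ([0-7]+))?$")


def parse_index_line(line):
    """Return (old, new, mode) for a diff "index" line, or None if it doesn't match."""
    m = INDEX_RE.match(line.strip())
    return m.groups() if m else None


def same_abbrev(a, b):
    """True if one hex name is a prefix of the other (differing --abbrev lengths)."""
    return a.startswith(b) or b.startswith(a)


def index_lines_equivalent(line_a, line_b):
    """Compare two "index" lines while ignoring abbreviation length."""
    pa, pb = parse_index_line(line_a), parse_index_line(line_b)
    if pa is None or pb is None:
        return False
    # Both object names must agree by prefix, and the modes (if any) must match.
    return (same_abbrev(pa[0], pb[0])
            and same_abbrev(pa[1], pb[1])
            and pa[2] == pb[2])
```

With this, "index 1234abcd..5678bcde" and "index 1234abcd87..5678bcde65"
compare as equivalent, while the rest of the patch can still be compared
textually.
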
> >
> > For the historical case that users put both --full-index and --abbrev
> > into there scripts, we still keep our promise to not break their
> > script by always respect --full-index, regardless of --abbrev.
>
> I care less about scripting (as you note, anything consuming abbreviated
> objects has to handle longer-than-minimum names anyway), and was more
> wondering whether anybody really cared that:
>
>     git log --abbrev=30 -p
>
> kept the short index lines (e.g., because they're easier to read). But
> I'm having trouble coming up with a plausible reason somebody would want
> long object names in earlier lines like "Merge:" but not in the patch
> index lines. And already we respect --abbrev for --raw, so it's not like
> the diff code isn't already affected. Making "-p" consistent with all
> the rest of it is probably worth doing regardless.

Yes, I think this is the easier argument to accept.
I've gone with that in the resend.

-- 
Thanks.
Danh