Re: git mailinfo strips important context from patch subjects

Jeff King <peff@xxxxxxxx> writes:

> On Sun, Jun 28, 2009 at 08:38:58PM +0100, Roger Leigh wrote:
>
>> In most of the projects I work on, the git commit message has
>> the affected subsystem or component in square brackets, such as
>> 
>>   [foo] change bar to baz
>>
>> [...]
>>
>> The [sbuild] prefix has been dropped from the Subject, so an
>> important bit of context about the patch has been lost.
>> 
>> It's a bit of a bug that you can't round trip from a git-format-patch
>> to import with git-am and then not be able to produce the exact same
>> patch set with git-format-patch again (assuming preparing and applying
>> to the same point, of course).
>
> As an immediate solution, you probably want to use "-k" when generating
> the patch (not to add the [PATCH] munging) and "-k" when reading the
> patch via "git am" (which will avoid trying to strip any munging).
>
> However:
>
>> Would it be possible to change the git-mailinfo logic to use a less
>> greedy pattern match so it leaves everything after
>> ([PATCH( [0-9/])+])+ in the subject?  AFAICT this is cleanup_subject in
>> builtin-mailinfo.c?  Could this rather complex function not just do a
>> simple regex match which can also take care of stripping ([Rr]e:) ?
>
> Yes, I think in the long run it makes sense to strip just the _first_
> set of brackets. I don't think we want to be more specific than that in
> the match, because we allow arbitrary cruft inside the brackets (like
> "[RFC/PATCH]", etc). But if format-patch always puts exactly one set of
> brackets, and am strips exactly one set, then that should retain your
> subject in practice, even if it starts with [foo].

I think it may still make sense to insist that PATCH appears somewhere in
the first set of brackets, but I have to stop and wonder if it is even
necessary.
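
To make the two options concrete, here is a rough sketch in Python (not
the C in builtin-mailinfo.c, and not what cleanup_subject does today).
The function name, the regexes and the require_patch switch are all made
up for illustration.  It strips any leading "Re:" and then at most one
leading set of brackets, optionally only when that set mentions PATCH:

```python
import re

# Hypothetical illustration only; this is not git's cleanup_subject().
_RE_PREFIX = re.compile(r'^\s*(?:re:\s*)+', re.IGNORECASE)
_FIRST_BRACKET = re.compile(r'^\s*\[([^\]]*)\]\s*')


def cleanup_subject(subject, require_patch=False):
    """Drop leading "Re:" markers and at most one leading [...] group.

    With require_patch=True the bracket group is removed only when it
    contains the word PATCH (so "[RFC/PATCH]" still qualifies).
    """
    subject = _RE_PREFIX.sub('', subject)
    m = _FIRST_BRACKET.match(subject)
    if m and (not require_patch or 'PATCH' in m.group(1)):
        subject = subject[m.end():]
    return subject.strip()
```

Either variant keeps "[sbuild]" when format-patch has added its own
"[PATCH n/m]" in front.  They differ only for a subject that arrives with
no [PATCH] marker at all: the plain variant eats "[sbuild]", the stricter
one leaves it alone.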

Because git removes [sbuild] at the beginning, Roger is unhappy.

 * Is he happy that git removes [PATCH]?  In an e-mail based workflow it
   is good practice to mark messages that are patches clearly so that
   they can be quickly found among the discussions that lead to them, and
   it is plausible that his project accepted that as an established
   practice supported well by git.

 * Is he happy that git treats the first paragraph of the commit message
   differently from the rest of the message?  In a project with many
   commits, it is essential that people write good commit summaries that
   fit on a single line so that tools like shortlog and gitweb can be
   used to get a bird's-eye view of what happened recently.  Perhaps his
   project picked it up as the best current practice supported well by
   git.

 * Is he happy that git takes "---" as the end of message marker, so that
   any other commentary can be added to the message to facilitate the
   communication without adding noise to the commits?  Perhaps he is and
   his project picked it up as a good practice supported well by git.

There are many other conventions in git that do not have anything to do
with what the underlying git data structure supports, but a convention can
always be seen as a limitation of the "don't do that, instead do it this
way" kind, and Roger may not be happy with some of them.  Where would we
draw the line?

_An_ established (note that I did not say _the_ nor _best current_)
practice supported well by git to note the area being affected in a
project of nontrivial size is to prefix the single line summary with the
name of the area followed by a colon.  There is no difference between
"[sbuild] foo" and "sbuild: foo" from the information content point of
view, but the latter has the advantage of being one letter shorter and
less distracting in an MUA.  He does not have a very strong reason to
choose something different only to make his life harder, does he?

Users can take advantage of this established practice when running
shortlog with "--grep=^area:" to limit the bird's-eye view to a specific
area.  If this turns out to be useful, we could even add an option such as
"git log --area=name" that limits this kind of match to the first
paragraph of the commit log message, for example.

Supporting a slightly different convention may seem accommodating and
nice, but if there is no real technical difference between the two (and
again, "area:" is one letter shorter ;-), letting people run with a
different convention longer, when they can switch easily to another
convention that is already well supported, may actually hurt them in the
long run.  "[sbuild]" will not match "--area=sbuild", which will internally
become "--grep-only-first-line=sbuild:", so either he will miss out on
the new feature, or the implementation of the new feature will need more
code than necessary.
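
To illustrate why the bracket form would fall through, here is a small
Python sketch of what such a match might look like.  Neither "--area" nor
"--grep-only-first-line" exists; matches_area is a made-up name, and I am
assuming the match is anchored at the start of the summary line:

```python
def matches_area(message, area):
    """True if the summary line of a commit message starts with "area:".

    Hypothetical stand-in for the imagined "--area=name" option; only
    the first line of the message is examined.
    """
    lines = message.splitlines()
    summary = lines[0] if lines else ''
    return summary.startswith(area + ':')


log = [
    "sbuild: change bar to baz\n\nLonger explanation here.",
    "[sbuild] change bar to baz",
    "doc: fix typo\n\nsbuild: mentioned only in the body",
]
# Keep only the commits whose summary line names the "sbuild" area.
selected = [msg for msg in log if matches_area(msg, "sbuild")]
```

Only the first entry is selected.  Teaching the matcher about "[sbuild]"
as well is possible, of course, but that is exactly the extra code I
would rather not have to carry.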

It is not about discouraging a wrong workflow or practice, because there
is nothing _wrong_ per se in the [sbuild] prefix.  It is just that it makes
things harder in the long run.  In this particular case, it is only very
slightly harder, but these things tend to add up from different fronts.