Re: Git commit amend empty emails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 11, 2014 at 12:17:35AM +1100, Simon wrote:

> Git is having empty email problems I think, I'm on git v2.1.3.
> 
> Steps to reproduce:
> 
>     $ git init
>     Initialized empty Git repository in /tmp/test_git/.git/
>     $ echo 'test' > abc
>     $ git add --all 1 ↵
>     $ git commit --message 'test'
>     [master (root-commit) 3cc2793] test
>      1 file changed, 1 insertion(+)
>      create mode 100644 abc
>     $ echo 'test2' > abc
>     $ git commit --amend
>     fatal: Malformed ident string: 'admin <> 1418217345 +0000'

Yes, this looks like a regression in v2.1.0, due to my 4701026 (commit:
use split_ident_line to compare author/committer, 2014-05-01).  That
commit added a call to sane_ident_split() when preparing the commit
message template, which is what is causing the error you see (and why
your first commit succeeds, but the latter fails).

The absolute simplest fix would be to remove the sane_ident_split call.
Though I have a different patch prepared, which argues that we should
not be doing any validation in this code path at all (see below).

However, there's something else going on. I am surprised that we allow
empty emails at all and the code here is quite strange. The first check
on the ident format is when we feed the data to fmt_ident to generate
the string that goes into the commit object.  We disallow empty _names_
there, but not empty _emails_.  I'm not sure if this is an oversight, or
an intentional historic compatibility thing.

And then it gets weirder. We take the output of fmt_ident, and then feed
that back into split_ident_line, so that we can set the GIT_AUTHOR_*
variables in the environment. And that does its _own_ set of checks, via
sane_ident_split.

Once upon a time, it relied only on split_ident_lane to report problems.
But Junio's e27ddb6 (split_ident_line(): make best effort when parsing
author/committer line, 2012-08-31) made split_ident_line more lenient,
and introduced sane_ident_split to cover the difference. Except that it
did more than that: besides checking whether the name is empty (which
the original split_ident_line used to do), it also complains if the
email is empty (which is new in that commit).

So we now notice the empty email in this code path, but the only thing
we do is avoid writing out the environment variables and continue. Which
means that the actual string generated by fmt_ident (complete with empty
email) is what goes into the commit. So why are we setting the
environment variables at all?

It looks like they are for the benefit of hooks, as evidenced by 7dfe8ad
(commit: pass author/committer info to hooks, 2012-03-11). So that means
that we are sometimes passing totally bogus data to hooks (i.e., an
ident is good enough to make it into a commit message, but
sane_ident_split prevents us from putting the data into the
environment).

So I think e27ddb6 was a little over-anxious in adding the email
validation, but nobody noticed because those checks don't actually
affect whether or not we commit. Hooks might see bogus data, but it is
in such a rare set of cases that nobody has noticed.

Here are two patches to improve this. These are on top of the
jk/commit-date-approxidate topic, as that is where the regression was
introduced.

The first one fixes the regression and can stand by itself. The second
fixes the GIT_AUTHOR problem, but AFAIK that has been there for years.
So it is not as urgent, but is still maint-worthy, in my opinion.

  [1/2]: commit: loosen ident checks when generating template
  [2/2]: commit: always populate GIT_AUTHOR_* variables

If we did want to truly disallow empty emails, we could do a follow-on
3/2 that teaches fmt_ident to reject them (that is the right place
because it is where the validation checks for the author go, and also
because we would probably want the same validation for the committer).

But I do not think we should do that lightly. It has been this way for
years, and clearly at least one person is depending on it. If we're
going to change it, we might want a warning/deprecation period.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]