Re: RFD: fast-import is picky with author names (and maybe it should - but how much so?)

Jeff King venit, vidit, dixit 08.11.2012 21:09:
> On Fri, Nov 02, 2012 at 03:43:24PM +0100, Michael J Gruber wrote:
> 
>> It seems that our fast-import is super picky with regards to author
>> names. I've encountered author names like
>>
>> Foo Bar<foo.bar@xxxxxxxx>
>> Foo Bar <foo.bar@xxxxxxxx
>> foo.bar@xxxxxxxx
>>
>> in the self-hosting repo of some other dvcs, and the question is how to
>> translate them faithfully into a git author name.
> 
> It is not just fast-import. Git's author field looks like an rfc822
> address, but it's much simpler. It fundamentally does not allow angle
> brackets in the "name" field, regardless of any quoting. As you noted in
> your followup, we strip them out if you provide them via
> GIT_AUTHOR_NAME.
> 
> I doubt this will change anytime soon due to the compatibility fallout.
> So it is up to generators of fast-import streams to decide how to encode
> what they get from another system (you could come up with an encoding
> scheme that represents angle brackets).

I don't expect our requirements to change. I was merely surprised that
git-commit is more tolerant than git-fast-import, though it makes a lot
of sense to avoid any behind-the-back conversions in the importer.
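
To make the rule concrete, here is a minimal sketch of a check an import
script could run before emitting an author line. It assumes the ident
shape "name <email>", where neither part may contain '<', '>' or a
newline and a non-empty name is separated from '<' by one space. The
function name parse_ident and the example.com addresses are made up for
illustration; this is not fast-import's actual parser.

```python
import re

# Assumed ident shape: an optional name followed by a single space, then
# an email wrapped in angle brackets. Neither part may contain '<', '>'
# or a newline. This is an illustration, not git's own code.
IDENT_RE = re.compile(r"(?:(?P<name>[^<>\n]+) )?<(?P<email>[^<>\n]*)>")


def parse_ident(raw):
    """Return (name, email) if raw fits the assumed shape, else None."""
    m = IDENT_RE.fullmatch(raw)
    if m is None:
        return None
    return (m.group("name") or "", m.group("email"))


if __name__ == "__main__":
    samples = [
        "Foo Bar <foo.bar@example.com>",
        "Foo Bar<foo.bar@example.com>",
        "Foo Bar <foo.bar@example.com",
        "foo.bar@example.com",
        "Foo Bar<foo.bar@example.com> <none@none>",
    ]
    for s in samples:
        print(repr(s), "->", parse_ident(s))
```

Under these assumptions only the first sample parses; the other four
come back as None, which matches the cases listed in this thread.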

>> In general, we try to do
>>
>> fullotherdvcsname <none@none>
>>
>> if the other system's entry does not parse as a git author name, but
>> fast-import does not accept either of
>>
>> Foo Bar<foo.bar@xxxxxxxx> <none@none>
>> "Foo Bar<foo.bar@xxxxxxxx>" <none@none>
>>
>> because of the way it parses for <>. While the above could be easily
>> turned into
>>
>> Foo Bar <foo.bar@xxxxxxxx>
>>
>> it would not be a faithful representation of the original commit in the
>> other dvcs.
> 
> I'd think that if a remote system has names with angle brackets and
> email-looking things inside them, we would do better to stick them in
> the email field rather than putting in a useless <none@none>. The latter
> should only be used for systems that lack the information.
> 
> But that is a quality-of-implementation issue for the import scripts
> (and they may even want to have options, just like git-cvsimport allows
> mapping cvs usernames into full identities).

That is closer to my real concern. In our cvs and svn interfaces, we even
encourage the use of author maps. For example, if you use an author map,
git-svn errors out if it encounters an svn user name which is not in the
map. On the other hand, we can map all (most?) svn user names faithfully
without using a map (e.g. to "username <none@none>").

Hg seems to store just anything in the author field ("committer"). The
various interfaces that are floating around do some behind-the-back
conversion to git format. The more conversions they do, the better they
seem to work (no erroring out) but I'm wondering whether it's really a
good thing, or whether we should encourage a more diligent approach
which requires a user to map non-conforming author names wilfully.
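
The two policies can be sketched side by side. The following is only an
illustration under stated assumptions: to_git_ident, UnmappedAuthor and
the strict flag are hypothetical names, the author map is a plain dict
from foreign author strings to git idents, and the ident check is the
simplified "name <email>" shape without '<', '>' or newlines inside
either part. No existing importer is claimed to work this way.

```python
import re

# Simplified ident check (an assumption, not git's own parser).
_IDENT = re.compile(r"(?:[^<>\n]+ )?<[^<>\n]*>")


class UnmappedAuthor(Exception):
    """Raised in strict mode when an author cannot be used as-is."""


def to_git_ident(raw, author_map, strict=True):
    """Translate a foreign author string into a git ident string."""
    # 1. An explicit, user-supplied mapping always wins.
    if raw in author_map:
        return author_map[raw]
    # 2. Strings that already have the ident shape pass through untouched.
    if _IDENT.fullmatch(raw):
        return raw
    # 3. Strict policy: refuse, so the user has to extend the map.
    if strict:
        raise UnmappedAuthor(raw)
    # 4. Lenient policy: replace the characters an ident cannot carry and
    #    fall back to a placeholder email. This conversion is lossy.
    cleaned = " ".join(re.sub(r"[<>\n]", " ", raw).split())
    return f"{cleaned} <none@none>" if cleaned else "<none@none>"


if __name__ == "__main__":
    amap = {"jdoe": "John Doe <jdoe@example.com>"}
    print(to_git_ident("jdoe", amap))
    print(to_git_ident("Foo Bar<foo.bar@example.com>", amap, strict=False))
    try:
        to_git_ident("Foo Bar<foo.bar@example.com>", amap)
    except UnmappedAuthor as exc:
        print("needs a map entry:", exc)
```

The lenient branch never errors out, but it silently rewrites the
original author string; the strict branch keeps the import faithful at
the cost of asking the user for a map entry.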

Michael