Re: What's in a name? Let's use a (uuid,name,email) triplet

On 19 March 2010 12:16, Michael Witten <mfwitten@xxxxxxxxx> wrote:
> On Fri, Mar 19, 2010 at 06:09, Reece Dunn <msclrhd@xxxxxxxxxxxxxx> wrote:
>> On 19 March 2010 11:54, Mike Hommey <mh@xxxxxxxxxxxx> wrote:
>>> On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@xxxxxxx wrote:
>>>> here is where you are missing the point.
>>>>
>>>> no, there is not 'much less chance' of it getting messed up.
>>>>
>>>> you seem to assume that people would never need to set the UUID on
>>>> multiple machines.
>>>>
>>>> if they don't need to set it on multiple machines, then the
>>>> e-mail/userid is going to be reliable anyway
>>>>
>>>> if they do need to set it on multiple machines and can't be bothered
>>>> to keep their e-mail consistent, why would they bother keeping this
>>>> additional thing consistent? Linus is pointing out that people don't
>>>> care now about their e-mail and name, and will care even less about
>>>> some abstract UUID
>>>>
>>>> people who care will already make their e-mail consistent.
>>>
>>> While I don't agree with the need for that uuid thing, I'd like to
>>> point out that people who care can't necessarily make their e-mail
>>> consistent. For example, Linus used to use an @osdl.org address, and
>>> he now uses an @linux-foundation.org address. It's still the same Linus,
>>> but the (name, email) pair has legitimately changed.
>>
>> So create an aliases list that maps one (name,email) to another that
>> is from the same person. There is no need for an additional item (a
>> uuid) to solve this problem. It also means that searching on any
>> (name,email) pair will find the others, so you only need to
>> remember/find one of the identities for the person you are interested
>> in finding the commits for.
>>
>> AFAICS, mailmap is about correcting mistakes (primarily in the
>> reported name for a given email address). In this case, mailmap and
>> this aliases-map will work in conjunction with each other to give what
>> the original poster wanted. However, I haven't seen any of his replies
>> that answer this (or sufficiently address why mailmap does not solve
>> his problem).
>
> See:
>
>    http://marc.info/?l=git&m=126900051102958&w=2
>
> The idea is to distribute the responsibility for maintaining a
> consistent identity AND to make that responsibility EASY.
>
> The extra uuid `field' can only suffer from typos, while the
> name/email pair can suffer from typos, changing email accounts, and
> changing real life names. If the uuid `field' does get bungled by a
> typo or is not used, then we're no worse off than we were before.

What specific problem(s) are you trying to solve?

The main issue is identifying who made which changes to a repository
(e.g. from a script, or for database/statistics purposes). The mailmap
file already handles this: it maps the variant (name,email) pairs that
appear in a given repository onto one canonical (name,email) pair.

For identifying the same person working across multiple projects,
ideally they should keep the canonical (name,email) pair consistent
across all projects, with mailmap files in the respective projects to
keep the canonical form correct.

This canonical (name,email) pair is then a unique identifier for that
person, and so effectively becomes a uuid. There is no need to add an
extra uuid field that takes *more* work to correct and keep
consistent.

If you change email address or name, *and* care enough about it being
consistent, there is no reason why you cannot update the mailmap file
to use the new canonical (name,email) pair.
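
To make that concrete, here is a rough Python sketch of the lookup a
mailmap file describes. It is not git's implementation: it handles only
the two line forms "Proper Name <proper> <commit>" and "Proper Name
<proper> Commit Name <commit>", the case-insensitive matching is an
assumption of the sketch, and the function names and example.* addresses
are made up for illustration.

```python
import re

# Simplified subset of the mailmap line format (an assumption of this sketch):
#   Proper Name <proper@email> [Commit Name] <commit@email>
_LINE = re.compile(
    r"^(?P<pname>[^<]*)<(?P<pmail>[^>]+)>\s*(?P<cname>[^<]*)<(?P<cmail>[^>]+)>\s*$"
)


def parse_mailmap(text):
    """Return {(commit_name or None, commit_email): (proper_name or None, proper_email)}."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = _LINE.match(line)
        if m is None:
            continue  # line forms outside this subset are ignored here
        cname = m.group("cname").strip().lower() or None
        pname = m.group("pname").strip() or None
        table[(cname, m.group("cmail").strip().lower())] = (
            pname,
            m.group("pmail").strip(),
        )
    return table


def canonical(table, name, email):
    """Map an observed (name, email) to its canonical pair, or return it unchanged."""
    email_key = email.lower()
    # A rule that names both the commit name and email wins over an email-only rule.
    for key in ((name.lower(), email_key), (None, email_key)):
        if key in table:
            pname, pmail = table[key]
            return (pname or name, pmail)
    return (name, email)


# Hypothetical addresses, for illustration only.
MAILMAP = """\
# old address -> new canonical pair
Reece H. Dunn <reece@new.example> <reece@old.example>
Reece H. Dunn <reece@new.example> Reece Dunn <reece@new.example>
"""

table = parse_mailmap(MAILMAP)
```

Changing address then means adding one line per old variant that points
at the new canonical pair; identities that were never wrong need no
entry at all.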

Oh, and you are expressing it wrong (if I understand you correctly)...

What you are after is a string U (the uuid) that is used to identify a
person irrespective of their name and email. At the moment
   U = (name,email)
is used to achieve that, with mailmap to normalise the variations.

What you are trying to express is:
    U <=> (name,email)
where U can be any unique string. This is different from using a
(name,email,uuid) triple to identify someone.

So, let's say that I choose U=abc to identify myself uniquely, so that:
    "abc" <=> "Reece Dunn <msclrhd@xxxxxxxxx>"
    "abc" <=> "Reece Dunn <msclrhd@xxxxxxxxxxxxxx>"
    "abc" <=> "Reece Dunn <msclrhd@xxxxxxxxxxx>"
    "abc" <=> "Reece H. Dunn <msclrhd@xxxxxxxxx>"
    "abc" <=> "Reece H Dunn <msclrhd@xxxxxxxxx>"

I would still need to define all these variations as and when they
occur in a repository, to fix up any typos and email address changes,
so why not just pick U = "Reece H. Dunn <msclrhd@xxxxxxxxx>" as the
canonical form instead of "abc" or some other string?
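
The same point as a small Python sketch (the helper names, commit ids
and example.* addresses are hypothetical). Whichever U is chosen, the
table of variations that has to be maintained is the same size and
groups the commits identically; only the label on the group differs.

```python
from collections import defaultdict

# Identities as they might appear in a repository (hypothetical addresses).
OBSERVED = [
    "Reece Dunn <reece@a.example>",
    "Reece Dunn <reece@b.example>",
    "Reece H Dunn <reece@a.example>",
    "Reece H. Dunn <reece@a.example>",
]


def alias_table(observed, u):
    """Map every observed identity to the chosen identifier u."""
    return {identity: u for identity in observed}


def group_by_person(commits, table):
    """Group commit ids by the identifier their author maps to."""
    groups = defaultdict(list)
    for commit_id, author in commits:
        # Authors without an entry are treated as their own identifier.
        groups[table.get(author, author)].append(commit_id)
    return dict(groups)


COMMITS = [("c1", OBSERVED[0]), ("c2", OBSERVED[2]), ("c3", OBSERVED[3])]

opaque = alias_table(OBSERVED, "abc")
by_pair = alias_table(OBSERVED, "Reece H. Dunn <reece@a.example>")

# Same number of entries to maintain, and the same commits end up together.
assert len(opaque) == len(by_pair)
assert list(group_by_person(COMMITS, opaque).values()) == list(
    group_by_person(COMMITS, by_pair).values()
)
```

If anything the canonical pair is slightly cheaper: the entry that maps
the canonical form to itself can be dropped, whereas with "abc" every
variation needs an entry.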

As has been said, mailmap supports name variations ("Reece Dunn",
"Reece H Dunn", "Reece H. Dunn") and email variations
(msclrhd@xxxxxxxxxxx, msclrhd@xxxxxxxxx, msclrhd@xxxxxxxxxxxxxx), so
how does a string that I need to set on the git client in addition to
name and email help me define a canonical form *in the git
repository*?

So, I'll ask again: what problems are you trying to solve that cannot
be solved by mailmap?

- Reece