Re: [PATCH 0/6] Improved infrastructure for refname normalization

On 09/09/2011 04:06 PM, A Large Angry SCM wrote:
> On 09/09/2011 07:46 AM, Michael Haggerty wrote:
>> As a prerequisite to storing references caches hierarchically (itself
>> needed for performance reasons), here is a patch series to help us get
>> refname normalization under control.
>>
>> The problem is that some UI accepts unnormalized reference names (like
>> "/foo/bar" or "foo///bar" instead of "foo/bar") and passes them on to
>> library routines without normalizing them.  The library, on the other
>> hand, assumes that the refnames are normalized.  Sometimes (mostly in
>> the case of loose references) unnormalized refnames happen to work,
>> but in other cases (like packed references or when looking up refnames
>> in the cache) they silently fail.  Given that refnames are sometimes
>> treated as path names, there is a chance that some security-relevant
>> bugs are lurking in this area, if not in git proper then in scripts
>> that interact with git.
> 
> Why can't the library do the normalization instead of expecting every
> other component that deals with reference names having to do it for the
> library?

The library could do the normalization, but:

1. It would probably cost a lot of redundant checks as reference names
pass in and out of the library and back in again.

2. Normalization requires copying or overwriting the incoming string, so
each time a refname crosses the library perimeter there might have to be
an extra memory allocation with the associated headaches of dealing with
the ownership of the memory.

3. The library doesn't encapsulate all uses of reference names; for
example, for_each_ref() invokes a callback function with the refname as
an argument.  The callback function is free to do a strcmp() of the
refname (normalized by the library) with some arbitrary string that it
got from the command line.  Either the caller has to do the
normalization itself (i.e., outside of the library) or the library has
to learn how to do every possible filtering operation with refnames.
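
To illustrate points 2 and 3, here is a minimal sketch in C.  Both
function names are hypothetical and are not part of git's API; the
normalization shown (strip leading and trailing slashes, collapse runs
of slashes) is only an assumption about what "normalized" means here.
Note that the normalizer needs a caller-supplied output buffer, which
is the ownership question from point 2, and that a plain strcmp() in a
callback fails as soon as only one side has been normalized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's actual API): copy src into dst while
 * dropping leading slashes, collapsing runs of slashes into one, and
 * dropping trailing slashes.  Returns 0 on success, -1 if dst is too
 * small to hold the result plus the terminating NUL.
 */
static int normalize_refname_sketch(char *dst, size_t dstlen, const char *src)
{
	size_t n = 0;
	int pending_slash = 0;

	assert(dst && src);
	for (; *src; src++) {
		if (*src == '/') {
			/* Remember a separator only after a component. */
			if (n > 0)
				pending_slash = 1;
			continue;
		}
		if (pending_slash) {
			if (n + 1 >= dstlen)
				return -1;
			dst[n++] = '/';
			pending_slash = 0;
		}
		if (n + 1 >= dstlen)
			return -1;
		dst[n++] = *src;
	}
	if (dstlen == 0)
		return -1;
	dst[n] = '\0';
	return 0;
}

/*
 * Hypothetical stand-in for the comparison a for_each_ref() callback
 * might do: the refname comes from the library, user_arg comes
 * straight from the command line.
 */
static int matches_user_arg(const char *refname, const char *user_arg)
{
	return strcmp(refname, user_arg) == 0;
}
```

With these definitions, matches_user_arg("refs/heads/foo",
"refs//heads/foo") is false even though both strings name the same
reference; the comparison only succeeds once the caller has run the
command-line string through the normalizer as well.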

>> * Forbid ".lock" at the end of any refname component, as directories
>>    with such names can conflict with attempts to create lock files for
>>    other refnames.
> 
> I find this overly restrictive. If you need to create a lock based on a
> reference name or component, use a name for the lock object that starts
> with one of the characters that reference names or components are
> already forbidden from starting with.

I agree; this is unpleasantly restrictive.

But please remember that refnames already cannot end in ".lock"
("foo/bar.lock" is already forbidden; this change also prohibits
"foo.lock/bar").

However, your suggested solution would cause problems if two versions of
git are running on the same machine.  An old version of git would not
know to respect the new version's lock files.  ISTM that this would be
too dangerous.  Suggestions welcome.
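
For concreteness: locking a reference "foo" creates a file named
"foo.lock" next to it, which cannot coexist with a directory
"foo.lock" holding a reference "foo.lock/bar".  A minimal sketch of
the per-component check follows; the function name is hypothetical and
this is not the code from the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check (not git's actual API): return 1 if any
 * '/'-separated component of refname ends with ".lock", else 0.
 */
static int has_lock_component(const char *refname)
{
	static const char suffix[] = ".lock";
	const size_t suffix_len = sizeof(suffix) - 1;
	const char *start = refname;

	assert(refname);
	for (;;) {
		/* Find the end of the current component. */
		const char *end = strchr(start, '/');
		size_t len = end ? (size_t)(end - start) : strlen(start);

		if (len >= suffix_len &&
		    !memcmp(start + len - suffix_len, suffix, suffix_len))
			return 1;
		if (!end)
			return 0;
		start = end + 1;
	}
}
```

Under this sketch both "foo/bar.lock" (already forbidden) and
"foo.lock/bar" (newly forbidden) are rejected, while "foo/bar" is
accepted.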

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/