On 04/29/2012 01:58 PM, Jeff King wrote:
On Sun, Apr 29, 2012 at 08:18:08AM +0200, mhagger@xxxxxxxxxxxx wrote:
I will work on providing more infrastructure for checking refnames at
varying levels of strictness, but I don't know enough about the code
paths to be able to find the places where the strictness levels need
tweaking.
For this to work, the various callers of check_refname_format() will
have to be able to influence the level of strictness that they want to
enforce. This patch is one trivial step in that direction.
It seems like the create_ref_entry code paths should _always_ just be
issuing warnings, as they are about reading existing refs, no? The die()
side should only come when we are writing refs.
There are not two but three classes of refnames:
1. Fully legal refnames (according to the rules of check-ref-format).
2. Refnames that might cause parsing trouble in some circumstances but
could otherwise be tolerated (with a warning) in the hope that the user
can delete them before they cause further confusion. These include all
illegal refnames that aren't covered by (3).
3. Refnames that are so pathological that we don't want to let them into
our code at all, under any circumstances. This category, I think,
includes refnames that include "/../" (because they could cause our code
to walk up the filesystem) or LF (because if written to a packed-refs
file they would make it unreadable) and perhaps "//" (because they would
confuse the loose reference code and the hierarchical reference cache
code). And depending on how much you trust shell scripts to do quoting
correctly, other patterns might also be risky.
I would like to be able to distinguish between (2) and (3) before
deciding what to do in any specific cases. References in category (2)
can probably be accepted (with a warning) in old data but we should not
allow the user to create new ones. References in category (3) are more
critical; I see three options for dealing with them: die(), emit a
warning then drop them on the floor, or emit a warning but handle them
somehow (for example, by URL-encoding them).
Treating category (3) the same as category (2) was more or less the
status quo before the changes, but I think it is dangerous, especially
when dealing with references that might come from remote sources (do you
disagree?).
Regarding create_ref_entry(), there are three callers:
* read_packed_refs(): Only used to read references from local
packed-refs files. Seems uncritical in terms of security, so for now
I suppose we could change this caller to emit a warning but use the
reference anyway. (It would not be a good idea to emit a warning but
*not* use the reference, because the next time the packed-refs file
is rewritten the reference would be lost.)
* get_ref_dir(): Only used to read loose references from the local
filesystem. Seems uncritical in terms of security, so for now
I suppose we could change this caller to emit a warning but use the
reference anyway. Alternately, we could emit a warning but not use
the reference; this would not result in any data loss because nobody
would have a reason to remove the loose reference file.
* add_packed_ref(): Used by write_remote_refs() to insert references
from a cloned repository into the local packed-refs file. Here I
think we have to be paranoid about accepting refnames in category
(3). For example, it might be reasonable to emit a warning
mentioning the illegal reference name *and the SHA1 that it referred
to* then to drop it on the floor.
Michael
--
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html