On 01/01/2017 06:59 AM, Jeff King wrote: > On Sat, Dec 31, 2016 at 06:30:01PM -0800, Junio C Hamano wrote: > >> Michael Haggerty <mhagger@xxxxxxxxxxxx> writes: >> >>> "refname" has already been checked by check_refname_format(), so it >>> cannot have consecutive slashes. >> >> In the endgame state, this has two callers. Both use what came in >> the transaction->updates[] array. Presumably "has already been >> checked by check_refname_format()" says that whoever created entries >> in that array must have called the function, but it would be helpful >> to be more explicit here. > > Hmm, yeah. This is called when we are deleting a ref, and I thought we > explicitly _didn't_ do check_refname_format() when deleting, so that > funny-named refs could be deleted. It's only is_refname_safe() that we > must pass. > > So I have no idea if that's a problem in the code or not, but it is at > least not immediately obvious who is responsible for calling > check_refname_format() here. My assumption was that only valid reference names should ever be allowed to be inserted into a `ref_transaction` entry. But Peff is right that sometimes the reference name is checked by `refname_is_safe()` rather than `check_refname_format()`. Let's audit this more carefully... * `ref_transaction_add_update()` relies on its caller doing the check (this fact is documented). Its callers are: * `ref_transaction_update()` (the usual codepath), which checks the reference itself using either check_refname_format() or refname_is_safe() depending on what kind of update it is. * `split_head_update()` passes the literal string "HEAD". * `split_symref_update()` picks apart reference updates that go through existing symbolic references. As such I don't think they are an attack surface. It doesn't do any checking itself (here we're talking about its `referent` argument). It has only one caller: * `lock_ref_for_update()`, which gets `referent` from: * `files_read_raw_ref()`, which gets the value either: * by reading a filesystem-level symlink's contents and checking it with `check_refname_format()`, or * reading a symref from the filesystem. In this case, *the value is not checked*. Obviously this chain of custody is disconcertingly long and complicated. And the gap for symrefs should probably be fixed, even though they are hopefully trustworthy. `refname_is_safe()` checks that its argument is either UPPER_CASE or looks like a plausible filename under "refs/". Contrary to its docstring, it *does not* accept strings that contain successive slashes or "." or ".." components. It was made stricter in e40f355 "refname_is_safe(): insist that the refname already be normalized", 2016-04-27 , but the docstring wasn't updated at that time. I'll fix it. I think the best thing to do is to drop this patch from the patch series, and address these checks in a separate series. (There are more known problems in this area, for example that the checks in `check_refname_format()` are not a strict superset of the checks in `refname_is_safe()`.) Michael