Re: URLs with backslash character using Firefox breaks Digest-Auth

Jan Sievers <sievers@xxxxxxxxxxxxxxxxxx> · Thu, 06 Oct 2011 19:35:26 +0200

Hi Amos,

On 2011-10-03 06:50, Amos Jeffries wrote:
> On 27/09/11 06:19, Jan Sievers wrote:
>> I noticed that trying to access a resource using URLs with backslash
>> character ('\') using Firefox 6.0.2 breaks the HTTP-Digest
>> authentication in Squid 3.1.15.
>>
>> Accessing the same resource using explicitly URL-escapes in the address
>> bar (%5C) works.
>>
>> Accessing the same resource with Opera works, since it escapes every
>> ('\') character itself.
>>
>> I guess it's more a Firefox bug, if it's not a content provider bug :-)
>> ignoring RFCs. But I am wondering if I can do something about it in
>> Squid?
>>
>> Or if Squid could respect different URL handling of different clients
>> and build the digest hash the same way the browser does, meaning not to
>> manipulate the uri provided in the Proxy-Authorization header again;
>> here not to remove the backslash character?
> 
> This sounds a lot like http://bugs.squid-cache.org/show_bug.cgi?id=3077

Well, actually it's not exactly the same bug.

In #3077 it was reported that Squid consumed more of an HTTP header than
it should, because a quoted-string ended (legally) with a backslash
character ('\'), like

	uri = "/?\"

This cannot really be parsed unambiguously if you also want to allow
quoted-pairs (escaped characters), like

	realm = "\"myserver\""

But here the backslash character is inside the quoted-string and not at
its end, like

	uri = "/Default.aspx?path=foo\bar"

With Squid 3.1.15 and "debug_options 29,9" this will result in

	authDigestDecodeAuth: Found uri '/Default.aspx?path=foobar'

And therefore Squid calculates a different digest hash than the client.
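
To illustrate the effect, here is a minimal standalone sketch. It is not
Squid code; the names unescapeStrict and demoStrict are made up for this
mail. It decodes a quoted-string body the strict way, where a backslash
always escapes the next character and is itself dropped:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not taken from Squid: strict quoted-pair decoding.
// Every backslash is dropped and the character after it is kept verbatim.
// A lone backslash at the very end has nothing to escape and is kept.
std::string unescapeStrict(const std::string &in)
{
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 1 < in.size())
            ++i;            // skip the backslash itself
        out += in[i];       // copy the (possibly escaped) character
    }
    return out;
}

// Small self-check, meant to be called from a test main().
void demoStrict()
{
    // The uri value from the header above loses its backslash.
    assert(unescapeStrict("/Default.aspx?path=foo\\bar") ==
           "/Default.aspx?path=foobar");
}
```

The client hashed the string with the backslash, the proxy hashes the
string without it, so the two digests cannot match.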

The mentioned commit #r10998 does not fix it for me.

I did not manage to build Squid 3.2.0.12 here, but I don't see why it
would behave differently, since HttpHeaderTools.cc:370 in function
httpHeaderParseQuotedString says

        bool quoted = (*pos == '\\');
        if (quoted) {
            pos++;

and "pos" is never decremented afterwards, so every backslash character
is skipped and never copied into the resulting string.

Or am I missing something?

Maybe something like this would do the job.
In line 380, instead of

	while (end < (start+len) && *end != '\\' && *end != '\"' &&
	       *end > 0x1F && *end != 0x7F)
		end++;

use

        while (end < (start+len) && *end > 0x1F && *end != 0x7F) {
            if (quoted) {
                quoted = false;
                if (*end != '\\' && *end != '\"')
                    pos--;
            } else {
                if (*end == '\\' || *end == '\"')
                    break;
            }
            end++;
        }
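
The intended behaviour can also be shown outside of Squid. The following
is only a sketch under my own assumptions, not a patch: the names
unescapeLenient and demoLenient are hypothetical, and it works on the
already extracted content between the double quotes. A backslash is
dropped only when it precedes another backslash or a double quote;
otherwise it is kept as a literal character.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not taken from Squid: lenient quoted-pair decoding.
// "\\" becomes "\" and "\"" becomes """, any other backslash stays as is.
std::string unescapeLenient(const std::string &in)
{
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        const bool escapes = in[i] == '\\' && i + 1 < in.size() &&
                             (in[i + 1] == '\\' || in[i + 1] == '"');
        if (escapes)
            ++i;            // drop the backslash, keep the escaped character
        out += in[i];
    }
    return out;
}

// Small self-check, meant to be called from a test main().
void demoLenient()
{
    // The backslash in the uri survives ...
    assert(unescapeLenient("/Default.aspx?path=foo\\bar") ==
           "/Default.aspx?path=foo\\bar");
    // ... while escaped quotes in a realm are still decoded.
    assert(unescapeLenient("\\\"myserver\\\"") == "\"myserver\"");
}
```

This does not resolve the ambiguity from #3077: a value that really ends
with a backslash directly before the closing quote still looks like an
escaped quote to whatever code searches for the end of the string.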

Besides that, I am still unsure whether I can blame the content provider,
and probably also the client, for using unescaped backslash characters in
URIs (links, in the case of the content provider) and in the HTTP Digest
authentication header. Both should respect RFC 2396 "URI Generic
Syntax", which in my view does not allow such characters (e.g. in the
query part):

	query         = *uric
	uric          = reserved | unreserved | escaped
	reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                        "$" | ","
	unreserved    = alphanum | mark
	mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
                        "(" | ")"
	escaped       = "%" hex hex
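
Read literally, that grammar can be turned into a small check. This is
just a sketch of my reading of the rules quoted above; the names
isQueryValid and demoUric are made up, and it only covers the query
production:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns true if q matches "*uric" as quoted above,
// i.e. every character is reserved, unreserved, or part of "%" hex hex.
bool isQueryValid(const std::string &q)
{
    static const std::string allowed =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
        ";/?:@&=+$,"    // reserved
        "-_.!~*'()";    // mark
    for (std::string::size_type i = 0; i < q.size(); ++i) {
        if (q[i] == '%') {
            // escaped = "%" hex hex
            if (i + 2 >= q.size() ||
                !std::isxdigit(static_cast<unsigned char>(q[i + 1])) ||
                !std::isxdigit(static_cast<unsigned char>(q[i + 2])))
                return false;
            i += 2;
        } else if (allowed.find(q[i]) == std::string::npos) {
            return false;   // e.g. a raw backslash
        }
    }
    return true;
}

// Small self-check, meant to be called from a test main().
void demoUric()
{
    assert(!isQueryValid("path=foo\\bar"));   // raw backslash: rejected
    assert(isQueryValid("path=foo%5Cbar"));   // escaped form: accepted
}
```

So under this reading the escaped form %5C is fine, while the raw
backslash is not part of the allowed character set.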

Nevertheless, it should be fixed in the "httpHeaderParseQuotedString"
function, since the (unescaped) backslash character is unfortunately
allowed in HTTP headers in general, see RFC 2616 "HTTP/1.1".

Comments appreciated.

Jan

-- 
Jan Sievers                              |
Freie Universität Berlin                 | sievers@xxxxxxxxxxxxxxxxxx
Zentraleinrichtung für Datenverarbeitung | http://www.zedat.fu-berlin.de