Re: mod_rewrite: PATH_INFO gets injected with each Rule

---------- Forwarded message ----------
From: Rich Bowen <rbowen@xxxxxxxxxxx>
To: users@xxxxxxxxxxxxxxxx
Date: Tue, 22 Apr 2008 10:02:04 -0400
Subject: Re: mod_rewrite: PATH_INFO gets injected with each Rule

 
On Apr 21, 2008, at 08:54, Aleksander Budzynowski wrote:


Hi,
 
The behaviour I'm seeing resembles the bug described here: http://archive.apache.org/gnats/7879 Reportedly it was fixed in 2.0.30. However, testing under both 2.2.3 and 2.0.61, I get the same sort of problem.
 
Essentially, PATH_INFO is appended to the end of the URI before each RewriteRule is processed. If more than one RewriteRule matches, you can end up with redundant garbage at the end of the URI.
 
Let's consider a rule designed to turn all underscores into hyphens (done in a per-directory context, i.e. .htaccess file):
 
RewriteEngine On
#Convert _ to - (N flag ensures that all underscores get converted)
RewriteRule ^(.*)_(.*) $1-$2 [N]
 
It seems innocent enough. But issue a request for
 
/_f_o_o_/bar
 
(where _f_o_o_ does not exist, placing '/bar' in PATH_INFO), and this gets rewritten to /-f-o-o-/bar/bar/bar/bar!
 
If you request /foo/_bar (assuming foo does not exist), then each new _bar will feed an extra underscore back into the mix, creating an infinite loop - even worse.
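
To make the two cases above concrete, here is a toy model of the reported behaviour. It is not mod_rewrite's code: the names toy_apply_rule and toy_rewrite_buggy, the buffer size, and the max_rounds cap are all made up for illustration. The model assumes only what the report describes, namely that PATH_INFO is appended before every pass and that [N] restarts processing whenever the rule matched.

```c
#include <stdio.h>
#include <string.h>

#define TOY_BUF 512

/* Mimic "RewriteRule ^(.*)_(.*) $1-$2": the greedy first group means
 * the LAST underscore is the one replaced. Returns 1 on a match. */
static int toy_apply_rule(char *uri)
{
    char *last = strrchr(uri, '_');
    if (last == NULL)
        return 0;
    *last = '-';
    return 1;
}

/* Toy model of the reported behaviour (hypothetical, not Apache code):
 * path_info is appended to the current filename before every pass, and
 * a match restarts the pass, as [N] would.
 * Writes the final filename to out. Returns the number of passes in
 * which the rule matched, or -1 if max_rounds matches were reached
 * without the rule ever failing to match (treated as a loop). */
static int toy_rewrite_buggy(const char *filename, const char *path_info,
                             int max_rounds, char *out, size_t outlen)
{
    char current[TOY_BUF];
    char candidate[2 * TOY_BUF];
    int matches = 0;

    snprintf(current, sizeof current, "%s", filename);
    while (matches < max_rounds) {
        /* the "add path info postfix" step, repeated on every pass */
        snprintf(candidate, sizeof candidate, "%s%s", current, path_info);
        if (!toy_apply_rule(candidate)) {
            /* no match: the filename from the previous pass stands */
            snprintf(out, outlen, "%s", current);
            return matches;
        }
        snprintf(current, sizeof current, "%s", candidate);
        matches++;
    }
    snprintf(out, outlen, "%s", current);
    return -1;
}
```

In this model, calling toy_rewrite_buggy("_f_o_o_", "/bar", 50, out, sizeof out) stops after four matching passes with "-f-o-o-/bar/bar/bar/bar" in out, while toy_rewrite_buggy("foo", "/_bar", 50, out, sizeof out) never stops on its own and returns -1 once the cap is reached.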
 
 
In the RewriteLog, one sees something like this before the application of each RewriteRule:
 
add path-info postfix: /rewritebase/_f_o_o_ -> /rewritebase/_f_o_o_/bar
 
although each time it accumulates an extra '/bar'.
 
 
This doesn't look right to me. Is it a bug? Or have I missed something obvious?
 

This does look pretty nasty. Can you try 1) testing with the latest versions, and 2) posting your RewriteLog so that we can see what process it's going through to do this? Given that that's an example from the documentation, one kind of hopes that it'll work correctly.




Also, I'm trying this out myself. Is it only on PATH_INFO, or is it also on existing file names?


--Rich

It's only PATH_INFO, and only within .htaccess. Looking at the 2.2.8 source (mod_rewrite.c:3694), this seems to be the culprit:

        if (r->path_info && *r->path_info) {
            rewritelog((r, 3, ctx->perdir, "add path info postfix: %s -> %s%s",
                        ctx->uri, ctx->uri, r->path_info));
            ctx->uri = apr_pstrcat(r->pool, ctx->uri, r->path_info, NULL);
        }
 
It looks like nowhere in the rewriting process is r->path_info modified, meaning that this happens for EVERY RewriteRule. And this becomes a problem if more than one RewriteRule matches.
 
Back at line 3680, we have this:
    ctx->uri = r->filename;
 
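Before any of the RewriteRules match, this will be the URI minus PATH_INFO. But once a rule matches, the path is changed: r->filename now holds the rewritten result, which already contains the PATH_INFO that was appended for that rule. PATH_INFO basically becomes invalid!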

Is PATH_INFO recalculated after a URI is run through mod_rewrite? (If so then it would make perfect sense to empty r->path_info whenever a RewriteRule matches.) If not, should it be? Maybe only in conjunction with the [PT] flag?
 
If we can't, for whatever reason, disturb path_info, then we could add a "matched" member to rewrite_ctx, to indicate that a substitution has already been made, and not append PATH_INFO if this has occurred.
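
A minimal sketch of that "matched" idea, again as a toy model rather than a patch. struct toy_ctx is a hypothetical stand-in for rewrite_ctx, and toy_prepare_uri, toy_apply_rule and toy_rewrite_guarded are invented names. The sketch assumes the flag survives an [N] restart; whether the real context lives that long has not been checked here.

```c
#include <stdio.h>
#include <string.h>

#define TOY_BUF 512

/* Hypothetical stand-in for rewrite_ctx; "matched" is the proposed member. */
struct toy_ctx {
    char uri[2 * TOY_BUF];
    int matched;            /* set once any rule has substituted the URI */
};

/* Mimic "RewriteRule ^(.*)_(.*) $1-$2": replace the last underscore.
 * Returns 1 on a match, 0 otherwise. */
static int toy_apply_rule(char *uri)
{
    char *last = strrchr(uri, '_');
    if (last == NULL)
        return 0;
    *last = '-';
    return 1;
}

/* Build the URI for one pass. path_info is appended only while no
 * substitution has happened yet; afterwards filename already carries it. */
static void toy_prepare_uri(struct toy_ctx *ctx, const char *filename,
                            const char *path_info)
{
    if (!ctx->matched && path_info != NULL && *path_info != '\0')
        snprintf(ctx->uri, sizeof ctx->uri, "%s%s", filename, path_info);
    else
        snprintf(ctx->uri, sizeof ctx->uri, "%s", filename);
}

/* Same loop as the reported behaviour, but guarded by ctx.matched.
 * Writes the final filename to out. Returns the number of matching
 * passes, or -1 if max_rounds matches were reached. */
static int toy_rewrite_guarded(const char *filename, const char *path_info,
                               int max_rounds, char *out, size_t outlen)
{
    struct toy_ctx ctx = { "", 0 };
    char current[TOY_BUF];
    int matches = 0;

    snprintf(current, sizeof current, "%s", filename);
    while (matches < max_rounds) {
        toy_prepare_uri(&ctx, current, path_info);
        if (!toy_apply_rule(ctx.uri)) {
            snprintf(out, outlen, "%s", current);
            return matches;
        }
        snprintf(current, sizeof current, "%s", ctx.uri);
        ctx.matched = 1;    /* from now on, do not re-append path_info */
        matches++;
    }
    snprintf(out, outlen, "%s", current);
    return -1;
}
```

Under these assumptions, "_f_o_o_" with PATH_INFO "/bar" ends up as "-f-o-o-/bar" after four matching passes, and "foo" with PATH_INFO "/_bar" terminates after a single match with "foo/-bar" instead of looping.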
 
I have a feeling that this is a bug which went unnoticed because people simply blamed it on the quirks of mod_rewrite.
 
-Aleks
 
