On Sat, 8 Dec 2007 15:40:09 +0100
Torsten Foertsch <torsten.foertsch@xxxxxxx> wrote:

> On Sat 08 Dec 2007, Christian Lerrahn wrote:
> > > RewriteEngine On
> > > RewriteRule (.*)//+(.*) $1$2 [R=permanent,L]
> >
> > Thanks for that. I'm sorry to still bother. I'd like to get rid of
> > paths like //foo/bar, too, which do not match with this rule. To be
> > honest I don't quite understand the rule. That's probably the reason
> > why I can't modify it correctly to match //foo/bar as well. When
> > I saw the regexp, I thought that I would end up without any slashes
> > but obviously I'm not. Wouldn't matching /foo//bar/ match as
> > $1=/foo and $2=bar/ ? Why does it not match like that? Then also it
> > seems to me that (.*) should also match an empty string which would
> > mean that leading slashes would get stripped, too. Why does that
> > not happen?
>
> You need to know that * in regexes is greedy. That means it eats up
> as many characters as it can while still matching the regexp. So in
> /foo///bar $1 gets /foo/ and not only /foo.
>
> What you need for $1 is a nongreedy one (*? instead of *), something
> like this:
>
> RewriteRule (.*?)//+(.*) $1/$2 ...
>
> You can try this in a little Perl one-liner:
>
> perl -ne 'BEGIN {$|=1; print "> "} if(m!(.*?)//+(.*)!) {print "$1\t$2\n"} else {print "no match\n"} print "> "'
>
> It offers you a "> " prompt to enter a string that is matched against
> that regexp. Then $1 and $2 are printed delimited by a tab-character.
>
> You'll see that the new regexp matches even at the beginning of the
> line:
>
> > /foo/bar
> no match
> > /foo//bar
> /foo    bar
> > /foo///bar
> /foo    bar
> > ///foo///bar
>         foo///bar
> > //foo//bar
>         foo//bar

I realised that the matching was greedy and assumed that the question
mark would serve the same purpose as in Perl. However, ///foo/bar
should still match even if the pattern is greedy. After all, there is
no match for // between foo and bar. However, it does not match on //
at the beginning.
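
For readers who want to check the greedy versus non-greedy splits discussed above without Perl, here is a minimal Python sketch. It only exercises the two regular expressions from the thread in Python's `re` engine. The helper name `split` is made up for illustration, and the sketch says nothing about what string mod_rewrite actually hands to the pattern.

```python
import re

# The two patterns from the thread: greedy and non-greedy first group.
GREEDY = re.compile(r"(.*)//+(.*)")
NONGREEDY = re.compile(r"(.*?)//+(.*)")


def split(pattern, path):
    """Return the ($1, $2) pair for the first match in path, or None.

    Hypothetical helper, not part of Apache or Perl; it mirrors what
    the Perl one-liner in the thread prints.
    """
    m = pattern.search(path)
    return m.groups() if m else None


if __name__ == "__main__":
    samples = ["/foo/bar", "/foo//bar", "/foo///bar", "///foo/bar", "//foo//bar"]
    for path in samples:
        print(path, "greedy:", split(GREEDY, path),
              "non-greedy:", split(NONGREEDY, path))
```

In a plain regex engine the greedy pattern does match ///foo/bar (with $1 being a single slash), so greediness alone would not account for a missing match at the start of the path.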

I actually was wrong in my last post. The rule

RewriteRule (.*/)/+(.*) $1$2 [R=permanent,L]

fixes almost all of my problems. The only problem that remains is that
the pattern doesn't match at the beginning of the path. The weird thing
is that a path like //foo//bar gets converted to /foo/bar in 2
redirects: first a match on the leading // (//foo//bar -> /foo//bar)
and then a match on the later occurrence of // (i.e. /foo//bar ->
/foo/bar). No, this does not make any sense to me. :(

> The last 2 of the examples above reveal another problem with the
> approach. The RewriteRule matches only the first occurrence and then
> sends a redirect to the browser. If your URL contains multiple
> occurrences of consecutive slashes you may hit the browser's redirect
> limit.
>
> To overcome that you can try to loop in mod_rewrite (untested):
>
> RewriteRule (.*?)//+(.*) $1/$2 [E=R:$1/$2,N]
>
> RewriteCond %{ENV:R} .
> RewriteRule . %{ENV:R} [R=permanent,L]

This doesn't matter too much to me. URLs that have more than one place
with too many slashes are rather rare. Therefore I'm ok with that
resulting in more than one redirect.

Cheers,
Christian
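
The redirect counting discussed above can be sketched in Python. This is only a model of the non-greedy regex applied once per request, not of mod_rewrite itself. The function names and the `limit` parameter are assumptions made for illustration, and `collapse_all` shows the end result a looping rule would be aiming for.

```python
import re

# Non-greedy rule from the thread: matches the first run of 2+ slashes.
RULE = re.compile(r"(.*?)//+(.*)")


def one_redirect(path):
    """Apply the rule once, as a single redirect would; None if no match."""
    m = RULE.search(path)
    return f"{m.group(1)}/{m.group(2)}" if m else None


def redirect_chain(path, limit=20):
    """List every URL visited until the rule stops matching.

    `limit` is a hypothetical stand-in for a browser's redirect cap.
    """
    chain = [path]
    while len(chain) <= limit:
        nxt = one_redirect(chain[-1])
        if nxt is None:
            break
        chain.append(nxt)
    return chain


def collapse_all(path):
    """Collapse every run of slashes in one pass (the looped result)."""
    return re.sub(r"//+", "/", path)


if __name__ == "__main__":
    chain = redirect_chain("//foo//bar")
    print(" -> ".join(chain), f"({len(chain) - 1} redirects)")
    print(collapse_all("//foo//bar"))
```

Each run of extra slashes costs one redirect in this model, which fits the observation that URLs with a single bad spot need only one.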