Re: SQUID store_url_rewrite

Ghassan Gharabli <sounarose@xxxxxxxxxxxxxx> · Tue, 31 May 2011 20:47:13 +0300

I'm sorry again about the last email, but I also have something to ask.

(m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/)

Now I'm talking about this element, ([\w\d]{2,4}), which seems to match
a 2- to 4-character extension (.ex, .ext or .exte), for example .mp3.

I understand that \w matches an alphanumeric character, including "_",
the same as [A-Za-z0-9_] in ASCII.

So I know it matches digits and letters, plus the underscore, which is
correct here. The thing that confuses me is that we also used \d, which
matches a digit, the same as [0-9] in ASCII. So we have covered 0-9
twice! Any comment about that?
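
A quick way to check whether the extra \d changes anything is to compare
the two classes character by character. This is only a sketch; it looks
at ASCII only, and the variable names are made up for the test.

```perl
use strict;
use warnings;

# Collect every ASCII code point on which the two classes disagree.
my @differ = grep {
    my $c = chr($_);
    my $with_d    = ($c =~ /\A[\w\d]\z/) ? 1 : 0;
    my $without_d = ($c =~ /\A\w\z/)     ? 1 : 0;
    $with_d != $without_d;
} 0 .. 127;

print "ASCII characters where the classes disagree: ", scalar(@differ), "\n";
```

Since \w already includes the digits, the list should come out empty,
which would mean [\w\d]{2,4} behaves the same as \w{2,4} here.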

I'm also seeing these URLs again:

#generic http://variable.domain.com/path/filename."ex", "ext" or "exte"
#http://cdn1-28.projectplaylist.com
#http://s1sdlod041.bcst.cdn.s1s.yimg.com

^ matches the beginning of a line or string.
In m/^http:\/\/ ... we used (.*?) right after the scheme, which seems to
match anything!

Let's look at this URL: #http://s1sdlod041.bcst.cdn.s1s.yimg.com

If I'm correct, (.*?) matches "s1sdlod041", and then with the second
element, (\.[^\.\-]*?\..*?), we move on to the "." after "s1sdlod041",
so now we have "http://s1sdlod041.". But what I want to know is how
"[^\.\-]*?\..*?" works. Why did we use [] with ^ in front of \. and \-,
when we are also finding dashes or dots? After that we used "*"
(anything!) and then a question mark "?". Something else that confuses
me is "\.." or "\..*?".
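
One way to see how the first two groups share the hostname is to run
the pattern and print $1 and $2. In this sketch the first two hosts are
the ones from the script comments; the third host and the
"/path/file.mp3" suffix are made up, because the pattern needs a path
and an extension before it can match at all.

```perl
use strict;
use warnings;

# The pattern exactly as it appears in the script.
my $re = qr/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/;

# Hosts 1 and 2 are from the script comments; host 3 and the path are made up.
my @urls = (
    'http://cdn1-28.projectplaylist.com/path/file.mp3',
    'http://s1sdlod041.bcst.cdn.s1s.yimg.com/path/file.mp3',
    'http://a.b-c.example.com/path/file.mp3',
);

my %split;    # url => [ capture 1, capture 2 ]
for my $url (@urls) {
    next unless $url =~ $re;
    $split{$url} = [ $1, $2 ];
    print "$url\n";
    print "  capture 1: $1\n";
    print "  capture 2: $2\n";
}
```

Inside [], a leading ^ negates the class, so [^\.\-] means "any
character except a dot or a dash". In the third URL the label "b-c"
contains a dash, so capture 2 cannot start there and capture 1 grows to
"a.b-c" instead.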

Another question is about ([^\?\&\=]*). I think this one is for
folders, or what?

I saw the slash \/ before it, so it seems to catch
/?url=blah&C=blah2, with the "*" matching "blah" and "blah2".
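
To test that guess, the class can be tried on its own against the part
of a URL that follows the host. The string below is made up for the
test; only the two patterns are taken from the script.

```perl
use strict;
use warnings;

# Made-up remainder of a URL: everything after the slash that ends the host.
my $rest = 'dir/sub/file.mp3?url=blah&C=blah2';

# The class on its own: as many characters as possible that are not ?, & or =.
my ($run) = $rest =~ /^([^\?\&\=]*)/;
print "class alone: $run\n";

# In the full pattern the class is followed by \.([\w\d]{2,4}), so the
# engine backs off to the last dot and the extension gets its own capture.
my ($path, $ext) = $rest =~ /^([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/;
print "path:      $path\n";
print "extension: $ext\n";
```

So the group appears to hold the folders and the file name, and it stops
before the query string rather than catching "?url=blah&C=blah2"; the
query is left to the trailing \??.* part.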

But if you don't mind, please explain or illustrate
(\.[^\.\-]*?\..*?) a bit more, in your own way, as I'm sure you are a
good teacher, hehehe.

Please explain the whole match to me:
(m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/)
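
One way to read the whole match is to spread it out with the /x flag,
which lets a pattern carry whitespace and comments. The sketch below is
the same pattern piece by piece; the comments are my reading of each
piece, and the path and query in the test URL are made up (only the host
comes from the script comments).

```perl
use strict;
use warnings;

my $re = qr{
    ^http:\/\/          # the literal scheme, anchored at the start
    (.*?)               # capture 1: as little as possible before capture 2 can begin
    (\.[^\.\-]*?\..*?)  # capture 2: dot, a label with no dot or dash, dot, rest of the host
    \/                  # the slash that ends the host
    ([^\?\&\=]*)        # capture 3: the path, stopping before any ?, & or = character
    \.                  # the dot before the extension
    ([\w\d]{2,4})       # capture 4: an extension of 2 to 4 word characters
    \??.*               # an optional ? and anything after it
    $                   # end of the string
}x;

# Host from the script comments; path and query are made up for the test.
my $url   = 'http://s1sdlod041.bcst.cdn.s1s.yimg.com/path/file.mp3?x=1';
my @parts = $url =~ $re;

printf "capture %d: %s\n", $_ + 1, $parts[$_] for 0 .. $#parts;
```

For this URL the four captures should be "s1sdlod041",
".bcst.cdn.s1s.yimg.com", "path/file" and "mp3".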

I was eager to ask you all these questions from the start, but I was
afraid you would not help anyway.

What I have been working on so far is the FileHippo domain:

http://fs34.filehippo.com/6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe

In this case we have to change the URL into
"cdn.filehippo.com/6574/Opera_1150b1_int_Setup.exe", because we removed
the hashed folder!

It's okay, I have the script for it:

	#cdn, variable 1st path
} elsif (($u =~ /filehippo/) &&
	(m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
	@y = ($1,$2,$4,$5);
	$y[0] =~ s/[a-z0-9]{2,5}/cdn./;
	print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

It is working 100%, and I can get the file from cache too. What if I
want to add wlxrs.com, as in ($u =~ /filehippo|wlxrs/)?

Does that match this URL?
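
To see which folder the FileHippo branch above really drops, it can be
run on its own. This is only a sketch: the sub names are made up, the
$x prefix from the script is left out, and the second sub is a
hypothetical variant, not something from the original script.

```perl
use strict;
use warnings;

# Standalone copy of the elsif branch: same regex, same substitution.
sub store_url_as_posted {
    my ($url) = @_;
    return unless $url =~ m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/;
    my @y = ($1, $2, $4, $5);           # note: $3, the first folder, is skipped
    $y[0] =~ s/[a-z0-9]{2,5}/cdn./;     # "fs34" becomes "cdn."
    return "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3];
}

# Hypothetical variant: keep the first folder and skip the second one.
sub store_url_keep_first_folder {
    my ($url) = @_;
    return unless $url =~ m/^http:\/\/[a-z0-9]+\.(.*?)\/(.*?)\/[^\/]+\/(.*)\.([a-z0-9]{3,4})(\?.*)?/;
    return "http://cdn.$1/$2/$3.$4";
}

my $url = 'http://fs34.filehippo.com/6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe';
print store_url_as_posted($url), "\n";
print store_url_keep_first_folder($url), "\n";
```

By my reading, the branch as posted skips $3 ("6574") and keeps the
hashed folder inside $4, so it would store
"cdn.filehippo.com/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe".
That can still give cache hits as long as the hash stays the same for a
given file, but it is not the "6574/Opera_1150b1_int_Setup.exe" form
described above; the variant is one guess at how to get that form.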
http://css.wlxrs.com/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg
I don't think so, as it has "!" in it. Where should I add "!" so that
it matches a folder like
"/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/"
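
Whether the "!" is a problem can be checked directly by running the
FileHippo pattern against the WLXRS URL. A rough sketch, with made-up
variable names; the URL and both patterns are the ones quoted above.

```perl
use strict;
use warnings;

my $url = 'http://css.wlxrs.com/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg';

# The domain test with wlxrs added, then the unchanged FileHippo pattern.
my $domain_ok = ($url =~ /filehippo|wlxrs/) ? 1 : 0;
my $matched   = ($url =~ m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/) ? 1 : 0;
my ($first_folder, $rest, $ext) = ($3, $4, $5);

print "domain test: $domain_ok, pattern: $matched\n";
print "first folder: $first_folder\n";
print "kept path:    $rest.$ext\n";
```

Since "." in a pattern matches any character, "!" included, the (.*?)
group should take the whole first folder without any change, and the
branch would then store
"cdn.wlxrs.com/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg".
This only covers the case where the variable folder comes first.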

Sometimes the CDN folder comes as the 1st folder, or the 2nd or 3rd;
it depends on the website.

Can you point me to where I should edit this script so that it handles
WLXRS.COM?

By the way, you really helped a lot with those complicated examples,
which means I can start matching any known cases from now on.

Thank you a lot