Re: Parsing HTML href-Attribute

Edmund Hertle wrote:
>> * http://www.google.com/search?q=php ... absolute path (yes, it's a URL,
>> but treat it as absolute)
>> * https://www.example.com/index.php ... absolute path (yes, it's a URL,
>> but to the local server)
>> * /index.php ... absolute path (no protocol given, true absolute path)
>> * index.php ... relative path (relative to current directory on current
>> server)
>> * somefolder/index.php ... relative path (same reason)
>>
>> That is indeed a nifty use of look-ahead, though. That will work for any
>> anchor tag that doesn't reference the server (or any other server) with a
>> protocol spec preceding it. However, if you want to run it through an entire
>> list of anchor tags with any spec (http://, https://, udp://, ftp://,
>> aim://, rss://, etc.)--or lack of spec--and only mess with those that don't
>> have a spec and don't use absolute paths, it needs to get a bit more
>> complex. You've convinced me, however, that it can be done entirely with one
>> regex pattern.
>>
>> // Todd
> 
> 
> Hey!
> Wow, I think that was exactly what I was looking for... thanks to all of
> you. I haven't tested it yet (will do that tomorrow), but it sounds very nice.
> 
> But Todd just confused me quite a bit with the statement: Is /index.php a
> case where the RegEx will fail?
> 
> To add some background: this is about dynamically creating PDF files from
> HTML source code, and the links should also work in the PDF file. So
> protocols other than http:// shouldn't be a problem.
> 
> -eddy
> 
That regex should work on all hrefs. Both index.php and /index.php will be
replaced with http://www.example.com/index.php, and both somedir/index.php
and /somedir/index.php will be replaced with
http://www.example.com/somedir/index.php.  Any URL starting with http://
or https:// will be left unchanged.
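
To make the behaviour above concrete, here is a minimal sketch of a pattern
of that kind. It is not the regex posted earlier in the thread; the function
name absolutize_hrefs() and the exact pattern are assumptions made for
illustration. It assumes double-quoted href values and a base URL without a
trailing slash, and it only treats http:// and https:// as "already
absolute", so other schemes (mailto:, ftp://) and protocol-relative URLs
(//host/path) would need extra handling.

```php
<?php
// Hypothetical helper for illustration only, not the regex from this thread.
// $base is assumed to have no trailing slash, e.g. "http://www.example.com".
function absolutize_hrefs(string $html, string $base): string
{
    // href=" followed by a value that does NOT start with http:// or
    // https:// (negative look-ahead), then an optional leading slash,
    // then the rest of the path up to the closing quote.
    // [^"\s]* stops at whitespace, so values containing spaces never match.
    $pattern = '~href="(?!https?://)/?([^"\s]*)"~i';

    // Every matched path is placed directly under the base URL.
    return preg_replace($pattern, 'href="' . $base . '/$1"', $html);
}

$html = '<a href="index.php">a</a> <a href="/somedir/index.php">b</a> '
      . '<a href="http://www.google.com/search?q=php">c</a>';

echo absolutize_hrefs($html, 'http://www.example.com'), "\n";
```

Note that relative paths are resolved against the site root here, not
against the directory of the current page, which matches the replacements
listed above.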

Again, it won't work on URLs that contain spaces, like "my web
page.html".  When I get a minute I'll fix it.  I thought spaces in URLs
weren't valid markup, but a page containing them seems to validate.
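
One possible way to handle spaces, sketched under the same assumptions as
before (double-quoted href values, base URL without a trailing slash): let
the path match anything up to the closing quote, and encode the spaces in a
callback. The function name absolutize_hrefs_allow_spaces() is hypothetical,
and encoding spaces as %20 is a design choice made for this sketch, not
something settled in the thread.

```php
<?php
// Hypothetical variant for illustration only; tolerates spaces in href values.
// $base is assumed to have no trailing slash, e.g. "http://www.example.com".
function absolutize_hrefs_allow_spaces(string $html, string $base): string
{
    // [^"]* accepts anything up to the closing quote, spaces included.
    // The negative look-ahead still skips http:// and https:// URLs.
    $pattern = '~href="(?!https?://)/?([^"]*)"~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            // Encode only literal spaces; other characters are left as found.
            $path = str_replace(' ', '%20', $m[1]);
            return 'href="' . $base . '/' . $path . '"';
        },
        $html
    );
}

echo absolutize_hrefs_allow_spaces(
    '<a href="my web page.html">page</a>',
    'http://www.example.com'
), "\n";
```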

-- 
Thanks!
-Shawn
http://www.spidean.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

