RE: Regex Help for URL's [ANSWER]

On Tue, May 16, 2006 6:21 pm, Robert Cummings wrote:
> On Tue, 2006-05-16 at 18:49, Robert Samuel White wrote:
>> In case any one is looking for a solution to a similar problem as
>> me, here
> preg_match_all("#(\"|')http://(.*)(\"|')#U", $content, $matches);

And it's missing the original requirement of matching https URLs, so
maybe make it be ...https?://...

Plus, http could be IN CAPS, so change the U to iU to make the match
case-insensitive.
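
A minimal sketch of those two tweaks applied to the quoted pattern. The
sample $content is made up for illustration, and moving the scheme inside
the second group (so the captured URL includes http:// or https://) is my
own adjustment, not part of the quoted pattern:

```php
<?php
// Hypothetical sample input, not taken from the thread.
$content = '<a href="http://example.com/a">A</a> '
         . "<a href='https://example.com/b'>B</a> "
         . '<a href="HTTP://EXAMPLE.COM/C">C</a>';

// s? makes the "s" optional, so both http and https match.
// i = case-insensitive, U = quantifiers are ungreedy by default.
// Assumption: the scheme is captured as part of group 2.
preg_match_all("#(\"|')(https?://.*)(\"|')#iU", $content, $matches);

// Group 1 is the opening quote, so the URLs land in $matches[2].
print_r($matches[2]);
```

This still assumes every URL is quoted, which is the next problem.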

And, actually, SOME old-school HTML pages have neither ' nor " around
the URL, and are (or were) valid:
href=page2.html
was considered valid HTML for a long, long time.
So toss in (\"|')? to make the quotes optional. Once the closing quote
is optional, though, the ungreedy (.*) has nothing left to stop at and
will happily match nothing, so swap it for something like [^\s\"'>]+

And then you may be finding URLs that are not actually linked but are
part of the "visible" content, so maybe you only want the ones that
have
<a[^>]*href=
in front of them.
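
Putting those pieces together, here is a rough sketch and nothing more.
The sample $content, the \1 backreference (so the closing quote has to
match the opening one), and the [^\s"'>]+ character class are all my
assumptions, not something from the quoted post:

```php
<?php
// Hypothetical sample input, not taken from the thread.
$content = '<a href="http://example.com/a">quoted</a> '
         . '<a class=x HREF=https://example.com/b>unquoted</a> '
         . 'plain text http://example.com/not-linked';

// <a\b[^>]*\bhref   an <a ...> tag with an href attribute somewhere in it
// (["\']?)          optional opening quote, captured as group 1
// (https?://[^\s"\'>]+)  the URL: runs until whitespace, a quote, or >
// \1                whatever group 1 matched (a quote, or nothing)
$pattern = '#<a\b[^>]*\bhref\s*=\s*(["\']?)(https?://[^\s"\'>]+)\1#i';

preg_match_all($pattern, $content, $matches);

// Only the two linked URLs are returned; the bare one in the text is skipped.
print_r($matches[2]);
```

It will still trip over plenty of real-world HTML (URLs with spaces,
href inside comments or scripts, and so on), so treat it as a starting
point rather than a CORRECT answer.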

If I can toss off 3 problems without even trying...

So I still think Google or searching the archives (as I suggested
off-list) will be the quickest route to a CORRECT answer, but here we
are again in this same thread we've been in every month or so for the
better part of a decade...

PS the (\"|') bit is a capturing group of its own, so the opening quote
takes $matches[1] and the URL part ends up in $matches[2].
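
To see where things land, a quick sketch using the pattern as quoted
above. The sample $content is made up, and the (?:...) non-capturing
variant is my suggestion, not part of the original post:

```php
<?php
// Hypothetical sample input, not taken from the thread.
$content = '<a href="http://example.com/page">x</a>';

// The pattern as quoted: three capturing groups.
preg_match_all("#(\"|')http://(.*)(\"|')#U", $content, $matches);
// $matches[0] holds the full match, quotes included.
// $matches[1] holds the opening quote.
// $matches[2] holds what (.*) matched: the URL minus "http://".
// $matches[3] holds the closing quote.
print_r($matches);

// With non-capturing groups (?:...) the quotes no longer take a slot,
// so the (.*) part moves back to index 1.
preg_match_all("#(?:\"|')http://(.*)(?:\"|')#U", $content, $plain);
print_r($plain[1]);
```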

-- 
Like Music?
http://l-i-e.com/artists.htm

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

