Re: preg_match_all to match <img> tags

On Thu, August 9, 2007 6:45 pm, Ólafur Waage wrote:
> I know this isn't exactly a php related question but due to the
> quality of answers ive seen lately ill give this a shot. (yes yes im
> smoothing up the crowd before the question)
>
> I have a weblog system that i am creating, the trouble is that if a
> user links to an external image larger than 500pixels in width, it
> messes with the whole layout.
>
> I had found some regex code im using atm but its not good at matching
> the entire image tag. It seems to ignore properties after the src
> declaration and not match tags that have properties before the src
> declaration .
>
> preg_match_all("/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\'\ >]*)/i",
> $data, $matches);
> print_r($matches);
>
> This currently makes two arrays for me, the source location from all
> img tags and a large part of the tag itself. But not the entire tag.
>
> What i do is i match the img tag, find the src, get the image
> properties, and if the width is more than 500, i shrink it down and
> add width="X" and height="Y" properties to the image tag.
>
> How can i match an image tag correctly so it does not cause any issues
> with how the user adds the image.

Scaling the image in the browser is horrible for performance on the
client side...

The entire image still gets downloaded, and then the poor browser has
to scale this monster image down.

You may want to re-think your plan of attack...

You could, for example, force users to only use "registered" images,
and if they "register" an image larger than 500 pixels wide, use
http://php.net/gd to scale it down on the server.
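
A rough sketch of what that server-side scaling could look like with GD.
The function names (fit_width, shrink_jpeg), the 500-pixel default, and
the JPEG quality of 85 are my own assumptions for illustration, and it
only handles JPEG input:

```php
<?php
// Hypothetical helper: dimensions that fit within $maxWidth, keeping the
// aspect ratio. Images already narrow enough are left untouched.
function fit_width(int $width, int $height, int $maxWidth = 500): array
{
    if ($width <= $maxWidth) {
        return [$width, $height];
    }
    $scale = $maxWidth / $width;
    return [$maxWidth, max(1, (int) round($height * $scale))];
}

// Hypothetical helper: resample a JPEG on the server with GD so the
// browser never has to download or scale the full-size original.
function shrink_jpeg(string $srcPath, string $dstPath, int $maxWidth = 500): bool
{
    $src = imagecreatefromjpeg($srcPath);
    if ($src === false) {
        return false; // not a readable JPEG
    }
    $w = imagesx($src);
    $h = imagesy($src);
    [$newW, $newH] = fit_width($w, $h, $maxWidth);

    $dst = imagecreatetruecolor($newW, $newH);
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $newW, $newH, $w, $h);

    return imagejpeg($dst, $dstPath, 85); // 85 = assumed quality setting
}
```

You would call something like shrink_jpeg() once, when the image is
registered, and then serve the smaller file from then on.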

As far as matching the image tag correctly goes, I'd have to suggest
that you try using a DOM to parse the HTML instead of regex.
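
For what it's worth, here is a minimal sketch of the DOM approach using
PHP's DOMDocument extension. The function name find_img_sources and the
wrapping <div> are assumptions of mine, not anything from your code:

```php
<?php
// Hypothetical helper: return the src of every <img> in an HTML fragment,
// no matter where src sits among the other attributes.
function find_img_sources(string $html): array
{
    $doc = new DOMDocument();

    // User-submitted markup is rarely valid, so collect libxml's parse
    // warnings internally instead of letting them be emitted.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}
```

The parser takes care of attribute order, quoting style, and upper or
lower case tag names, which is exactly where the regex gets into trouble.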

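That said, to get the WHOLE img tag in the same vein as you are using
now:
preg_match_all("/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\'\ >]*)[^\>]*\>/i",
$data, $matches);

Note the addition of "[^\>]*\>" after the capture group: it skips over
the closing quote and any remaining attributes, then matches the ">"
that marks the end of the img tag. $matches[0] then holds the whole tag,
and $matches[1] still holds just the src value.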

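The [img] and [src] are kind of wonky, really. Square brackets make a
character class, so [img] matches a single "i", "m", or "g" rather than
the literal text "img", and likewise for [src]. They would also match
this bit of nonsense: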

Sometimes <i src="foo">italics</i> could have bogus attributes.
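
To see the difference, here is a small test you could run. The first
pattern is the one quoted from your message; the second is just my guess
at a stricter variant (literal "img" and "src" with word boundaries), and
the sample $data string is made up:

```php
<?php
// Made-up sample input: one bogus <i src=...> and one real <img>.
$data = 'Sometimes <i src="foo">italics</i> and '
      . '<img alt="x" src="pic.png" width="900">.';

// Pattern from the thread: [img] and [src] each match ONE character.
$loose = "/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\'\ >]*)/i";
preg_match_all($loose, $data, $looseMatches);

// Assumed stricter variant: literal names, \b word boundaries, and a
// trailing [^>]*> so $strictMatches[0] holds the entire tag.
$strict = "/<\s*img\b[^>]*\bsrc\s*=\s*[\"']?([^\"'\s>]*)[^>]*>/i";
preg_match_all($strict, $data, $strictMatches);

print_r($looseMatches[1]);  // src values found by the loose pattern
print_r($strictMatches[0]); // whole tags found by the strict pattern
print_r($strictMatches[1]); // src values found by the strict pattern
```

The loose pattern picks up "foo" from the <i> tag as well as "pic.png",
while the stricter one only returns the real img tag. It is still a
regex working on HTML, so treat it as a stopgap.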

YMMV

-- 
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

