RE: how does this regular expression works

"Shahriyar Imanov" <shehi@xxxxxxxxxxx> · Wed, 13 Jul 2011 16:04:39 +0300

That regex actually has an unnecessary part to it; it would be better to 
write it like this:

/\<head.*?\>(?P<head_tag_innerHTML>.*?)\<\/head\>/sim

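I added the /m modifier, although strictly speaking it only changes how ^ and 
$ behave, so it makes no difference here; it is the /s modifier that lets . 
match newlines and so lets the match span several lines. I also added a 
named group, "head_tag_innerHTML", for you, so writing this regex in PHP 
like this: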

preg_match('/\<head.*?\>(?P<head_tag_innerHTML>.*?)\<\/head\>/sim',
           $subject, $matches);

will place the search results in $matches, and you can access the captured 
content, i.e. the contents of HEAD, as $matches['head_tag_innerHTML'].
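
A minimal sketch of how that looks in practice. The sample page in $subject
is made up for illustration, and the second call is only there to show what
happens when the /s modifier is left out:

```php
<?php
// Hypothetical sample page; any HTML string would do here.
$subject = "<html>\n<head lang=\"en\">\n<title>Demo</title>\n</head>\n"
         . "<body></body>\n</html>";

$pattern = '/\<head.*?\>(?P<head_tag_innerHTML>.*?)\<\/head\>/sim';

if (preg_match($pattern, $subject, $matches) === 1) {
    // The capture is available by name and also by number (index 1).
    $head = $matches['head_tag_innerHTML'];
    echo trim($head), "\n";
} else {
    $head = null;
    echo "No HEAD section found\n";
}

// Without /s the dot stops at newlines, so this multi-line HEAD is not found.
$foundWithoutS = preg_match('/\<head.*?\>(.*?)\<\/head\>/i', $subject);
```

Note that preg_match() returns 1 on a match, 0 on no match and false on
error, so it is worth checking the return value before reading $matches.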

I also replaced your delimiters for better clarity...
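
As a rough illustration of what the delimiter choice changes (the input
string is invented for the example): with @ as the delimiter a literal / can
be written as-is, while with / as the delimiter it has to be escaped. Both
patterns below should capture the same text:

```php
<?php
// Hypothetical one-line input.
$subject = "<head><title>Demo</title></head>";

// With @ as the delimiter, the slash in </head> needs no escaping.
preg_match('@<head[^>]*>(.*?)</head>@si', $subject, $a);

// With / as the delimiter, every literal slash must be written as \/.
preg_match('/<head[^>]*>(.*?)<\/head>/si', $subject, $b);

var_dump($a[1] === $b[1]); // bool(true)
```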

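It will fetch EVERYTHING that comes inside the HEAD tag, i.e. its 
value/innerHTML. Putting ? after * makes it perform an *ungreedy* scan, i.e. 
match as few characters as possible. Escaping < and > is not strictly needed, 
since they are not special characters in PCRE, but it is harmless there; 
characters that are special (., *, ?, etc.) must be escaped whenever you mean 
them literally.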
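
To see the greedy/ungreedy difference, here is a small sketch. The input with
two HEAD-like blocks is deliberately artificial, just to make the effect
visible:

```php
<?php
// Hypothetical input with two HEAD-like blocks.
$subject = "<head>first</head><p>gap</p><head>second</head>";

// Greedy: .* takes as much as it can, so the match runs to the LAST </head>.
preg_match('/<head>(.*)<\/head>/s', $subject, $greedy);

// Ungreedy: .*? takes as little as it can, so it stops at the FIRST </head>.
preg_match('/<head>(.*?)<\/head>/s', $subject, $lazy);

echo $greedy[1], "\n"; // first</head><p>gap</p><head>second
echo $lazy[1], "\n";   // first
```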

Shehi

-----Original Message-----
From: who.cat@xxxxxxxxx [mailto:who.cat@xxxxxxxxx] On Behalf Of who.cat
Sent: 13 July 2011 15:43
To: php-db@xxxxxxxxxxxxx
Subject:  how does this regular expression works

While reading the code of a web spider, I came across this expression:
preg_match("@<head[^>]*>(.*?)<\/head>@si",$file, $regs); It is meant to get 
the head content of a website, and I would like to know how the match works 
in detail. Could someone give me some tips? Thanks in advance.

All the best
------------------------
What we are struggling for ?
The life or the life ?