On Sat, 2006-08-05 at 10:50 +0900, Dave M G wrote:
> Jochem
>
> Thank you for your continued assistance.
>
> > ^--- remove the caret as you don't want to only match when the line
> > starts with <li> (the <li> can be anywhere on the line)
> >
> Ah, I get it now. I was confused about the meaning of the caret.
>
> > I'll assume you also have the mb extension setup.
> >
> Yes, I do.
>
> This regular expression is tricky stuff, and its behaviour is not what
> I'd expect.
>
> After much experimentation, I discovered that I needed to take the last
> "s" out of my syntax. This was the "s" that states that the search could
> span across line breaks.
>
> I assumed that the behaviour would be to start at one instance of <li>
> and continue until the first instance of <br> and extract that as a
> variable. And then start again at the next instance of <li> and so on.
>
> But instead it seems to be starting from the extreme outside and working
> its way inwards from both ends, thus trapping all text between the very
> first <li> in the source string, and the very last <br> in the source.
>
> So if the "s" option is on to span across lines, then it gets only one
> match for the whole HTML document, containing everything between the
> very first <li> and the very last <br>. If I take off the "s" option,
> then it only looks at <li> and <br> tags within each line, thus
> returning small, discrete matches.

Check out the greediness modifier. Greediness determines whether a
quantifier extends to the largest possible match or stops at the
smallest possible match. By default regexes are greedy, so with "s" on,
the wildcard runs from the first <li> all the way to the last <br>.
In PCRE, a ? after the quantifier (as in .*?) or the U modifier makes
the match ungreedy.

> I personally don't think this is very rational behaviour, so either I'm
> doing something wrong still, or perhaps it's me who isn't very rational.
> Either is likely.

It's perfectly valid since it is correctly matching the pattern, just an
issue of how greedy the match is ;)

Cheers,
Rob.
--
.------------------------------------------------------------.
| InterJinn Application Framework - http://www.interjinn.com |
:------------------------------------------------------------:
| An application and templating framework for PHP.  Boasting |
| a powerful, scalable system for accessing system services  |
| such as forms, properties, sessions, and caches. InterJinn |
| also provides an extremely flexible architecture for       |
| creating re-usable components quickly and easily.          |
`------------------------------------------------------------'

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
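
For the archives, here is a minimal sketch of the greedy versus ungreedy behaviour discussed in this message. The original pattern is not quoted here, so the sample string in $html and the patterns are assumptions made up for illustration, and the sketch uses PCRE's preg_match_all() rather than the mb functions.

```php
<?php
// Hypothetical sample input: two list items, one per line.
$html = "<li>first item<br>\n<li>second item<br>\n";

// Greedy, no "s": the dot does not cross newlines, so each line matches on its own.
preg_match_all('/<li>(.*)<br>/', $html, $perLine);

// Greedy with "s": .* runs to the end of the string, then backs up to the LAST <br>.
preg_match_all('/<li>(.*)<br>/s', $html, $greedy);

// Lazy with "s": .*? stops at the FIRST <br> after each <li>.
preg_match_all('/<li>(.*?)<br>/s', $html, $lazy);

// The U modifier inverts greediness for the whole pattern, so .* behaves lazily.
preg_match_all('/<li>(.*)<br>/sU', $html, $ungreedy);

print_r($perLine[1]);   // two short matches
print_r($greedy[1]);    // one match spanning both items
print_r($lazy[1]);      // two short matches
print_r($ungreedy[1]);  // two short matches
```

With the lazy quantifier the "s" option can stay on, so an item that wraps across several lines should still be captured as one small match.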