On 05/01/07, Richard Lynch <ceo@xxxxxxxxx> wrote:
On Wed, January 3, 2007 2:41 pm, Dotan Cohen wrote:
> On 03/01/07, Richard Lynch <ceo@xxxxxxxxx> wrote:
>> Instead of trying to strip the UTF stuff out, try to capture the
>> part you want:
>>
>> preg_match_all('|<[^>]>|ms', $emails, $output);
>> var_dump($output);
>>
>
> Richard, I do have a working script now, but I'm intrigued by your
> regex. Why do you surround the needle with pipes, and what is the "ms"
> for?

The start/end character (the delimiter) can be almost anything you want, which is convenient. If the "pattern" you are looking for has a '|' in it, then '|' would be very inconvenient, as you'd have to escape it. But if it has no '|' in the pattern, '|' is convenient.

It's traditional to use '/', but because '/' is already used in pathnames and HTML tags, I find myself using '|' more often, as I seldom have patterns with '|' in them as a meaningful character that I need to type.

You can also (in some versions) use "matching" start/end delimiters, like < with > or { with } and so on.

In this particular case, almost anything except < and > would be convenient, so I could have chosen any of these:

    |(<[^>]*>)|
    /(<[^>]*>)/
    {(<[^>]*>)}

[aside] Notice how I subtly corrected my obvious mistakes this time around... :-) [/aside]

The 'm' tacked on at the end makes '^' and '$' match at the start and end of every line, instead of only at the start and end of the whole string. Actually, the 'm' is not needed here, as there are no '^' or '$' anchors in the pattern, and [^>] matches newlines anyway, so it still works if your emails are separated by newlines.

The 's' allows the '.' (if I had one, which I don't) to match newlines within the string as well as other characters. It is totally pointless to have included 's' in this case, since I have no '.' in the pattern in the first place.

Just habit, I guess. I generally find that if I have a big ol' chunk of text, and I want to do PCRE on it, and it might have newlines, I want 'ms' on the end, and I don't want that if it's just a single line of text.

I'm still definitely more in the Cargo Cult, perhaps graduating to Voodoo Programming style, of PCRE pattern composing. Maybe someday I'll *really* understand regex, and graduate to Competent. I doubt it though.
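
A minimal sketch of the delimiter and modifier behaviour described above, for anyone who wants to try it. The sample addresses and the variable names ($emails, $text, $results and the rest) are made up for illustration and are not from the original script. It assumes standard PCRE semantics in PHP: 'm' only changes where '^' and '$' match, and 's' only changes what '.' matches.

```php
<?php
// Hypothetical sample input; these addresses are invented for the example.
$emails = "Alice <alice@example.com>,\nBob <bob@example.com>";

// The same pattern written with three different delimiters.
$patterns = [
    '|(<[^>]*>)|',
    '/(<[^>]*>)/',
    '{(<[^>]*>)}',
];

// Collect the captured group from each variant; all three should agree.
$results = [];
foreach ($patterns as $pattern) {
    preg_match_all($pattern, $emails, $output);
    $results[] = $output[1];
}
var_dump($results[0]);

$text = "first\nsecond";

// 's' (dotall): without it '.' refuses to match a newline.
$dotNoS = preg_match('|first.second|', $text);
$dotS   = preg_match('|first.second|s', $text);

// 'm' (multiline): without it '^' and '$' only match at the ends of the string.
$anchorNoM = preg_match('|^second$|', $text);
$anchorM   = preg_match('|^second$|m', $text);

// A negated character class matches newlines with no modifiers at all.
$classNoMods = preg_match('|<[^>]*>|', "<a\nb>");

var_dump($dotNoS, $dotS, $anchorNoM, $anchorM, $classNoMods);
```

So for a pattern like <[^>]*>, which has no '.', '^' or '$' in it, leaving off 'ms' should make no difference to what gets captured.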
Thanks. This is getting filed under my regex-emergencies label. I'll definitely be referencing this again. As usual, I do prefer to be taught to capture fish rather than be handed a fish.

Dotan Cohen

http://lyricslist.com/lyrics/artist_albums/517/yaz.html
http://what-is-what.com/what_is/sitepoint.html