Dave M G wrote:
> Jochem,
>
> Thank you for responding.
>
>> does this one work?:
>> preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#is', '', $htmlPage);
>
> Yes, that works. I don't think I would have ever figured that out on my
> own - it's certainly much more complicated than the ereg equivalent.

1. the '^' at the start of the regexp states 'must match start of the
   string (or line in multiline mode)'
2. the 'i' after the closing regexp delimiter states 'match
   case-insensitively'
3. the 's' after the closing regexp delimiter states 'the dot also
   matches newlines'
4. the '<ul[^>]*>' matches a UL tag with any number of attributes ...
   the '[^>]*' matches any number of characters that are not a '>'
   character - the square brackets denote a character class (in this case
   with just one character in it) and the '^' at the start of the
   character class definition negates the class (i.e. turns the character
   class definition to mean every character *not* defined in the class)

PCRE is a lot more powerful [imho], the downside is that it has more
modifiers and syntax to control the meaning of the patterns... read and
become familiar with these 2 pages:

http://php.net/manual/en/reference.pcre.pattern.modifiers.php
http://php.net/manual/en/reference.pcre.pattern.syntax.php

and remember that writing patterns is often quite complex - when you
build one just take it one 'assertion' at a time, i.e. build the pattern
up step by step... if you give it a good go and get stuck, then there is
always the list.

> If I may push for just one more example of how to properly use regular
> expressions with preg:
>
> It occurs to me that I've been assuming that with regular expressions I
> could only remove or change specified text.

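to see points 1-4 above in action, here is a minimal sketch. the sample
$htmlPage is made up for illustration (the real page is not shown in this
thread), and $stripped / $unchanged are just names picked here. it runs the
pattern once with the 's' modifier and once without it, to show why 's'
matters when the DOCTYPE and the <ul> sit on different lines:

```php
<?php
// Hypothetical sample input; the real $htmlPage is not shown in the thread.
$htmlPage = "<!DOCTYPE html>\n<html>\n<body>\n"
          . "<UL class=\"menu\">\n<li>First<br>\n<li>Second<br>\n</ul>\n"
          . "</body>\n</html>";

// '#' is the delimiter; 'i' and 's' after the closing '#' are modifiers.
//   ^           anchor at the start of the string
//   <\!DOCTYPE  literal text ('i' lets it match any letter case)
//   (.*)        any characters, newlines included because of 's'
//   <ul[^>]*>   a UL tag with any number of attributes
$stripped = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#is', '', $htmlPage);

// Same pattern without 's': the dot stops at the first newline, so the
// pattern cannot reach the <ul> tag and nothing is replaced.
$unchanged = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#i', '', $htmlPage);

echo $stripped, "\n";
```

note that (.*) is greedy, so if a page contains more than one <ul> tag,
everything up to and including the *last* one is removed.
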
essentially regexps are a pattern syntax for asserting whether something
matches a pattern (or not) - there are various functions that allow you
to act upon the results of the pattern matching depending on your needs
(see below)

> What if I wanted to get rid of everything *other* than the specified text?
>
> Can I form an expression that would take $htmlPage and delete everything
> *except* text that is between a <li> tag and a <br> tag?

yes, but you wouldn't use preg_replace() but rather preg_match() or
preg_match_all(), which give you back an array (via the 3rd argument,
which is passed by reference) containing the texts that matched (and
that you therefore want to keep).

> Or is that something that requires much more than a single use of
> preg_replace?
>
> --
> Dave M G

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
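
a rough sketch of the preg_match_all() approach described above, for
keeping only the text between a <li> tag and a <br> tag. the sample
$htmlPage, the variable names $count / $matches / $items, and the exact
pattern (including the non-greedy '.*?') are assumptions made for this
example, not something taken from the real page:

```php
<?php
// Hypothetical sample input standing in for the real $htmlPage.
$htmlPage = "<ul>\n<li>First item<br>\n<li>Second\nitem<BR />\n</ul>";

// (.*?) is a non-greedy capture group: it stops at the first <br> it can
// reach, so each <li>...<br> pair is captured separately.
// 'i' matches <LI>/<BR> in any case, 's' lets the dot cross newlines.
$count = preg_match_all('#<li[^>]*>(.*?)<br[^>]*>#is', $htmlPage, $matches);

// The 3rd argument is filled by reference: $matches[0] holds the full
// matches, $matches[1] holds the text captured by the first group.
$items = array_map('trim', $matches[1]);

print_r($items);
```

be aware that '<li[^>]*>' is loose enough to also match a <link ...> tag,
so on a full page the pattern may need tightening (for instance with a
word boundary, '<li\b[^>]*>').
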