The example provided didn't work for me. It returned the same string, unmodified.

I am also looking for a way to extract text from an XML response I get after posting data to a remote server. I can do it with substring functions, but I'd like something more compact and portable (a one-liner that I could adapt for other uses as well).

Example 1:

<someXMLtags>
<status>16664 Rejected: Invalid LTV</status>
</someXMLtags>

Example 2:

<someXMLtags>
<status>Unable to Post, Invalid Information</status>
</someXMLtags>

I want what is inside the <status> tags. Does anyone have a working solution for getting the text inside these tags using regex?

Much appreciated,
B

> -----Original Message-----
> From: Michael [mailto:michael@xxxxxxxxxxxxxx]
> Sent: Monday, December 11, 2006 6:59 AM
> To: Anthony Papillion
> Cc: php-general@xxxxxxxxxxxxx
> Subject: Re: Need help with RegEx
>
> At 01:02 AM 12/11/2006 , Anthony Papillion wrote:
> >Hello Everyone,
> >
> >I am having a bit of a problem wrapping my head around regular expressions. I
> >thought I had a good grip on them but, for some reason, the expression I've
> >created below simply doesn't work! Basically, I need to retrieve all of the
> >text between two unique and specific tags, but I don't need the tag text. So
> >let's say that the tag is
> >
> ><tag lang='ttt'>THIS IS A TEST</tag>
> >
> >I would need to retrieve THIS IS A TEST only and nothing else.
> >
> >Now, a bit more information: I am using cURL to retrieve the entire contents
> >of a webpage into a variable.
> >I am then trying to perform the following
> >regular expression on the retrieved text:
> >
> >$trans_text = preg_match("\/<div id=result_box dir=ltr>(.+?)<\/div>/");
>
> Using the tags you describe here, and assuming the source html is in the
> variable $source_html, try this:
>
> $trans_text = preg_replace("/(.*?)(<div id=result_box dir=ltr>)(.*?)(<\/div>)(.*?)^/s","$3",$source_html);
>
> how this breaks down is:
>
> opening quote for first parameter (your MATCH pattern).
>
> open regex match pattern= /
>
> first atom (.*?) = any or no leading text before <div id=result_box dir=ltr>,
> the ? makes it non-greedy so that it stops after finding the first match.
>
> second atom (<div id=result_box dir=ltr>) = the opening tag you are looking for.
>
> third atom (.*?) = the text you want to strip out, all text even if nothing is
> there, between the 2nd and 4th atoms.
>
> fourth atom (<\/div>) = the closing tag of the div tag pair.
>
> fifth atom (.*?) = all of the rest of the source html after the closing tag up
> to the end of the line ^, even if there is nothing there.
>
> close regex match pattern= /s
>
> in order for this to work on html that may contain newlines, you must specify
> that the . can represent newline characters, this is done by adding the letter
> 's' after your regex closing /, so the last thing in your regex match pattern
> would be /s.
>
> end of string ^ (this matches the end of the string you are matching/replacing,
> $source_html)
>
> closing quote for first parameter.
>
> The second parameter of the preg_replace is the atom # which contains the text
> you want to replace the text matched by the regex match pattern in the first
> parameter, in this case the text we want is in the third atom so this parameter
> would be $3 (this is the PHP way of back-referencing, if we wanted the text
> before the tag we would use atom 1, or $1, if we want the tag itself we use $2,
> etc basically a $ followed by the atom # that holds what we want to replace the
> $source_html into $trans_text).
>
> The third parameter of the preg_replace is the source you wish to match and
> replace from, in this case your source html in $source_html.
>
> after this executes, $trans_text should contain the innerText of the <div
> id=result_box dir=ltr></div> tag pair from $source_html, if there is nothing
> between the opening and closing tags, $trans_text will == "", if there is only
> a newline between the tags, $trans_text will == "\n". IMPORTANT: if the text
> between the tags contains a newline, $trans_text will also contain that newline
> character because we told . to match newlines.
>
> I am no regex expert by far, but this worked for me (assuming I copied it
> correctly here heh)
> There are doubtless many other ways to do this, and I am sure others on the
> list here will correct me if my way is wrong or inefficient.
>
> I hope this works for you and that I haven't horribly embarrassed myself here.
> Good luck :)
>
> >The problem is that when I echo the value of $trans_text variable, I end up
> >with the entire HTML of the page.
> >
> >Can anyone clue me in to what I am doing wrong?
> >
> >Thanks,
> >Anthony
> >
> >--
> >PHP General Mailing List (http://www.php.net/)
> >To unsubscribe, visit: http://www.php.net/unsub.php
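
A note on why the quoted preg_replace hands back the input untouched: in PCRE, ^ anchors the start of the subject and $ anchors the end, so a pattern that requires ^ after the closing tag cannot match, and preg_replace returns its subject unchanged when nothing matches. Below is a minimal sketch of one alternative that uses preg_match with a capture group instead. The helper name extract_tag_text and the sample strings are assumptions made up for illustration; they are not from the thread, and the sketch does not handle nested tags of the same name.

```php
<?php
// Hypothetical helper (not from the thread): returns the text between the
// first <$tag ...> and the next </$tag>, or null when no such pair is found.
function extract_tag_text(string $source, string $tag): ?string
{
    // '#' is used as the delimiter so '/' in the closing tag needs no escaping.
    // preg_quote() escapes any regex metacharacters in the tag name.
    // [^>]* allows attributes; (.*?) is non-greedy; 's' lets '.' match newlines.
    $name = preg_quote($tag, '#');
    $pattern = '#<' . $name . '\b[^>]*>(.*?)</' . $name . '>#s';

    // preg_match() returns 1 on a match and fills $matches;
    // $matches[1] holds the first capture group.
    if (preg_match($pattern, $source, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Sample input modelled on Example 1 above.
$xml = "<someXMLtags>\n<status>16664 Rejected: Invalid LTV</status>\n</someXMLtags>";
echo extract_tag_text($xml, 'status'), "\n"; // prints: 16664 Rejected: Invalid LTV
```

The same call with 'div' as the tag name should cover the result_box case, since the pattern allows attributes after the tag name. For anything beyond a simple, non-nested tag, an XML parser is usually a safer choice than a regex.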