I am trying to get the text between nested HTML tags within the arrays
produced by preg_match_all. The simple situation would be:
<tr><td>test</td></tr>
I want to return 'test'.
Assume $post_results holds a string derived from an HTML page source
with lots of nested tags.
Replacing newlines first (this seems to be a good idea):
$post_results = preg_replace("/[\n\r]/", " ", $post_results);
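A small sketch of the two usual ways to deal with line breaks before matching: flattening them up front, or leaving the input alone and adding the s modifier so that . also matches newlines. The sample string and the names $flattened and $m are made up for illustration.

```php
<?php
// Hypothetical sample input that spans several lines.
$post_results = "<tr>\n<td>test</td>\r\n</tr>";

// Option 1: replace every CR or LF character with a space.
$flattened = preg_replace('/[\r\n]/', ' ', $post_results);

// Option 2: keep the line breaks and use the s modifier,
// which lets "." match newline characters as well.
preg_match('/<td>(.+)<\/td>/is', $post_results, $m);

echo $flattened, "\n";
echo $m[1], "\n";
```

Either option should work here; the s modifier has the side benefit of leaving the original text untouched.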
I tried this:
$pattern = "/<[^>]*>(.+)<\/[^>]*>/i";
Explanation (as far as I understand, please feel free to correct):
/.../ start end
< - opening html tag
[^>] end tag
* any number of same
> - end tag, don't understand why needed in addition to above
(.+) group: any number of any characters
< opening tag
\/ literal forward slash
[^>] end with tag end
* any number of same
> - end tag, don't know why needed again
i - modifier, can't remember what it means, something like case
insensitive, yes, that would be it
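For reference, here is one way to read the pattern piece by piece, written with the x modifier so that comments can sit inside the pattern. This is only a sketch: the sample input is the simple case from above, and $m is an illustrative name. Note that [^>] is a negated character class (any character except ">"), which is why a literal ">" is still needed after it.

```php
<?php
// Same pattern, spelled out. With the x modifier, whitespace and
// "#" comments inside the pattern are ignored by the regex engine.
$pattern = '/
    <        # literal "<" that starts a tag
    [^>]*    # zero or more characters that are NOT ">" : tag name, attributes
    >        # literal ">" that ends the opening tag
    (.+)     # group 1: one or more of any character, greedy
    <\/      # literal "<" and an escaped forward slash: start of a closing tag
    [^>]*    # zero or more characters that are NOT ">"
    >        # literal ">" that ends the closing tag
/ix';

$post_results = '<tr><td>test</td></tr>';
preg_match($pattern, $post_results, $m);

// Because (.+) is greedy, it runs to the LAST closing tag,
// so group 1 still contains the inner tags.
echo $m[1], "\n"; // <td>test</td>
```

The greedy group would explain why a single pass removes only the outermost tag pair.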
//Main expression for first try, substituting tags:
preg_match_all($pattern,$post_results,$outputs);
//this only replaces the outer tag eg <tr>, not the <td>, so:
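// group 1 of the first match is $outputs[1][0]; $outputs[0][1] would be a second full match
while (!empty($outputs[1][0]) && stristr($outputs[1][0], "<")) {
    preg_match_all($pattern, $outputs[1][0], $outputs, PREG_PATTERN_ORDER);
}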
Is there a neat expression to get the inner text within nested HTML tags?
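Two possible approaches, sketched under the assumption that the markup is reasonably well formed: a pattern whose group refuses to cross a "<" or ">" (so it can only capture the innermost text), and the built-in strip_tags() for cases where only the text matters. The second input and the names $row and $cells are hypothetical.

```php
<?php
// Sample input taken from the simple case above.
$post_results = '<tr><td>test</td></tr>';

// Sketch 1: the group [^<>]+ cannot contain "<" or ">", so it only
// matches text that sits directly between a tag and a closing tag.
$pattern = '/<[^>]*>([^<>]+)<\/[^>]*>/';
preg_match_all($pattern, $post_results, $outputs);
echo $outputs[1][0], "\n"; // test

// Sketch 2: strip_tags() removes every tag and keeps the text.
echo strip_tags($post_results), "\n"; // test

// Hypothetical input with two cells, to show several matches at once.
$row = '<table><tr><td>one</td><td>two</td></tr></table>';
preg_match_all($pattern, $row, $cells);
print_r($cells[1]); // one, two
```

Regular expressions tend to be fragile on real-world HTML (comments, scripts, a ">" inside attribute values), so for anything beyond simple fragments a proper parser such as PHP's DOM extension may be the safer choice.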
Thanks,
John
--
PHP General Mailing List (http://www.php.net/)