At 11/18/2006 05:46 AM, Børge Holen wrote:
["desc"] = " <c> FFFFFF topic <c> 999999 rest of the text ",
$string = preg_replace("/<c>\s\w[0-9A-F]+/","",$string);
prints out: topic rest of the text (
with double spaces :(, I thought
\s would fix that )
however how would I go on this:
<font color="colorcode">topic</font>
<font color="colorcode">rest of the text</font>
Børge,
Here's how I would think this one through:
First, I'm having to make several guesses at the nature of your text content:
- You use the single word "topic" but I'll assume
this can be multiple words and spaces.
- Your source string includes a space after "rest
of the text " while your marked-up result
doesn't. However I will assume that you really
do mean the rest of the text until end-of-string.
- Your source string also includes a space before
the initial <c> but your regexp pattern
doesn't. I'll assume that both beginning and ending spaces are unintentional.
Your source string:
"<c> FFFFFF topic <c> 999999 rest of the text"
consists of these parts:
1) [start-of-string]
2) "<c> "
3) "FFFFFF" (color code 1)
4) " "
5) "topic" (text 1)
6) " <c> "
7) "999999" (color code 2)
8) " "
9) "rest of the text" (text 2)
10) [end-of-string]
i.e.:
1) [start-of-string]
2) <c> + whitespace
3) color code 1
4) whitespace
5) one or more characters
6) whitespace + <c> + whitespace
7) color code 2
8) whitespace
9) one or more characters
10) [end-of-string]
This suggests the regexp pattern:
1) ^
2) <c>\s
3) ([0-9A-F]{6})
4) \s
5) (.+)
6) \s<c>\s
7) ([0-9A-F]{6})
8) \s
9) (.+)
10) $
/^<c>\s([0-9A-F]{6})\s(.+)\s<c>\s([0-9A-F]{6})\s(.+)$/i
Everything in the source string that you need to
retain needs to be in parentheses so regexp can grab it.
In 5) I can let the pattern be greedy, safe in
the knowledge that there WILL be a \s<c> to terminate the character-grab.
I end with the pattern modifier /i so it will
work with lowercase letters in the RGB color codes.
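A quick way to watch the greedy match and the /i modifier at work (only a sketch; the three-part string is a made-up input of mine, not something from your data):

```php
<?php
$sPattern = '/^<c>\s([0-9A-F]{6})\s(.+)\s<c>\s([0-9A-F]{6})\s(.+)$/i';

// Lowercase color code: the /i modifier lets "ffffff" match [0-9A-F].
preg_match($sPattern, '<c> ffffff topic <c> 999999 rest of the text', $aTwo);

// Three parts (hypothetical input): the greedy (.+) in 5) runs on
// until the LAST " <c> " it can find, swallowing the middle part.
preg_match($sPattern, '<c> FFFFFF one <c> 000000 two <c> 999999 three', $aThree);

echo $aTwo[1], "\n";    // ffffff
echo $aThree[2], "\n";  // one <c> 000000 two
echo $aThree[4], "\n";  // three
```

So the anchored pattern is only safe if the string always holds exactly two <c> parts, which is what I'm assuming here.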
PHP:
$sText = '<c> FFFFFF topic <c> 999999 rest of the text';
$sPattern = '/^<c>\s([0-9A-F]{6})\s(.+)\s<c>\s([0-9A-F]{6})\s(.+)$/i';
preg_match($sPattern, $sText, $aMatches);
print_r($aMatches);
result:
Array
(
[0] => <c> FFFFFF topic <c> 999999 rest of the text
[1] => FFFFFF
[2] => topic
[3] => 999999
[4] => rest of the text
)
This isolates the four substrings you want in regexp references $1 through $4.
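If some strings might not follow the format, it may be worth checking the return value of preg_match() before touching the captures. Here is a minimal sketch, assuming PHP 7.1 or later; parseColoredText() is a name I made up for illustration, and the trim() covers my guess that the leading and trailing spaces are unintentional:

```php
<?php
// parseColoredText() is a hypothetical helper, not part of the original code.
function parseColoredText(string $sText): ?array
{
    $sPattern = '/^<c>\s([0-9A-F]{6})\s(.+)\s<c>\s([0-9A-F]{6})\s(.+)$/i';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($sPattern, trim($sText), $aMatches) !== 1) {
        return null;
    }

    // Skip element 0 (the whole match) and name the four captures.
    [, $sColor1, $sText1, $sColor2, $sText2] = $aMatches;

    return [
        ['color' => $sColor1, 'text' => $sText1],
        ['color' => $sColor2, 'text' => $sText2],
    ];
}

$aParts = parseColoredText(' <c> FFFFFF topic <c> 999999 rest of the text ');
print_r($aParts);
var_dump(parseColoredText('no markers here')); // NULL
```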
Replacement:
[Tangentially, I'd like to comment that font tags
are passé. I urge you to use spans with styling
instead. I normally dislike using inline styles
(style details mixed with the HTML), but in this
case (as far as I know) you don't have any
choice. If you can, I suggest you replace the
literal color codes with style names and define
the precise colors in your stylesheet, not your database.]
[What this further suggests is that you ought to
have two discrete database fields, `topic` and
`description`, if you can, rather than combining
them into one field that needs to be
parsed. Then you can output something like:
<span class="topic">TOPIC</span> <span class="desc">DESCRIPTION</span>
and leave the RGB color codes out of this layer
of your application altogether.]
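To make that suggestion concrete, here is a rough sketch of the output step, assuming the two fields already arrive separately. The $aRow array is a stand-in of my own for a database row, and the class names are just the ones from the example line above:

```php
<?php
// Hypothetical row, as if `topic` and `description` were separate fields.
$aRow = [
    'topic'       => 'topic',
    'description' => 'rest of the text',
];

// Escape the text so a stray <, > or & in the data can't break the markup.
$sHtml = '<span class="topic">' . htmlspecialchars($aRow['topic']) . '</span> '
       . '<span class="desc">' . htmlspecialchars($aRow['description']) . '</span>';

echo $sHtml, "\n";
```

The colors themselves would then live in the stylesheet, for example a rule setting color: #FFFFFF on .topic and color: #999999 on .desc.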
However, working with the data you've been dealt:
$sTagBegin = '<span style="color:#';
$sTagEnd = ';">';
$sCloseTag = '</span>';
$sReplacement = $sTagBegin . '$1' . $sTagEnd . '$2' . $sCloseTag .
$sTagBegin . '$3' . $sTagEnd . '$4' . $sCloseTag;
echo preg_replace($sPattern, $sReplacement, $sText);
result:
<span style="color:#FFFFFF;">topic</span><span style="color:#999999;">rest of the text</span>
____________________________
It's tempting to write the pattern more
succinctly to take advantage of the repeating pattern of the source text:
<c> COLORCODE text
The regexp pattern might be:
1) \s*
2) <c>\s
3) ([0-9A-F]{6})
4) \s
5) ([^<]+)
i.e.:
1) optional whitespace
2) <c> + whitespace
3) color code
4) whitespace
5) one or more characters until the next <
$sText = '<c> FFFFFF topic <c> 999999 rest of the text';
$sPattern = '/\s*<c>\s([0-9A-F]{6})\s([^<]+)/i';
preg_match_all($sPattern, $sText, $aMatches);
print_r($aMatches);
result:
Array
(
[0] => Array
(
[0] => <c> FFFFFF topic
[1] => <c> 999999 rest of the text
)
[1] => Array
(
[0] => FFFFFF
[1] => 999999
)
[2] => Array
(
[0] => topic
[1] => rest of the text
)
)
In this case, we need to specify the tag pattern only once:
$sReplacement = $sTagBegin . '$1' . $sTagEnd . '$2' . $sCloseTag;
echo preg_replace($sPattern, $sReplacement, $sText);
result:
<span style="color:#FFFFFF;">topic </span><span style="color:#999999;">rest of the text</span>
Notice that this leaves whitespace after the topic
string, because ([^<]+) captures everything up to
the next <, including the space in front of it.
Someone more knowledgeable in regular expressions
can probably tell you how to eliminate that,
perhaps by using a regexp assertion:
http://php.net/manual/en/reference.pcre.pattern.syntax.php#regexp.reference.assertions
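One possible way to do it with an assertion, offered as a sketch rather than the definitive answer: make the text capture lazy, let \s* absorb any trailing whitespace, and use a lookahead so the match must stop right before a "<" or at end-of-string. The pattern below is my own variation on the one above:

```php
<?php
$sText = '<c> FFFFFF topic <c> 999999 rest of the text';

// ([^<]+?) is lazy, so it takes as little as it can; \s* then eats any
// trailing whitespace, and the lookahead (?=<|$) insists that what comes
// next is a "<" or end-of-string, without consuming it.
$sPattern = '/\s*<c>\s([0-9A-F]{6})\s([^<]+?)\s*(?=<|$)/i';

$sReplacement = '<span style="color:#$1;">$2</span>';

$sResult = preg_replace($sPattern, $sReplacement, $sText);
echo $sResult, "\n";
// <span style="color:#FFFFFF;">topic</span><span style="color:#999999;">rest of the text</span>
```

The two spans come out with nothing between them; if you want a space there, you could append one to the replacement string and trim() the final result.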
Regards,
Paul
__________________________
Paul Novitski
Juniper Webcraft Ltd.
http://juniperwebcraft.com
--
PHP General Mailing List (http://www.php.net/)