Re: Converting HTML to BBCode [medium]

First of all, the backslashes added before each " character are probably caused by PHP's magic_quotes_gpc directive, which tries to "escape" the quotes (pretty stupid, if you ask me), so you have to call stripslashes() on the string you received, e.g.:
  $text = '';
  if ( isset($_POST['text']) ) {
      $text = $_POST['text'];
      if ( get_magic_quotes_gpc() ) {
          $text = stripslashes($text);
      }
  }

Now, what you're trying to do is definitely not something "basic", since you want to replace non-fixed strings that can be in either lower or upper case (without changing the case of the rest of the text). So basically what you have are patterns (a kind of 'rules' that the tags must follow).

From your code I can tell you've already tried the hard way to solve this. Done properly it would be even more laborious than that, because you would have to search the string almost char by char (figuratively, but that is pretty much what PHP would be doing) for every tag you want to replace, and possibly work with two strings: the original text and a lowercase copy of it (since the case-insensitive string functions are only available in PHP 5).
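If PHP 5 is available, the fixed tags at least can be handled without any manual scanning. Here is a minimal sketch using str_ireplace(); the $pairs map and the sample string are made up for illustration:

```php
<?php
// Fixed HTML tags mapped to BBCode; this map is illustrative, not exhaustive.
$pairs = array(
    '<strong>'  => '[b]',
    '</strong>' => '[/b]',
    '<em>'      => '[i]',
    '</em>'     => '[/i]',
);

$html = '<STRONG>Bold</strong> and <Em>italic</EM>';

// str_ireplace() matches the search strings regardless of case and
// leaves the rest of the text untouched.
$bbcode = str_ireplace(array_keys($pairs), array_values($pairs), $html);

echo $bbcode, "\n"; // [b]Bold[/b] and [i]italic[/i]
```

This only covers tags whose text never varies. Links and images carry attributes, so a fixed search string cannot match them.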

Anyway, the medium/advanced way (IMHO) would be to use regular expressions. These are quite useful, but also rather cryptic, even for advanced users --sometimes it's easier to come up with a new one rather than understanding what already exists :p
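One thing that may help with the cryptic part: PCRE's lowercase "x" modifier lets a pattern carry whitespace and # comments. A small sketch, assuming only the simple b, i and u tags need converting (the sample input is made up):

```php
<?php
// With the "x" modifier, whitespace in the pattern is ignored and
// everything from # to the end of the line is a comment.
$pattern = '~
    <              # literal opening bracket
    ( /? )         # group 1: optional slash, present on closing tags
    ( b | i | u )  # group 2: tag name
    >              # literal closing bracket
~xi';

// The "i" modifier makes the match case-insensitive; the captured
// tag name keeps whatever case it had in the input.
$out = preg_replace($pattern, '[$1$2]', '<B>bold</B> and <u>under</u>');

echo $out, "\n"; // [B]bold[/B] and [u]under[/u]
```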

The function I tested against your sample HTML is this one:
  /**
   * Performs BBCode conversion for some simple HTML elements
   *
   * @staticvar string  $str_http_valid
   * @param     string  $string
   * @return    string
   * @since     Mon Mar 06 23:44:40 CST 2006
   * @author    rsalazar
   */
  function to_bbcode( $string ) {
    // Characters accepted in an unquoted URL
    static  $str_http_valid = '-:\/a-z.0-9_%+';
    // Pattern => replacement rules, applied in order
    $arr_replace = array(
          // Links: the URL may be quoted (group 2) or unquoted (group 3)
          "/<a\s+.*?(?<=\b)href=(?(?=['\"])(?:(['\"])(.*?)\\1)|([$str_http_valid]*)).*?>(.+?)<\/a>/Xis"
                                   => '[url=\\2\\3]\\4[/url]',
          // Images: same URL handling, any other attributes are dropped
          "/<img\s+.*?(?<=\b)src=(?(?=['\"])(?:(['\"])(.*?)\\1)|([$str_http_valid]*)).*?\/?>/Xis"
                                   => '[img]\\2\\3[/img]',
          // The "e" modifier evaluates the replacement as PHP code
          '/<(\/)?(strong|em)>/Xise' => '( strcasecmp("em", "\\2") ? "[\\1b]" : "[\\1i]" )',
          '/<(\/?(?:b|i|u))>/Xis'    => '[\\1]',

          '/<(\/)?[ou]l>/Xis'        => '[\\1list]',
          '/<(\/)?li>/Xise'          => '( "\\1" == "" ? "[*]" : "" )',
        );
    $string = preg_replace(array_keys($arr_replace),
                           array_values($arr_replace),
                           $string);
    return  $string;
  }
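Note that the "e" modifier used in two of the rules above was deprecated in PHP 5.5 and removed in PHP 7, so those rules fail on current versions. Below is a sketch of the same idea using preg_replace_callback() instead. It assumes PHP 5.3 or later for the closures, the name to_bbcode_cb is made up for this example, the patterns are lightly simplified, and it emits [url=...] to match the tag list in the quoted message:

```php
<?php
/**
 * Sketch of the same rules without the "e" modifier.
 * The name to_bbcode_cb is hypothetical.
 */
function to_bbcode_cb($string)
{
    // Characters accepted in an unquoted URL, as in $str_http_valid.
    $url = '-:\/a-z.0-9_%+';

    // <strong>/<em>: a callback picks "b" or "i" instead of evaluated code.
    $string = preg_replace_callback(
        '/<(\/?)(strong|em)>/i',
        function ($m) {
            return '[' . $m[1] . (strcasecmp($m[2], 'em') === 0 ? 'i' : 'b') . ']';
        },
        $string
    );

    // <li> becomes [*]; </li> is dropped.
    $string = preg_replace_callback(
        '/<(\/?)li>/i',
        function ($m) {
            return $m[1] === '' ? '[*]' : '';
        },
        $string
    );

    // The remaining rules are plain pattern => replacement pairs.
    $rules = array(
        "/<a\s+.*?\bhref=(?(?=['\"])(?:(['\"])(.*?)\\1)|([$url]*)).*?>(.+?)<\/a>/is"
            => '[url=$2$3]$4[/url]',
        "/<img\s+.*?\bsrc=(?(?=['\"])(?:(['\"])(.*?)\\1)|([$url]*)).*?\/?>/is"
            => '[img]$2$3[/img]',
        '/<(\/?(?:b|i|u))>/i' => '[$1]',
        '/<(\/?)[ou]l>/i'     => '[$1list]',
    );

    return preg_replace(array_keys($rules), array_values($rules), $string);
}
```

Called on the link line from the test input, this should give [url=http://link.com]Testing link[/url], and the second image line (the one with the style attribute) should come out as [img]http://image.com/img2.jpg[/img].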

As I mentioned before, keep in mind that regular expressions can be rather cryptic sometimes. Also, this is raw code; it should be optimized, but I'm feeling really lazy right now, so that will have to wait for a better occasion.

It's up to you to decide whether to use this function or not. What I would recommend is that you not forget about regular expressions and give them a try later (when you're more familiar with PHP), and that you use the preg_* (PCRE) family rather than the ereg_* (POSIX) one.

J_K9 wrote:
Hi,

I'm trying to code a PHP app to convert my inputted HTML code (into a textarea) into BBCode, for use on a forum. I have tried to code it, but have had little success so far. Here is the code I wrote (sorry, I'm still learning):

-------CODE-------
<html>
<head>
<title>Convert from HTML to BBCode</title>
</head>
<body>
<form action="<?php echo $_SERVER['PHP_SELF']; ?>" method="post">
    Body:  <br /><textarea name="text"></textarea><br /><br />
    <input type="submit" name="submit" value="Submit me!" />
</form>
<?php

$text = $_REQUEST['text'];

echo '<br /><br />';

// Declare HTML tags to find, and BBCode tags to replace them with

$linkStartFind = '<a href="';
$linkStartReplace = '[url=';
$linkEndFind = '</a>';
$linkEndReplace = '[/url]';

$italicStartFind = '<em>';
$italicStartReplace = '[i]';
$italicEndFind = '</em>';
$italicEndReplace = '[/i]';

$boldStartFind = '<strong>';
$boldStartReplace = '[b]';
$boldEndFind = '</strong>';
$boldEndReplace = '[/b]';

$imgStartFind = '<img src="';
$imgStartReplace = '[img]';
$imgEndFind = ' border="0" />';
$imgEndReplace = '[/img]';

$OLBeginFind = '<ol>';
$OLBeginReplace = '';
$OLFinishFind = '</ol>';
$OLFinishReplace = '';

$listStartFind = '<li>';
$listStartReplace = '[list]';
$listEndFind = '</li>';
$listEndReplace = '[/list]';

// Replace.

$text = str_replace($linkStartFind, $linkStartReplace, $text);
$text = str_replace($linkEndFind, $linkEndReplace, $text);
$text = str_replace($italicStartFind, $italicStartReplace, $text);
$text = str_replace($italicEndFind, $italicEndReplace, $text);
$text = str_replace($boldStartFind, $boldEndReplace, $text);
$text = str_replace($boldEndFind, $boldEndReplace, $text);
$text = str_replace($imgStartFind, $imgStartReplace, $text);
$text = str_replace($imgEndFind, $imgEndReplace, $text);
$text = str_replace($OLStartFind, $OLStartReplace, $text);
$text = str_replace($OLEndFind, $OLEndReplace, $text);
$text = str_replace($listStartFind, $listStartReplace, $text);
$text = str_replace($listEndFind, $listEndReplace, $text);

echo '<textarea name="output">' . "$text" . '</textarea>';

?>
</body>
</html>
-------/CODE-------

Now, most of this doesn't work. Here is the test code I put into the first textarea:

-------TESTCODE-------
<strong>Testing bold code</strong>

<em>Testing italics</em>

<a href="http://link.com">Testing link</a>

<img src="http://image.com/img.jpg" border="0" />

<img src="http://image.com/img2.jpg" style="padding-right: 5px;" border="0" />
-------/TESTCODE-------

And here's what I got out:

-------RESULT-------
[/b]Testing bold code[/b]

[i]Testing italics[/i]

<a href=\"http://link.com\">Testing link[/url]

<img src=\"http://image.com/img.jpg\" border=\"0\" />

<img src=\"http://image.com/img2.jpg\" style=\"padding-right: 5px;\" border=\"0\" />
-------/RESULT-------

As you can see, the bold, italic, and ending hyperlink tag replacements worked, but the rest didn't. Backslashes have been added before every ", and if there is anything between an img tag's 'src="{image}"' and ' border="0" />', it isn't removed, which leaves me with a faulty link.

Just to clarify the BBCode tags, they are:

[url=http://link.com]Click this link[/url]
[img]http://imagesite.com/image.jpg[/img]
[b]_BOLD_[/b]
[i]italicised[/i]
[u]underlined[/u]

I would really like to get this working, as it'll not only help me improve my PHP skills but also aid my tutorial conversions - it takes ages to do this by hand ;)

Any help would be appreciated. Thanks in advance,
--
Sincerely,
J. Rafael Salazar Magaña
Innox - Innovación Inteligente
Tel: +52 (33) 3615 5348 ext. 205 / 01 800 2-SOFTWARE
http://www.innox.com.mx

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


