Re: Preventing SQL Injection/ Cross Site Scripting

On 24/04/07, Justin Frim <jfrim@xxxxxxxxxxx> wrote:
Just my two cents worth...

Magic quotes are the work of the devil.  It's a shame that so many PHP
installations have them enabled, and a huge disappointment that PHP is
actually distributed with this stuff enabled!  The mere fact that a
script can't change this setting creates a real hassle and is my major
gripe about the whole situation.  I've *always* followed the programming
practice of "work with your data unencoded, then encode it appropriately
only at the final output stage".  That way you always know exactly
what you're working with, no surprises, where each character is always 1
byte, regardless of what character it is.  Here's a typical block of
code which I include at the start of nearly all my PHP scripts:

<?php
//Do not delete this function! (unless you don't mind data corruption
//with PHP's default settings)
function stripslashes_deep($value) {
  return is_array($value) ? array_map('stripslashes_deep', $value) :
    stripslashes($value);
}
//Get rid of those stupid damn annoying asinine magic quotes which just
//garble up your data.
if (get_magic_quotes_gpc()) {
  /*
  (unfortunately in PHP these are enabled by default.  AHH!  Which idiot
  thought this was a good idea to turn them on by default?  Good programming
  practise is to manually encode only the data that requires encoding just

You've got a typo in practice.

  just before dumping it to places which need it (ie. databases), not
  automatically screwing up the entire collection of the system's variables!
  AHH!)
  */
  $GLOBALS['HTTP_POST_VARS'] =
stripslashes_deep($GLOBALS['HTTP_POST_VARS']);
  $GLOBALS['_POST'] = stripslashes_deep($GLOBALS['_POST']);
  $GLOBALS['HTTP_GET_VARS'] = stripslashes_deep($GLOBALS['HTTP_GET_VARS']);
  $GLOBALS['_GET'] = stripslashes_deep($GLOBALS['_GET']);
  $GLOBALS['HTTP_COOKIE_VARS'] =
stripslashes_deep($GLOBALS['HTTP_COOKIE_VARS']);
  $GLOBALS['_COOKIE'] = stripslashes_deep($GLOBALS['_COOKIE']);
  $GLOBALS['HTTP_SERVER_VARS'] =
stripslashes_deep($GLOBALS['HTTP_SERVER_VARS']);
  $GLOBALS['_SERVER'] = stripslashes_deep($GLOBALS['_SERVER']);
  $GLOBALS['HTTP_ENV_VARS'] = stripslashes_deep($GLOBALS['HTTP_ENV_VARS']);
  $GLOBALS['_ENV'] = stripslashes_deep($GLOBALS['_ENV']);
  $GLOBALS['HTTP_POST_FILES'] =
stripslashes_deep($GLOBALS['HTTP_POST_FILES']);
  $GLOBALS['_FILES'] = stripslashes_deep($GLOBALS['_FILES']);
  $GLOBALS['_REQUEST'] = stripslashes_deep($GLOBALS['_REQUEST']);
}
//Fortunately these can be killed with a single statement, unlike
//magic_quotes_gpc
set_magic_quotes_runtime(0);
?>
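
A minimal sketch of what the block above undoes, for anyone who wants to
try it without a magic-quotes server.  It assumes addslashes() is a fair
stand-in for what magic_quotes_gpc did to request data, and the sample
array is made up.  (Magic quotes and get_magic_quotes_gpc() were later
removed from PHP, so on current versions only the function itself is
still useful.)

```php
<?php
// Same recursive helper as in the post.
function stripslashes_deep($value) {
    return is_array($value)
        ? array_map('stripslashes_deep', $value)
        : stripslashes($value);
}

// Hypothetical request data, as the user actually typed it.
$raw = array(
    'name' => "O'Reilly",
    'tags' => array('say "hi"', 'C:\\temp'),
);

// Assumption: addslashes() approximates what magic_quotes_gpc did.
$quoted = array(
    'name' => addslashes($raw['name']),
    'tags' => array_map('addslashes', $raw['tags']),
);

// Undo the quoting at every nesting level.
$restored = stripslashes_deep($quoted);

echo $quoted['name'], "\n";    // slashed form
echo $restored['name'], "\n";  // original form
```

The point is only that the helper walks nested arrays, which matters for
form fields named like tags[].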

That's bad. For a feature that was meant to make life easier, magic
quotes sure have caused a lot of problems. I believe they will no
longer be available in PHP 6.

Eh, don't mind the comments.  Sometimes PHP programming can become quite
frustrating.  ;-)


On to the next stage... encoding data for output to an HTML document.

Personally, I prefer using htmlspecialchars() over htmlentities(),
because it only converts those characters that *must* be converted for
HTML ( & < > " ).  There's no use in turning your other 1-byte
characters into 5, 6, or 7-byte strings, if you already provided the
correct character set in the Content-Type HTTP header (as you should!).
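
To make the difference concrete, here is a small comparison.  It is only
a sketch: the sample string is made up, and it assumes the page is served
as UTF-8.

```php
<?php
// "e acute" written as UTF-8 byte escapes so the file encoding does not matter.
$text = "1 < 2 & caf\xC3\xA9";

// Converts only the characters that are special in HTML.
$special = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');

// Also converts every character that has a named entity.
$entities = htmlentities($text, ENT_COMPAT, 'UTF-8');

echo $special, "\n";   // the accented letter is left as its 2 UTF-8 bytes
echo $entities, "\n";  // the accented letter becomes &eacute; (8 bytes)
```

Both results are equally safe to print.  The second is just longer.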

Actually, if you want to get really picky, I usually use the following
conversions:

For most tag parameters: htmlspecialchars($tagdata)

For display text: nl2br(htmlspecialchars($displaytext))
(This keeps newline sequences in effect.)

For text which may contain a few control characters, special characters,
or other binary data (sometimes useful in hidden form fields, or for
special accented characters and non-English languages):
preg_replace('/([\\x00-\\x1F\\x7F-\\xFF])/e','"&#".ord(substr("$1",-1)).";"',htmlspecialchars($binarytext))
(This encodes the data in a mostly still human-readable form, entirely
with 7-bit ASCII characters only.)

For binary data (sometimes useful in hidden form fields):
strtr(base64_encode($binarydata),'+/=','-_.');
(All the advantages of Base64 encoding, without incurring any overhead
from URL encoding when the form is submitted.)
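
The four conversions above could be wrapped up roughly like this.  Treat
it as a sketch under assumptions, not as the original code: the function
names are invented here, ENT_QUOTES is an addition (it also encodes ' so
single-quoted attributes are covered), the data is assumed to be
single-byte ISO-8859-1, and preg_replace_callback() is swapped in for the
/e modifier, which was removed in PHP 7.

```php
<?php
// All function names below are hypothetical.

// For attribute values.
function html_attr($s) {
    return htmlspecialchars($s, ENT_QUOTES, 'ISO-8859-1');
}

// For display text: encode first, then keep line breaks visible.
function html_text($s) {
    return nl2br(htmlspecialchars($s, ENT_QUOTES, 'ISO-8859-1'));
}

// 7-bit output: control bytes and bytes >= 0x7F become numeric references.
// Assumes ISO-8859-1, where the byte value equals the code point.
function html_ascii($s) {
    return preg_replace_callback(
        '/[\x00-\x1F\x7F-\xFF]/',
        function ($m) { return '&#' . ord($m[0]) . ';'; },
        htmlspecialchars($s, ENT_QUOTES, 'ISO-8859-1')
    );
}

// Binary data for a hidden field: Base64 with the three characters that
// URL encoding would expand replaced by ones it leaves alone.
function field_b64_encode($bin) {
    return strtr(base64_encode($bin), '+/=', '-_.');
}

// Reverse of field_b64_encode(); returns false on malformed input.
function field_b64_decode($s) {
    return base64_decode(strtr($s, '-_.', '+/='), true);
}

echo '<input type="hidden" name="blob" value="',
     html_attr(field_b64_encode("\xFB\xFF")), '">', "\n";
```

With multi-byte data such as UTF-8, html_ascii() would encode each byte
separately and produce the wrong characters, so it only fits single-byte
character sets.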


Anyhow, back on track to the original topic of this thread.  For
anything that gets written to a database or used for a query, I suggest
escaping the data using a function specifically designed for that
database.  (And there are many different functions for the many
different popular databases.)  This should have *nothing* to do with
blocking XSS, turning < into &lt;, etc.  Preparing for the database
query string is no place to do the data conversion which will be
necessary for the final output.
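
As a rough illustration, the sketch below stores a hostile-looking string
unchanged, once through a driver-specific escaping call and once through
a prepared statement.  The prepared statement is a different technique
from the escaping functions described above, shown only for comparison.
The assumptions are mine: PDO with the SQLite driver is available, and
the table and column names are made up.

```php
<?php
// Assumption: the pdo_sqlite extension is installed; in-memory database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Raw user input: no HTML encoding happens at this stage.
$input = "Robert'); DROP TABLE comments; -- <b>hi</b>";

// Option 1: escaping done by the database driver itself.
$db->exec('INSERT INTO comments (body) VALUES (' . $db->quote($input) . ')');

// Option 2: a prepared statement; the data never enters the SQL text.
$stmt = $db->prepare('INSERT INTO comments (body) VALUES (?)');
$stmt->execute(array($input));

// Both rows come back byte-for-byte as they were typed.
$rows = $db->query('SELECT body FROM comments ORDER BY id')
           ->fetchAll(PDO::FETCH_COLUMN);

foreach ($rows as $body) {
    // HTML encoding belongs here, at output time.
    echo htmlspecialchars($body, ENT_QUOTES, 'UTF-8'), "\n";
}
```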

I took Chris's advice and now filter for XSS after the info is retrieved
from the database, before sending it to the browser.

The last topic... blocking XSS attacks.  If you use the encoding
routines I listed above for outputting to HTML documents, you're already
safe.  And you're not outlawing any characters either... if someone
wants to type < and >, or show semi-colons or whatever, they can,
knowing with certainty that what they type is exactly what others will
see.  If you need to let users enter some mark-up, do what message
boards and web log sites have been doing for years: BBcode.  Then you
can write your own routines to provide only the features you need, using
a code format that's much stricter than HTML.  This can greatly simplify
your markup code engine too, compared to making a selective HTML filter.
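
A toy version of such an engine might look like the following.  It is a
sketch, not a complete BBcode implementation: the function name and the
three-tag whitelist ([b], [i], [u]) are invented for the example, and
UTF-8 input is assumed.  The idea is to HTML-encode everything first, so
the only real tags in the output are the ones the engine writes itself.

```php
<?php
// Hypothetical minimal BBcode converter.
function bbcode_to_html($text) {
    // Step 1: neutralise all HTML the user typed.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Step 2: turn whitelisted, correctly paired tags into HTML.
    // Repeat until nothing changes so nested pairs are converted too.
    do {
        $html = preg_replace(
            '/\[(b|i|u)\](.*?)\[\/\1\]/s',
            '<$1>$2</$1>',
            $html,
            -1,
            $count
        );
    } while ($count > 0);

    // Step 3: keep the user's line breaks.
    return nl2br($html);
}

echo bbcode_to_html("[b]bold[/b] <script>alert(1)</script>"), "\n";
```

Unknown tags such as [url] are simply left as typed.  Mis-nested input
like [b][i]x[/b][/i] yields mis-nested HTML, which a real engine would
need to handle.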


In any case, here's the data flow (in my wonderful ASCII-art)  ;-)  :

For input:
User input / source data
      \/
Database escaping function
      \/
Assemble database query string

For basic output:
Source data
      \/
HTML encoding algorithm [most likely nl2br(htmlspecialchars())]
      \/
user-agent (ie. site visitor's web browser)

For fancy output:
Source data
      \/
BBcode interpreter engine and HTML tag assembler  <-------->  HTML encoding algorithms
      \/
user-agent (ie. site visitor's web browser)



Follow these guidelines, and your scripts will be 100% binary-safe,
secure from XSS attacks, immune to SQL injection attempts, and very
user-friendly since users have the entire character set available to
them without any constraints.


Thanks. Most of that has already been done now, but I'll certainly
keep your functions handy. I'll likely need them at some point.

Dotan Cohen

http://dotancohen.com/howto/firefox_password_manager.php
http://lyricslist.com/lyrics/artist_albums/228/gordon_nina.html

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

