On 5/20/2010 12:43 PM, Ashley Sheridan wrote:
On Thu, 2010-05-20 at 12:40 -0400, Al wrote:
On 5/20/2010 12:02 PM, Jim Lucas wrote:
Al wrote:
On 5/20/2010 11:23 AM, David Otton wrote:
On 20 May 2010 15:52, Al <news@xxxxxxxxxxxxx> wrote:
I agree blacklisting is a flawed approach in general. My approach is to
strictly confine entry text to a whitelist of benign, acceptable
tags. The
But that's not what you've done. You've blacklisted the following
patterns:
"\<script\x20",
"\<embed\x20",
"\<object\x20",
'language="javascript"',
'type="text/javascript"',
'language="vbscript\"',
'type="text/vbscript"',
'language="vbscript"',
'type="text/tcl"',
"error_reporting\(0\)",//Most hacks I've seen make certain they turn off error reporting
"\<?php",//Here for the heck of it.
and allowed everything else. A couple of examples:
You haven't blacklisted <iframe>
<IMG SRC="javascript:alert('XSS');"> would sail straight through that
list.
I can't tell from that list alone, but are your checks
case-insensitive? Because <ScRipT> would pass through a case-sensitive
check.
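To make the two objections above concrete, here is a minimal sketch. It is not Al's actual code: the pattern list is abridged from the one quoted above, the helper name matchesBlacklist() is made up for this example, and example.com is a placeholder URL. It shows a case-sensitive check missing <ScRipT, and the javascript: image passing the blacklist even with the "i" modifier.

```php
<?php
// Hypothetical helper (not from the thread): returns true if $text matches
// any of the blacklist regex fragments in $patterns.
function matchesBlacklist(string $text, array $patterns, bool $caseInsensitive): bool
{
    $flags = $caseInsensitive ? 'i' : '';
    foreach ($patterns as $pattern) {
        // '~' is used as the delimiter, so it must not occur in the fragments.
        if (preg_match('~' . $pattern . '~' . $flags, $text) === 1) {
            return true;
        }
    }
    return false;
}

// Abridged from the quoted blacklist; \x20 is the regex escape for a space.
$patterns = array('<script\x20', '<embed\x20', '<object\x20', 'type="text/javascript"');

$mixedCase = '<ScRipT src="http://example.com/x.js"></ScRipT>';
$imgAttack = '<IMG SRC="javascript:alert(\'XSS\');">';

var_dump(matchesBlacklist($mixedCase, $patterns, false)); // bool(false): case-sensitive check misses it
var_dump(matchesBlacklist($mixedCase, $patterns, true));  // bool(true): caught with the "i" modifier
var_dump(matchesBlacklist($imgAttack, $patterns, true));  // bool(false): no pattern covers it
```

The third result is the important one: no amount of case folding helps when the attack uses a construct the list never anticipated.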
We can go on like this all day, and at the end of it you still won't
be sure you've blacklisted everything.
The first answer at
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
is related, also.
I'm not being clear. The first pass is thru the blacklist, which effectively
tells a hacker not to bother and deletes the entry outright.
If the raw entry gets past the blacklist, it must then contain only my
whitelist tags. The two examples you cited, for instance, were caught by the
whitelist parser.
What exactly does your whitelist parser do?
It posts an error message that shows the user what the error is [e.g.,
"<iframe> is an invalid tag. Your text cannot be posted until all errors are
corrected."]
Only when the submitted raw text passes both the blacklist and the whitelist
will it be saved and made available for on-the-fly conversion to HTML.
And yes, I'm using preg_match() with the "i" modifier.
Note, my blacklist is not looking for tags per se, just the start of a
bad tag. My users are only supposed to be entering plain text with some
nice highlighting and lists, etc. The editor will not post anything else.
But who says I have to use your editor?
No one says you must use my editor.
Al...
I'm methodically going thru the ha.ckers.org XSS tests and so far my filters
have caught everything.
I greatly appreciate everyone's help.
I think Jim meant how your whitelist operates, not what it does to
the user. Posting a message saying that <iframe> tags are not allowed
sounds more like a blacklist type of behaviour.
A whitelist should consider the data sent from the user as bad, and only
allow it through if it meets certain criteria. By checking specifically
for an <iframe> tag and being able to warn the user specifically, you're
just using a blacklist, not a whitelist.
Thanks,
Ash
http://www.ashleysheridan.co.uk
No, no, it's truly a whitelist. Every tag that is not in the list is treated
as not allowed. If anyone is interested, here is my whitelist. I also use these
for HTML validity and nesting checking, etc. Note, they are listed by HTML type.
Use of <img> and <a> is very constrained. <img> can only point to an image file on
the server, and <a> is checked for syntax and even that it points to a valid URL.
//region******** Usable XHTML elements for user entered raw text [Only these XHTML tags can be used] ********
$inlineHtmlTagsArray = array('a', 'b', 'img', 'em', 'option', 'select', 'span', 'strong',); //Note img is both empty and inline
$blockHtmlTagsArray = array('div', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'pre',);
$emptyHtmlTagsArray = array('br', 'hr', 'img',);
$listHtmlTagsArray = array('li', 'ol', 'ul');
$tableHtmlTagsArray = array('col', 'table', 'tbody', 'td', 'th', 'thead', 'tr',);
//endregion
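As an illustration of the default-deny idea discussed in this thread, here is a minimal sketch of how such arrays could drive a whitelist check. It is an assumption about the approach, not Al's actual parser: the helper name findDisallowedTags() and the regex are invented for this example. Every tag name found in the text is compared against the allowed list, and anything not on the list is reported, whatever it is.

```php
<?php
// Hypothetical helper (not from the thread): returns the lower-cased names of
// all tags in $text that are NOT in $allowedTags. Empty array means "clean".
function findDisallowedTags(string $text, array $allowedTags): array
{
    // Capture the name of every opening or closing tag, e.g. "<IFRAME" or "</p".
    preg_match_all('~</?([a-z][a-z0-9]*)~i', $text, $matches);
    $found = array_unique(array_map('strtolower', $matches[1]));
    // Default deny: whatever is not explicitly allowed is returned as an error.
    return array_values(array_diff($found, $allowedTags));
}

// The five lists from the message above, merged into a single whitelist.
$allowedTags = array_unique(array_merge(
    array('a', 'b', 'img', 'em', 'option', 'select', 'span', 'strong'),
    array('div', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'pre'),
    array('br', 'hr', 'img'),
    array('li', 'ol', 'ul'),
    array('col', 'table', 'tbody', 'td', 'th', 'thead', 'tr')
));

var_dump(findDisallowedTags('<p>Hello <b>world</b></p>', $allowedTags));
var_dump(findDisallowedTags('<IFRAME src="x"></iframe><p>ok</p>', $allowedTags));
```

Note that this only checks tag names. An allowed tag with a hostile attribute, such as an <img> whose src is a javascript: URL, passes this step and has to be caught by separate attribute checks like the <img> and <a> constraints described above. The regex also errs on the side of rejection: plain text such as "x<y" would be reported as an unknown tag.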
--
PHP General Mailing List (http://www.php.net/)