IBM Informix Web DataBlade: Auto-decoding HTML entities

Simon Lodal <simonl@mirrormind.com> · Thu, 11 Apr 2002 17:00:11 +0200

By Simon Lodal, Denmark
Vendor status: Notified months ago. The vendor said they would be working
on updates, but I have heard nothing since.
Software: Web DataBlade 4.12, IDS 9.20/9.21, Linux 2.2/2.4, SunOS 5.7
(OS, IDS and WDB versions seem to be irrelevant).

Impact: A malicious user may insert SQL code in form input and have it
executed, even if the developer has escaped the input properly, because
Web DataBlade actively unescapes the input afterwards. This is serious
since your code *looks* like it escapes the input, so an auditor will not
catch the error unless he actually looks at the data at the point of
execution in the database engine. It is much harder to spot than input
that is not escaped at all, and is therefore likely to exist in a great
deal of WDB code. While this may not strictly be a security bug in
itself, it certainly fools the developer and leads to holes.

Workaround: Run input data through $(WEBUNHTML) twice. The SQL
interpreter will still HTML decode the string once, but the first level
of HTML encoding is then preserved, i.e. the <>"& characters of the
original input stay encoded as entities. The drawback is that once
Informix fixes this, you will need to revert all code to call
$(WEBUNHTML) only once.
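
The effect of the workaround can be modeled outside WDB. The Python
sketch below is only an illustration under stated assumptions, not WDB
code: webunhtml() is a hypothetical stand-in for $(WEBUNHTML) that
encodes the four characters named above, and html.unescape() stands in
for the single decoding pass that WDB applies before the query runs.

```python
import html


def webunhtml(text: str) -> str:
    """Hypothetical stand-in for $(WEBUNHTML): encode & < > " as entities."""
    return (text.replace("&", "&amp;")   # must come first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))


def engine_decode(text: str) -> str:
    """Assumed model of the single, implicit decoding pass."""
    return html.unescape(text)


raw = 'x" || y || "'

once = webunhtml(raw)       # what a careful developer writes today
twice = webunhtml(once)     # the workaround

decoded_once = engine_decode(once)     # the raw input again: unsafe
decoded_twice = engine_decode(twice)   # still singly encoded: safe

print(decoded_once)
print(decoded_twice)
```

In this model a single encoding is undone completely, while the double
encoding survives the decoding pass with one level intact, so no literal
double quote reaches the query.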

-------
Details

HTML encoded strings are automatically decoded when used in SQL
statements. This causes developers to create code that looks fine but
actually contains holes, since the logic is circumvented by WDB.

Any worthy web/database programmer checks all user input before using it
in an SQL query. WDB has a function $(WEBUNHTML) which converts the
characters <>"& to their HTML entities. When a string has been
$(WEBUNHTML)'ed it should thus be safe to use it in an SQL query,
provided that you enclose the string in double quotes (there cannot be
any double quotes inside the string).

But somewhere on the path before the SQL query is executed, the HTML
entities are decoded back into their original characters. I have not
found this documented anywhere, and even if it is documented I would
consider it a bug, since this "feature" certainly breaks the "least
surprise" principle, which is a bad thing to do in security-related
areas.
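
The round trip described above can be sketched in a few lines of
Python. This is a model only, not WDB code: webunhtml() is a
hypothetical stand-in for $(WEBUNHTML), and html.unescape() is assumed
to approximate the implicit decoding, whose exact location inside WDB
is not known.

```python
import html


def webunhtml(text: str) -> str:
    """Hypothetical stand-in for $(WEBUNHTML): encode & < > " as entities."""
    return (text.replace("&", "&amp;")   # must come first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))


user_input = 'say "hi"'

# What the developer (and an auditor reading the page source) sees:
escaped = webunhtml(user_input)

# What is assumed to reach the database engine after implicit decoding:
at_engine = html.unescape(escaped)

print(escaped)
print(at_engine)
```

The escaped value contains no double quote, which is exactly what a
code review would check, yet the value at the engine is identical to
the raw user input.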

Example:

<!-- Make inputstr harmless -->
<?MIVAR NAME=inputstr>$(WEBUNHTML,$inputstr)<?/MIVAR>
<!-- Build query to insert the checked string -->
<?MIVAR NAME=qstr>INSERT into mytable VALUES ("$inputstr")<?/MIVAR>
<!-- Execute query -->
<?MISQL SQL="$qstr"><?/MISQL>

Besides being an example of just how ugly WDB code is, this code looks
correct; it runs $inputstr through the $(WEBUNHTML) function before
inserting. But the query will actually fail if the original $inputstr
contained a double quote, and it can therefore be exploited to execute
other SQL code. The string is HTML decoded again somewhere, that is, the
&quot; is converted back to a real double quote.

At first one may think that all the user can do is to make a query fail,
by inserting just one quote somewhere, and that the attacker would have
to know the exact query in order to subvert it and still have it
succeed. But it is much simpler than that. The webexplode() function
will always be available, and it can be used to execute SQL of the
attacker's choice. Since it returns string data it can simply be
concatenated to other string data, thus executing any SQL, even without
interrupting the original query.

Proof of concept: Given the code above, the malicious user would have to
put something like the following into an "inputstr" field in an HTML
form and submit it:
" || webexplode("<?MISQL SQL='INSERT INTO sysusers VALUES
(...)'><?/MISQL>", NULL) || "

This INSERT query writes to a sensitive part of the database, and
returns nothing at all. The query on the HTML page would therefore
succeed; the injected call contributes nothing to the concatenated
string, so the value the page query inserts is unchanged. And the
attacker does not even have to know the query that is circumvented.
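
To see why the quoting breaks, the proof-of-concept input can be traced
through a small Python model of the example page. This is a sketch
under assumptions, not WDB code: webunhtml() is a hypothetical stand-in
for $(WEBUNHTML), html.unescape() approximates the implicit decoding,
and the elided "(...)" in the payload is kept exactly as written above.

```python
import html


def webunhtml(text: str) -> str:
    """Hypothetical stand-in for $(WEBUNHTML): encode & < > " as entities."""
    return (text.replace("&", "&amp;")   # must come first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))


# The proof-of-concept form input, joined onto one line.
payload = ('" || webexplode("<?MISQL SQL=\'INSERT INTO sysusers VALUES '
           '(...)\'><?/MISQL>", NULL) || "')

# Step 1: the page escapes the input and builds the query string.
safe_looking = webunhtml(payload)
page_query = 'INSERT into mytable VALUES ("' + safe_looking + '")'

# Step 2: assumed implicit decoding before the engine parses the query.
engine_query = html.unescape(page_query)

print(page_query)
print(engine_query)
```

In the page-level string the only double quotes are the two written by
the developer. After decoding, the engine-level string is an empty
string literal, a webexplode() call and another empty string literal
joined with ||, which is why the original INSERT still succeeds.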

Simon Lodal