Hi, On 5/27/05, Ed Finkler <coj@xxxxxxxxxxxxxxxxx> wrote: > The php mysql api has a function "mysql_real_escape_string" that seems > to be able to thwart known SQL injection attacks -- at least the ones of > which I and other people I've discussed this with know. I am curious to > know if pg_escape_string is as effective. If not, what would need to be > modified to make it more effective? I didn't hang around and began to trace mysql.com for a CVS repository. After a five minutes of search, I found that MySQL uses BitKeeper [1]. Now next thing to do is to find where mysql_real_escape_string() [2] is defined in. And the result is: mysys/charset.c [3] [1] http://mysql.bkbits.net:8080/mysql-5.0/ [2] There're two functions defined in PHP for escaping MySQL statements: mysql_escape_string() and mysql_real_escape_string(). When I looked at the MySQL source code, I found that, both of 'em refers to escape_string_for_mysql() function. (As I saw, there's not any escape routine specialized for binary data.) [3] http://mysql.bkbits.net:8080/mysql-5.0/anno/mysys/charset.c@xxxxx I found below replacements in escape_string_for_mysql() function: /* {{{ Code snippet */ #ifdef USE_MB /* If the next character appears to begin a multi-byte character, we escape that first byte of that apparent multi-byte character. (The character just looks like a multi-byte character -- if it were actually a multi-byte character, it would have been passed through in the test above.) Without this check, we can create a problem by converting an invalid multi-byte character into a valid one. For example, 0xbf27 is not a valid GBK character, but 0xbf5c is. (0x27 = ', 0x5c = \) */ if (use_mb_flag && (l= my_mbcharlen(charset_info, *from)) > 1) { *to++= '\\'; *to++= *from; continue; } /* [GBK: Encoding standard for Simplified Chinese, used in the People's Republic of China and in Singapore.] */ #endif switch (*from) { case 0: /* Must be escaped for 'mysql' */ *to++= '\\'; *to++= '0'; break; case '\n': /* Must be escaped for logs */ *to++= '\\'; *to++= 'n'; break; case '\r': *to++= '\\'; *to++= 'r'; break; case '\\': *to++= '\\'; *to++= '\\'; break; case '\'': *to++= '\\'; *to++= '\''; break; case '"': /* Better safe than sorry */ *to++= '\\'; *to++= '"'; break; case '\032': /* This gives problems on Win32 */ *to++= '\\'; *to++= 'Z'; break; default: *to++= *from; } /* }}} */ As I saw, MySQL follows different methods to escape input data. But AFAIC, the method followed depends a little bit on the software architecture, like handling literals from pointers. Thus, it shouldn't be an objective comparision when we just look at "which one escapes which char". Anyway, I hope above escaping differences helps folks. I also want to add something about using parameters instead of escaping. There're lots of advantages of using parameters when compared to escaping: o. You just pass the value and don't need to worry about any SQL-Injection attack. o. When you try to escape a data, CPU usage increases with linear proportional to data length. (Try escaping an MP3 file.) o. As far as I understand from the length of this post's thread, you can never be sure about the safety of any escaping method. But as Ed Finkler underlined, most of the programmers prefer escaping while sending SQL statements. So "parameter usage is more secure" shouldn't be an answer for "are our escaping routines enough" question. Regards.