>> Re-read his example. He encodes the data in PHP. But decodes the data in
>> SQL. So, if you echo the SQL statement, you would see a base64 encoded
>> string that SQL then decodes.

Got it this time! Up until reading your reply, I was reading Alex's example with my pseudo-code glasses on. I did not realize that the decoding was being done by SQL! I thought it was still in PHP. That's where I got confused, asked "hey, why not just cast it to string then?", and ended up in the "what's the difference?" situation. But you were laser sharp on that! Thanks a bunch!

>> as to the other issue, the one with utf-8 and mb_detect_encoding not
>> working for it - cause there are ways of getting around it.

I still don't get it. First question that comes to mind: why the heck use mb_detect_encoding then, if it can be hacked around? See what I'm saying? But I don't want to go off on a tangent...

All I'm trying to do is protect myself from a possible SQL injection by using the available filters, sanitizations and techniques, but without PDO. That's the requirement. No PDO. From the earlier recommendations, I understand PDO is the way to go, because it effectively separates the SQL code from the user input to make sure the user input does not get executed. That explanation I get, no problems there. Yes, do use PDO. But my question is not "what's the safest way in general?" but rather "what's the safest way without PDO?"

Without PDO, it seems like b64'ing it will do the job! And since the data will be stored as clear text, searches against that data will work too.
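
For illustration only, here is a minimal sketch of the encode-on-the-client, decode-in-SQL pattern from Alex's pseudo code. It is not Alex's code, and it uses Python with an in-memory SQLite database purely so that it runs anywhere. Several things are assumptions: SQLite has no built-in base64 decoder, so a function named b64d is registered from the client side (on MySQL 5.6 and later the built-in FROM_BASE64() should be able to play that role); the helper names insert_row and find_by_str are hypothetical; and the second hostile string is just a sample value.

```python
import base64
import sqlite3


def b64e(value):
    """Client-side encoder: any value becomes base64 ASCII text."""
    return base64.b64encode(str(value).encode("utf-8")).decode("ascii")


def b64d(encoded):
    """Decoder that runs inside the SQL statement and returns a plain string."""
    return base64.b64decode(encoded).decode("utf-8")


conn = sqlite3.connect(":memory:")
# Assumption: expose the decoder to SQL under the pseudo code's name, b64d.
conn.create_function("b64d", 1, b64d)
conn.execute("create table foo (num text, str text)")


def insert_row(num, text):
    """Build the insert by concatenation; only base64 text is spliced in."""
    sql = ("insert into foo (num, str) values (b64d('" + b64e(num)
           + "'), b64d('" + b64e(text) + "'))")
    conn.execute(sql)
    return sql


def find_by_str(text):
    """Exact-match search; the search term is decoded by SQL, never parsed."""
    sql = "select num, str from foo where str = b64d('" + b64e(text) + "')"
    return conn.execute(sql).fetchall()


# One harmless row and one row made entirely of hostile input.
insert_row(123456, "some searchable text")
insert_row("') union select * from foo --", "x'); drop table foo; --")
```

The point of the sketch is that only characters from the base64 alphabet (A-Z, a-z, 0-9, +, / and =) ever reach the SQL text, so there is no quote character to break out of, while the table itself still holds clear text that can be searched.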

I can take this implementation and build my library function on it, instead of making it:

1- first check whether the user's input string is in utf-8,
2- reject the input if it is not in utf-8,
3- accept the input if it is utf-8 and apply the applicable filters to it, starting with filter_sanitize_string,
4- and on top of that, also mysql_real_escape_string it.

But from what I understand, you guys are saying just don't do this, because it may be overcome. And that's not because filter_sanitize_string or mysql_real_escape_string is ineffective, but because there is NO WAY to reliably detect whether the incoming user input is in utf-8 or not.

On Thu, Jan 26, 2012 at 9:14 AM, Jim Lucas <lists@xxxxxxxxx> wrote:
> On 01/26/2012 06:46 AM, Haluk Karamete wrote:
>>
>> when we do b64e and then back b64d, you are saying. we get the org
>> input all as clear text but this time as a string. because it is now a
>> string, "(which by definition can not be executed)"
>>
>> what's the difference between b64e+b64d vs (string) casting then? if
>> you were to cast the original input into string using (string),
>> wouldn't you be in the same shoes?
>
>
> Re-read his example. He encodes the data in PHP. But decodes the data in
> SQL. So, if you echo the SQL statement, you would see a base64 encoded
> string that SQL then decodes.
>
>
>>
>> also on another note, if you know the userinput is in UTF-8, ( you
>> verify that by running mb_detect_encoding($str, 'UTF-8', true); ), is
>> there a situation where you think mysql_real_escape_string would fail
>> against SQL injection for string based user input ? The reason I ask
>> this specifically for strings is because it is fairly easy to
>> validate against integers, floats, booleans using the built in
>> validation filters.... my biggest issue is on strings...
>>
>> also what do you think about filter_sanitize_string.
>
>
> read this:
>
> http://www.php.net/manual/en/filter.filters.sanitize.php
>
> Then read this:
>
> http://www.php.net/manual/en/filter.filters.flags.php
>
> It seems to me that filter_sanitize_string does not deal with anything other
> than ASCII.
>
> YMMV
>
>
>>
>> and finally, where do you think PHP community plus Rasmus is having a
>> hard time implementing what you have in mind - that is a one liner
>> that will do the inline string interpolation you are talking about..
>> what's the issue that it hasn't been done before?
>>
>>
>>
>> On Tue, Jan 24, 2012 at 1:45 PM, Alex Nikitin<niksoft@xxxxxxxxx> wrote:
>>>
>>> You don't need to store it in the database as b64, just undo the
>>> encoding into your inputs
>>>
>>> for the purpose of the explanation, this is language independent
>>>
>>> b64e - encoding function
>>> b64d - decoding function
>>>
>>>
>>> pseudo code
>>>
>>> given:
>>> bad_num = ') union select * from foo --'
>>> bad_str = ""
>>> good_num = 123456
>>> good_str = "some searchable text"
>>>
>>> the b64 way:
>>> bad_num=b64e(bad_num)
>>> ...
>>> good_str=b64e(good_str)
>>>
>>>
>>> inserts:
>>> query("insert into foo (num, str) values (b64d(\""+bad_num+"\"),
>>> b64d(\""+bad_str+"\"))");
>>> query("insert into foo (num, str) values (b64d(\""+good_num+"\"),
>>> b64d(\""+good_str+"\"))");
>>>
>>> Can you see that this will safely insert clear text into the database?
>>> This is because when you convert anything from b64, it will return
>>> from the function as a string and will not be executed as code...
>>>
>>>
>>> Now let's try a search:
>>> bad_num= '1 or 2 not like 5'
>>> bad_str = "' or \"40oz\" like \"40oz\""
>>>
>>> again we:
>>> bad_num=b64e(bad_num)
>>> bad_str=b64e(bad_str)
>>>
>>> then we can do a full text search:
>>> query("select * from foo where match(str)
>>> against(b64d(\""+bad_str+"\"))")
>>> or even a number search
>>> query("select * from foo where num=b64d(\""+bad_num+"\")")
>>>
>>> again this is possible because no matter what you put in bad num, it
>>> will never be able to make post b64e bad_num look like code, just
>>> looks like junk, until b64d converts it to a string (which by
>>> definition can not be executed)
>>>
>>> make sense now?
>>>
>>>
>>> by check i mean, run utf8_decode for example...
>>>
>>>
>>> Problem is, that i can tell you how to write the most secure code, but
>>> if it's hard, or worse yet creates more problems than it solves
>>> (seemingly), nobody other than a few individuals with some passion for
>>> security will ever find the code useful. We need to fix this on the
>>> language level, then we can go around and tell programmers how to do
>>> it right. I mean imagine telling a programmer, that something that
>>> takes them 2 lines of code now, can be done much more securely in 5-7,
>>> and it creates code that doesn't read linearly... Most programmers
>>> will just ignore you. I want to say, "hey programmer, what you do in 2
>>> lines of code, you can do in 1 and make it impossible to inject into",
>>> then, then people will listen, maybe... This is where inline string
>>> interpolation syntax comes in, but it is not implemented in any
>>> programming languages, sadly actually. This is what i want to talk to
>>> Rasmus about.
>>
>>
>
>
> --
> Jim Lucas
>
> http://www.cmsws.com/
> http://www.cmsws.com/examples/
> http://www.bendsource.com/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php