ljb wrote: > gwchamb@xxxxxxxxx wrote: >> Apache 2.2.3, PostgreSQL 8.2.1, PHP 5.1.6, Linux >> >> I have inserted (via pg_query_params) into a bytea field some binary >> data (a JPEG image in this case) which I have escaped using >> pg_escape_bytea. It appears, however, that the extracted data is >> corrupt (NOT unescaped, more precisely), even after unescaping it with >> pg_unescape_bytea. If I perform another (a subsequent) >> pg_unescape_bytea, it appears to be partially unescaped, but there >> still remain errors because the rest of the image is severely >> distorted -- but minimally recognizeable as part of the original >> image. What am I missing? I'm using the lo_* functions as an >> alternative, but it's hard to dismiss the ease with which it appears >> to deal with binary data with a bytea field. > > Interesting problem. ... > The escaping done by pg_escape_bytea() is wrong for parameterized queries. > With a binary-mode query parameter (which pg_query_params() can't do > anyway), you want no escaping at all. With a text-mode parameter (as > pg_query_params() does), you need to escape for only the bytea-input > parsing, not the SQL parsing. So for example if your data has a byte with > value 1, you need to pass that as the 4 characters: \001. > pg_escape_bytea() returns that as the 5 characters: \\001 (unless the new > 'standard conforming strings' is on), so it won't work. Nor can I think of > another PHP escaping function that does work here. Isn't it possible to partly undo the effect of pg_escape_bytea()? Something like this: <?php /* some binary data */ $data = implode("", range("\000", "\377")); echo "pg_escape_bytea():\n"; echo pg_escape_bytea($data), "\n\n"; echo "pg_escape_bytea(), corrected:\n"; echo str_replace("\\\\", "\\", pg_escape_bytea($data)), "\n\n"; ?> Here's the result: $ php -f test.php | fold -w 70 pg_escape_bytea(): \\000\\001\\002\\003\\004\\005\\006\\007\\010\\011\\012\\013\\014\\015 \\016\\017\\020\\021\\022\\023\\024\\025\\026\\027\\030\\031\\032\\033 \\034\\035\\036\\037 !"#$%&''()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOP QRSTUVWXYZ[\\\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\\177\\200\\201\\202\ \203\\204\\205\\206\\207\\210\\211\\212\\213\\214\\215\\216\\217\\220\ \221\\222\\223\\224\\225\\226\\227\\230\\231\\232\\233\\234\\235\\236\ \237\\240\\241\\242\\243\\244\\245\\246\\247\\250\\251\\252\\253\\254\ \255\\256\\257\\260\\261\\262\\263\\264\\265\\266\\267\\270\\271\\272\ \273\\274\\275\\276\\277\\300\\301\\302\\303\\304\\305\\306\\307\\310\ \311\\312\\313\\314\\315\\316\\317\\320\\321\\322\\323\\324\\325\\326\ \327\\330\\331\\332\\333\\334\\335\\336\\337\\340\\341\\342\\343\\344\ \345\\346\\347\\350\\351\\352\\353\\354\\355\\356\\357\\360\\361\\362\ \363\\364\\365\\366\\367\\370\\371\\372\\373\\374\\375\\376\\377 pg_escape_bytea(), corrected: \000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\0 21\022\023\024\025\026\027\030\031\032\033\034\035\036\037 !"#$%&''()* +,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmno pqrstuvwxyz{|}~\177\200\201\202\203\204\205\206\207\210\211\212\213\21 4\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\ 236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\25 7\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\ 301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\32 2\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\ 344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\36 5\366\367\370\371\372\373\374\375\376\377 Isn't the latter ready for pg_query_params()? The only doubt I have is about the double '. Is it needed for bytea parsing or sql parsing? You may have to replace it with a single ', if it's only for SQL: str_replace(array("\\\\", "''"), array("\\", "'"), pg_escape_bytea($data)), "\n\n"; See http://www.postgresql.org/docs/8.2/interactive/datatype-binary.html. The man page is not clear, the single quote is listed among the escaped chars, but from the examples is seems that only SQL escaping is performed. Compare with \: SELECT E'\\\\'::bytea; -- after SQL parsing, you get \\ SELECT E'\''::bytea; -- after SQL parsing, you get ' it seems that the bytea parser is feeded with a double slash, but a single quote. If so, the octal value 39 should be removed from the table in the manual page, since it requires no escaping at bytea level. pg_escape_bytea() seems to confirm this, it returns '' for ' but \\\\ for \. \ is escaped twice, ' only once. Unfortunately I can't try it now with pg_query_params(), but I bet on the second form... well no, maybe I can, yes, here we go: <?php $data = implode("", range("\000", "\377")); $data_escaped = str_replace( array("\\\\", "''"), array("\\", "'"), pg_escape_bytea($data)); /* fill in the conn string, if required */ $db = pg_connect(""); pg_query($db, "CREATE TEMP TABLE atest (data bytea);"); pg_query_params($db, "INSERT INTO atest VALUES ($1)", array($data_escaped)); $res = pg_query($db, "SELECT data FROM atest"); $data2 = pg_unescape_bytea(pg_fetch_result($res, 0)); echo ($data2 == $data) ? "OK!" : "PANIC!", "\n"; ?> $ php -f test.php OK! Seems to work... .TM.