On Mon, Aug 1, 2011 at 9:42 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 1 Aug 2011, Tommi Virtanen wrote:
>> On Mon, Aug 1, 2011 at 09:24, Tommi Virtanen
>> <tommi.virtanen@xxxxxxxxxxxxx> wrote:
>> > We've talked about generating/parsing JSON a few times, and how we've
>> > run into edge cases whenever we've rolled our own functions for that.
>> > I've mentioned this C library a few times, but I'm not sure if I've
>> > actually sent the link to anyone.. Here's a C library for
>> > generating/parsing JSON, written by an ex-cow-orker of mine.
>>
>> D'oh!
>>
>> https://github.com/akheron/jansson
>
> Ah, yeah. The problem with the current code is it's assuming the
> string to dump is ASCII. We need to do this:
>
> https://github.com/akheron/jansson/blob/master/src/dump.c#L67
>

That piece of code only takes effect if you pass in 1 for "ascii".
That activates a special mode in the Jansson library where it mashes
all non-ASCII UTF-8 into ASCII with escape sequences. However, you do
not need to activate this mode, and you probably shouldn't, because as
RFC 4627 says ( http://www.ietf.org/rfc/rfc4627.txt ):

"JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."

JSON's \u escape only gives you four hex digits. As you may know, a
single four-digit escape cannot represent an arbitrary Unicode
character; anything outside the Basic Multilingual Plane needs a pair
of them (a UTF-16 surrogate pair). This is unfortunate, but it doesn't
really matter, because the only characters you NEED to escape are the
quote, the backslash, and control characters. See this parse table for
JSON:
http://www.json.org/string.gif

So basically, there is no problem with using UTF-8 in JSON.

We already have a function to do the necessary escaping of slash,
quote, and so on, in rgw_escape.c. I wrote it when fixing bug #939:
http://tracker.newdream.net/issues/939.

It lives here:
http://ceph.newdream.net/git/?p=ceph.git;a=blob;f=src/rgw/rgw_escape.c;h=aa19720f43e75e4ebb6db4f88993e8c4830421f8;hb=master

And yes, it has unit tests.
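
To make the escaping rule concrete, here is a minimal sketch in C. It
is not the code in rgw_escape.c; the function name json_escape_utf8 and
its signature are made up for illustration. It escapes the quote, the
backslash, and control characters, and copies every byte >= 0x80
through untouched, so UTF-8 input stays UTF-8 on output:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only, not the rgw_escape.c API).
 * Writes the JSON-escaped form of the NUL-terminated UTF-8 string `in`
 * into `out`, without the surrounding quotes.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * `out` is too small.
 */
static int json_escape_utf8(const char *in, char *out, size_t outlen)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);

    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        char buf[7]; /* longest escape is "\u00XX": 6 bytes + NUL */
        size_t n;

        switch (*p) {
        case '"':  strcpy(buf, "\\\""); break;
        case '\\': strcpy(buf, "\\\\"); break;
        case '\n': strcpy(buf, "\\n");  break;
        case '\r': strcpy(buf, "\\r");  break;
        case '\t': strcpy(buf, "\\t");  break;
        default:
            if (*p < 0x20) {
                /* remaining control characters get the \u form */
                snprintf(buf, sizeof(buf), "\\u%04x", (unsigned)*p);
            } else {
                /* printable ASCII and all UTF-8 bytes pass through */
                buf[0] = (char)*p;
                buf[1] = '\0';
            }
            break;
        }

        n = strlen(buf);
        if (o + n + 1 > outlen) /* always keep room for the NUL */
            return -1;
        memcpy(out + o, buf, n);
        o += n;
    }

    if (o + 1 > outlen)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

With this sketch, the two-byte UTF-8 sequence for "é" comes out
unchanged, while a raw newline becomes the two characters \n. Working
byte by byte is safe here because every byte of a multi-byte UTF-8
sequence is >= 0x80, so none can be mistaken for a quote, a backslash,
or a control character.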
:)

cheers,
Colin