On Sun, Jul 02, 2006 at 12:25:43PM -0400, Agent M wrote:
> On Jul 2, 2006, at 6:13 AM, Martijn van Oosterhout wrote:
> >But I don't think anyone is actually considering importing ICU into the
> >postgres source tree, are they?
> Why not?

Because it's a project of similar size to postgres and probably nearly as old, and I don't think anyone here actually wants to maintain it. I mean, we could incorporate the source for readline, openssl, kerberos, the C library, but why? That project has maintainers already and we only want to use it, not fork it.

> >If you drop the conversion stuff (because postgres already has that)
> >you're down to about 4MB.
> Why would you drop the ICU transcoding support instead of the existing
> postgres functions? Why the duplicated effort?

Because we would want to be bug-for-bug compatible with previous releases. I suppose it would be possible if someone checked that the end result is the same.

> Certain Japanese characters cannot make a reliable round-trip through
> Unicode. ICU uses UTF-16 as its store, so the Japanese folks won't be
> happy with an ICU-only solution. However, it would still be of great
> benefit to allow ICU to handle as much as possible, leaving the string
> encodings to the encoding experts.

We don't need a round-trip through Unicode, since we're only doing one-way conversions for the purpose of collation. BTW, this site seems to have a good discussion of Japanese characters and Unicode.

http://www.jbrowse.com/text/unij.html

> At the very least, it would be great to have ICU to handle encoding on
> a per-column basis (perhaps extending the text datatype with encoding
> info). Perhaps this would be a decent stopgap solution? The backend
> protocol would also need a version bump- currently, it converts all
> strings to a single encoding.

That's called SQL COLLATE support and that's an order of magnitude harder than adding support for ICU. See previous discussion on -hackers.

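To illustrate the one-way conversion point made earlier, here is a rough sketch, not anything from the postgres tree. The function names are hypothetical, and plain code point order stands in for a real ICU collator. The stored bytes are decoded to Unicode only to build a comparison key and are never converted back, so round-trip fidelity doesn't come into it.

```python
def collation_key(stored: bytes, encoding: str) -> str:
    """One-way conversion: decode to Unicode only to compare.

    Assumption: code point order stands in for a real collator here.
    """
    return stored.decode(encoding)


def sort_stored(values, encoding):
    # Sort the original byte strings; the Unicode form is only the key
    # and is thrown away afterwards.
    return sorted(values, key=lambda v: collation_key(v, encoding))


rows = [s.encode("euc_jp") for s in ("ねこ", "いぬ", "あり")]
ordered = sort_stored(rows, "euc_jp")
```

What comes out is still the untouched EUC-JP bytes, just in a different order.
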
Have a nice day,
--
Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.