On Wed, 2006-12-06 at 13:50 +0100, Mark Wielaard wrote:
> Hi Mario,

Hi Mark!

Thank you for the quick reply!

> But in the old code the decomposing is done, although not really in the
> classpath case, only in the case of libgcj. Could you explain the
> difference between classpath/libgcj here and how this actually helps us?
> Can't we use the libgcj version?

Yes, in the case of classpath it is not done at all. In the case of gcj it is done, but it is broken. I'm not sure how badly broken it is. It may be just a matter of wrong data (gcj uses UCD 3.0, JDK 1.5 uses UCD 4.0.0). The two are not that different, just a few additions, so I guess this is only part of the problem. The method used by gcj is not complete, that much is sure.

The javadoc says that the following rules are defined:

* NO_DECOMPOSITION
  jdk: accented characters will not be decomposed for collation.
  gcj: no decomposition is performed at all.

* CANONICAL_DECOMPOSITION
  jdk: characters that are canonical variants according to the Unicode standard will be decomposed for collation. Used for accented characters.
  gcj: reads the values from the canonical_decomposition array and uses this array to calculate the decomposition.

* FULL_DECOMPOSITION
  jdk: Unicode canonical variants and Unicode compatibility variants will be decomposed for collation.
  gcj: does the same as before, using a different array.

The last mode should be the "Compatibility decomposition" named in the Unicode Standard, if I'm not wrong.

What is clear to me is that we are doing the wrong thing here, as this class and these methods are more complex than what we have (and I fear another DecimalFormat...).

> It doesn't really add or remove functionality it seems. How is the user
> better off with this version than they were with the old one?

Actually yes, it is just so that we can say we have 1.3 complete... it is of no use at all as is.

> If it helps you structure the code in a way that makes improving it better please do
> go for it.

This is the reason.
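As an aside, the three modes listed above can be tried out through the public API with something like the sketch below. It is only an illustration under a few assumptions: it needs a JDK that ships java.text.Normalizer (public since Java 6, so newer than the 1.5 behaviour discussed here), and the class name DecompositionSketch and its helper methods are made up for this example; they are not part of classpath or gcj. The Normalizer calls show what canonical (NFD) and compatibility (NFKD) decomposition do to a string, and the loop prints what a Collator returns under each mode without assuming any particular result.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

public class DecompositionSketch {

    /** Canonical decomposition (NFD): splits a precomposed accented character into base + combining mark. */
    static String canonical(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** Compatibility decomposition (NFKD): additionally expands compatibility variants such as ligatures. */
    static String full(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    /** Hypothetical helper: a Locale.US collator configured with the given decomposition mode. */
    static Collator collatorWith(int mode) {
        Collator c = Collator.getInstance(Locale.US);
        c.setDecomposition(mode);
        return c;
    }

    public static void main(String[] args) {
        String precomposed = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE, one code point
        String combining = "e\u0301";  // 'e' followed by COMBINING ACUTE ACCENT
        String ligature = "\ufb01";    // LATIN SMALL LIGATURE FI

        // Canonical decomposition turns the precomposed form into the combining form.
        System.out.println(canonical(precomposed).equals(combining)); // true
        // The ligature has only a compatibility mapping, so NFD leaves it alone...
        System.out.println(canonical(ligature).equals(ligature));     // true
        // ...while NFKD expands it to the two letters "fi".
        System.out.println(full(ligature).equals("fi"));              // true

        // Print what the collator reports under each mode; results depend on the implementation.
        int[] modes = {
            Collator.NO_DECOMPOSITION,
            Collator.CANONICAL_DECOMPOSITION,
            Collator.FULL_DECOMPOSITION
        };
        for (int mode : modes) {
            Collator c = collatorWith(mode);
            System.out.println("mode " + mode
                + ": accented forms -> " + c.compare(precomposed, combining)
                + ", ligature vs fi -> " + c.compare(ligature, "fi"));
        }
    }
}
```

Running the same program against different runtimes and comparing the last three lines of output is one cheap way to see where the implementations disagree.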
It makes sense to have all this functionality in one place, as it is related to just this class. Unless, of course, on reading the code more closely and understanding it better, I find that even Collator and RuleBasedCollator are wrong (I have no reason to think so now, but I also know that this area is a bit murky, the javadoc does not help, and there are no effective tests in mauve).

> But if the functionality doesn't really change I am not sure

True, and the drawback is that we fool users into thinking that we have implemented this functionality.

I think I'll do as with DecimalFormat: I will keep a local branch until all the functionality is in place and then submit it for review. The Unicode standard is well documented, I "only" have to find out how it is implemented in the jdk.

> Mark

Ciao,
Mario
-- 
Lima Software, SO.PR.IND. s.r.l.
http://www.limasoftware.net/

pgp key: http://subkeys.pgp.net/

Please, support open standards:
http://opendocumentfellowship.org/petition/
http://www.nosoftwarepatents.com/