Regex compatibility

Hi Michael,

On Thu, 2006-01-26 at 14:32 +0000, Michael Kay wrote:
> I'm finding most things are working pretty well first time; the
> discrepancies I'm seeing in my test suite results are primarily:
> 
> * numeric precision and formatting
> * URI handling
> * collation support
> * regular expressions
> 
> (which is pretty much what I would have expected).

Please do report all issues you find, even if you decide to work around
them for the time being. We try to be pretty responsive to bug reports,
and even when we cannot fix something immediately, having it in our bug
database helps others who hit similar issues.

Alternatively, please explain how to run the test suite and people can
try with different GNU Classpath based runtimes to compare the results.

> The topic here is regular expressions, where I am seeing a lot of tests
> fail. Is there any information available as to the precise specification of
> the regular expressions supported by Gnu Classpath, and/or the differences
> from the Sun JRE?

The regular expression support was originally based on GNU Regexp, which
had a slightly different regular expression syntax, as described at:
http://www.cacas.org/java/gnu/regexp/syntax.html

Ito has been working hard to get the syntax compatible with the one
expected from java.util.regex. Most of his fixes are very recent though,
and were added after the 0.20 snapshot. If possible, please try out a
more recent build, as can be found in CVS or from the builder distcheck:
http://savannah.gnu.org/cvs/?group=classpath
http://builder.classpath.org/dist/
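
To compare runtimes quickly, something like the following small probe
could be compiled once and run on each of them. This is only a sketch:
the class name RegexProbe and the handful of patterns are made up for
illustration and are not taken from the Saxon test suite or from our own
tests. It uses nothing beyond java.util.regex.Pattern and
PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of GNU Classpath or Saxon.
final class RegexProbe {

    // Returns "ok", "MISMATCH" or "SYNTAX ERROR" for a single probe.
    static String probe(String regex, String input, boolean expected) {
        try {
            boolean actual = Pattern.compile(regex).matcher(input).matches();
            return actual == expected ? "ok" : "MISMATCH";
        } catch (PatternSyntaxException e) {
            // The runtime rejected the pattern outright.
            return "SYNTAX ERROR";
        }
    }

    public static void main(String[] args) {
        // Illustrative cases only: { pattern, input, expected matches() result }.
        String[][] cases = {
            {"\\p{Lu}\\p{Ll}+", "Saxon", "true"},      // Unicode categories
            {"[a-z&&[^aeiou]]+", "xyz", "true"},       // class intersection
            {"(?i)classpath", "ClassPath", "true"},    // inline flag
            {"a{2,3}?b", "aab", "true"},               // reluctant quantifier
            {"\\d+", "12a", "false"},                  // whole-input match must fail
        };
        System.out.println(System.getProperty("java.vm.name")
                + " " + System.getProperty("java.version"));
        for (String[] c : cases) {
            String result = probe(c[0], c[1], Boolean.parseBoolean(c[2]));
            System.out.println(result + "\t" + c[0]);
        }
    }
}
```

Any line that does not start with "ok" on one runtime but does on
another is a good candidate for a bug report.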

I see we are mislabeling the 0.21-pre as 0.20 on builder. I'll fix that
in a minute.

> Saxon supports the regular expressions of XML Schema and XPath 2.0 by
> translating them into the regular expression dialect supported by Java (the
> translator to do this was originally developed by James Clark). It's quite
> possible that this produces some rather unusual regex patterns as a result.
> It's not the end of the world if the Gnu dialect is slightly different,
> because I can modify the translator to cope. But I do need to know what
> specification I am working to!
> 
> Alternatively I can target the regex library on the .NET platform.

That is a cool alternative that ikvm provides. You can also look at
other regular expression libraries as a workaround for now. A (somewhat
dated) comparison can be found in:
http://www.oreilly.com/catalog/regex2/chapter/ch08.pdf

But please do keep bug reports coming. We want to make GNU Classpath as
easy to use and as compatible as possible, so you don't have to add too
many (preferably none!) workarounds when porting.

Cheers,

Mark

-- 
Escape the Java Trap with GNU Classpath!
http://www.gnu.org/philosophy/java-trap.html

Join the community at http://planet.classpath.org/