Mike MacCana wrote:
They (meaning engineers at Red Hat) are discussing this. The solution
won't use Lucene, as Lucene treats all file content as equal, i.e. it
doesn't know about headings being different from body text and so on.
Mike
Also, Lucene suffers from the Java UTF-16 scandal: they chose a 16-bit
character encoding (originally UCS-2) which is good for Japanese, but
bulks up mostly-ASCII European text by a factor of two compared with
UTF-8, and 16 bits don't cover enough characters to do a good job with
Chinese unless you resort to surrogate pairs.
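
To put rough numbers on the encoding point, here is a minimal sketch that compares encoded sizes in UTF-8 and UTF-16. It uses Python rather than Java purely for brevity; the sample strings and the helper name encoded_sizes are made up for illustration and have nothing to do with Lucene or Xapian internals.

```python
# Compare how many bytes the same text needs in UTF-8 and in UTF-16.
# The sample strings are illustrative assumptions, not data from either engine.
samples = {
    "English": "search engine",
    "Japanese": "\u691c\u7d22",            # two kanji, both in the 16-bit range
    "CJK Extension B": "\U00020000",      # U+20000, outside the 16-bit range
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{label}: {len(text)} code points, "
          f"{utf8} bytes in UTF-8, {utf16} bytes in UTF-16")
```

Plain ASCII doubles in size under UTF-16, the kanji come out smaller than in UTF-8, and the character beyond U+FFFF needs a surrogate pair, so it takes four bytes either way.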
Because of this, Lucene loses a factor of two in performance
compared to C++ competitors such as Xapian, which is a minus for those
who care about performance on computers that aren't monster servers with
8 gigs of RAM and Ultra320 SCSI disks. (Funnily enough, we're not all
that happy with Lucene performance on such a machine... But we've got a
lot of text...)
--
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/fedora-devel-list