Re: UTF-8 support in PCRE




Please see my replies inline below.

On Fri, Jul 4, 2008 at 5:29 AM, Ralph Angenendt <ra+centos@xxxxxxxxxxxx> wrote:
Amitava Shee wrote:
> How do I get utf-8 support with PCRE?
>
> I am having problems building lucene index using Zend_Lucene. I get the
> following error
>
>
> PHP Notice:  iconv(): Detected an illegal character in input string in
> /var/www/ZendFramework-1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php
> on line 56

a) What does that have to do with pcre? (which can do UTF-8)
 
[Shee] The Zend Lucene search engine uses PCRE and requires PCRE to be compiled with --enable-utf8. Please see http://framework.zend.com/manual/en/zend.search.lucene.charset.html#zend.search.lucene.charset.utf_analyzer

UTF-8 support can either be compiled into PCRE at build time or provided via a shared library, but whether the shared library support is included depends on the distro. I believe upstream Red Hat does not include it, and I was hoping to find a way to get it in CentOS. I have no idea whether other distros support it; that's a research item for me.


b) What is on line 56 in that file? Looks like iconv is choking on that.
[Shee] That is framework code; I don't know much about it.


So try to process that file with iconv on the command line.
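
For anyone following along: the usual command-line form of this check is to run the file through iconv from UTF-8 to UTF-8 and see where it stops. If iconv is not handy, the same idea can be sketched in Python. This is only an illustration of the check, not what Zend_Search_Lucene does internally; the function names find_invalid_utf8 and check_file are made up for this example, and it assumes the input is supposed to be UTF-8.

```python
def find_invalid_utf8(data: bytes):
    """Return (byte_offset, reason) for the first invalid UTF-8 sequence,
    or None if the whole input decodes cleanly.

    Hypothetical helper for illustration; assumes the data is meant to be UTF-8.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        return exc.start, exc.reason
    return None


def check_file(path):
    """Read a file as raw bytes and report the first invalid UTF-8 sequence."""
    with open(path, "rb") as fh:
        return find_invalid_utf8(fh.read())


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        result = check_file(name)
        if result is None:
            print(f"{name}: valid UTF-8")
        else:
            offset, reason = result
            print(f"{name}: invalid byte at offset {offset} ({reason})")
```

Run it against the documents being indexed rather than the framework source. A reported offset usually means the file is in some other encoding (Latin-1 is a common one), in which case it needs converting before indexing.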

Ralph

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos


