Re: Using preg_match to find Japanese text

On Sat, August 5, 2006 9:06 pm, Dave M G wrote:
> While I'm only just learning about regular expressions in another
> thread, I still seem to be finding exceptional situations which have me
> questioning the extent to which preg expressions can be implemented.
>
> (The following contains UTF-8 encoded Japanese text. Apologies if it
> comes out as ASCII gibberish.)
>
> What I have are sentences that look like this:
> 気温 【きおん】 (n) atmospheric temperature; (P); EP
> について (exp) concerning; along; under; per; KD

Can you be sure that '(' will not appear in the Japanese part?

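// Lazy (.*?) stops at the first '(' so a later "(P)" stays in the definition;
// /u treats the pattern and subject as UTF-8.
if (preg_match('/^(.*?)\s*(\(.*)$/u', $text, $parts)) {
    echo "Japanese: $parts[1]<br />\n";
    echo "Definition: $parts[2]<br />\n";
}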

Then you could break apart the Japanese part based on whether or not it
contains the delimiters for the "reading" -- they looked kinda like
parentheses before my ASCII-centric email client munged them.
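
Here is a rough sketch of that second step. It assumes the reading is
wrapped in the full-width brackets 【 and 】, which is a guess because the
original characters did not survive the trip through email. The helper name
split_japanese() is made up for illustration.

```php
<?php
// Hypothetical helper: split "headword 【reading】" into its two parts.
// Assumes the reading delimiters are 【 and 】; adjust to the real data.
function split_japanese($japanese)
{
    // /u makes PCRE treat both the pattern and the subject as UTF-8.
    if (preg_match('/^(.*?)\s*【(.*?)】\s*$/u', $japanese, $m)) {
        return array('word' => $m[1], 'reading' => $m[2]);
    }
    // No delimiters found: treat the whole string as the word.
    return array('word' => trim($japanese), 'reading' => null);
}

$a = split_japanese('気温 【きおん】');
$b = split_japanese('について');
```

Entries without a bracketed reading come back with 'reading' set to null,
so the caller can tell the two cases apart.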

You might even be able to combine it all into one big preg_match if
you worked at it.
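
One possible shape for that single preg_match, as a sketch only. It assumes
the same 【 】 reading delimiters as a guess, and that the definition starts
at the first ASCII '('. The function name parse_entry() and the array keys
are made up for illustration.

```php
<?php
// Hypothetical one-pass parser for a line such as
//   "気温 【きおん】 (n) atmospheric temperature; (P); EP"
// Assumes an optional 【reading】 and a definition starting at the first '('.
function parse_entry($line)
{
    // Group 1: headword (lazy), group 2: optional reading, group 3: definition.
    $pattern = '/^(.*?)\s*(?:【(.*?)】\s*)?(\(.*)$/u';
    if (!preg_match($pattern, $line, $m)) {
        return null; // no '(' found, or the input was not valid UTF-8
    }
    return array(
        'word'       => $m[1],
        // An unmatched optional group comes back as an empty string.
        'reading'    => $m[2] === '' ? null : $m[2],
        'definition' => $m[3],
    );
}

$first  = parse_entry('気温 【きおん】 (n) atmospheric temperature; (P); EP');
$second = parse_entry('について (exp) concerning; along; under; per; KD');
$none   = parse_entry('no parentheses here');
```

Whether one pattern is easier to maintain than two smaller steps is a
judgment call; the two-step version is simpler to debug.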

-- 
Like Music?
http://l-i-e.com/artists.htm
