Oh, and I forgot about this one ...

jorge at seisbits dot com
wrote on 11-Jul-2008 09:04
If you try to run strtr on unusual characters in a UTF-8 environment, you
can do this:

function normaliza ($string){
    $string = utf8_decode($string);
    // from and to must have the same length, otherwise strtr() ignores
    // the extra characters of the longer string
    $string = strtr($string, utf8_decode(" ÂÊÎÔÛÀ"), "-AEIOUA");
    $string = strtolower($string);
    return $string;
}

On Tue, Jul 15, 2008 at 11:38 AM, Yeti <yeti@xxxxxxxxxx> wrote:

> I don't think using all these regular expressions is a very efficient way
> to do so. As I previously pointed out, there are many users who had a
> similar problem, which can be viewed at:
>
> http://it.php.net/manual/en/function.strtr.php
>
> One of my favourites is what derernst at gmx dot ch used.
>
> derernst at gmx dot ch
> wrote on 20-Sep-2005 07:29
> This works for me to remove accents for some characters of Latin-1, Latin-2
> and Turkish in a UTF-8 environment, where the htmlentities-based solutions
> fail:
>
> <?php
> function remove_accents($string, $german=false) {
>     // Single letters
>     $single_fr = explode(" ",
>         "À Á Â Ã Ä Å Ą Ă Ç Ć Č Ď Đ Ð È É Ê Ë Ę Ě Ğ Ì Í Î Ï İ Ł Ľ Ĺ Ñ Ń Ň "
>       . "Ò Ó Ô Õ Ö Ø Ő Ŕ Ř Š Ś Ş Ť Ţ Ù Ú Û Ü Ů Ű Ý Ž Ź Ż "
>       . "à á â ã ä å ą ă ç ć č ď đ è é ê ë ę ě ğ ì í î ï ı ł ľ ĺ ñ ń ň "
>       . "ð ò ó ô õ ö ø ő ŕ ř ś š ş ť ţ ù ú û ü ů ű ý ÿ ž ź ż");
>     $single_to = explode(" ",
>         "A A A A A A A A C C C D D D E E E E E E G I I I I I L L L N N N "
>       . "O O O O O O O R R S S S T T U U U U U U Y Z Z Z "
>       . "a a a a a a a a c c c d d e e e e e e g i i i i i l l l n n n "
>       . "o o o o o o o o r r s s s t t u u u u u u y y z z z");
>     $single = array();
>     for ($i=0; $i<count($single_fr); $i++) {
>         $single[$single_fr[$i]] = $single_to[$i];
>     }
>     // Ligatures
>     $ligatures = array("Æ"=>"Ae", "æ"=>"ae", "Œ"=>"Oe", "œ"=>"oe",
>         "ß"=>"ss");
>     // German umlauts
>     $umlauts = array("Ä"=>"Ae", "ä"=>"ae", "Ö"=>"Oe", "ö"=>"oe", "Ü"=>"Ue",
>         "ü"=>"ue");
>     // Replace
>     $replacements = array_merge($single, $ligatures);
>     if ($german)
>         $replacements = array_merge($replacements, $umlauts);
>     $string = strtr($string, $replacements);
>     return $string;
> }
> ?>
>
> I would change this function a bit ...
>
> <?php
> //echo rawurlencode("áàéèíìóòúùÁÀÉÈÍÌÓÒÚÙ"); // RFC 1738 codes; NOTE: One
> // might use UTF-8 as this document's encoding
> function remove_accents($string) {
>     $string = rawurlencode($string);
>     $replacements = array(
>         '%C3%A1' => 'a',
>         '%C3%A0' => 'a',
>         '%C3%A9' => 'e',
>         '%C3%A8' => 'e',
>         '%C3%AD' => 'i',
>         '%C3%AC' => 'i',
>         '%C3%B3' => 'o',
>         '%C3%B2' => 'o',
>         '%C3%BA' => 'u',
>         '%C3%B9' => 'u',
>         '%C3%81' => 'A',
>         '%C3%80' => 'A',
>         '%C3%89' => 'E',
>         '%C3%88' => 'E',
>         '%C3%8D' => 'I',
>         '%C3%8C' => 'I',
>         '%C3%93' => 'O',
>         '%C3%92' => 'O',
>         '%C3%9A' => 'U',
>         '%C3%99' => 'U'
>     );
>     // decode again so that untouched characters (spaces, punctuation)
>     // do not stay percent-encoded in the result
>     return rawurldecode(strtr($string, $replacements));
> }
> //echo remove_accents("CÀfé"); // I know it's not spelled right
> echo remove_accents("áàéèíìóòúùÁÀÉÈÍÌÓÒÚÙ");
> //OUTPUT (again: I used UTF-8 for the document): aaeeiioouuAAEEIIOOUU
> ?>
>
> Ciao
>
> Yeti
>
> On Mon, Jul 14, 2008 at 8:20 PM, Andrew Ballard <aballard@xxxxxxxxx> wrote:
>
>> On Mon, Jul 14, 2008 at 1:35 PM, Giulio Mastrosanti
>> <giulio@xxxxxxxxxxxxx> wrote:
>> >
>> > Brilliant !!!
>> >
>> > so you replace every occurrence of every accent variation with all the
>> > accent variations...
>> >
>> > OK, that's it!
>> >
>> > only some more doubts ( regex are still a headache for me... )
>> >
>> > preg_replace('/[iìíîïĩīĭįı]/iu',... -- what's the meaning of iu after
>> > the match string?
>>
>> This page explains them both.
>> http://us.php.net/manual/en/reference.pcre.pattern.modifiers.php
>>
>> > preg_replace('/[aàáâãäåǻāăą](?!e)/iu',... what's (?!e) for? -- every
>> > occurrence of aàáâãäåǻāăą NOT followed by e?
>>
>> Yes. It matches any character based on the latin 'a' that is not
>> followed by an 'e'.
>> It keeps the pattern from matching the 'a' when it
>> immediately precedes an 'e' for the character 'ae' for words like
>> these:
>> http://en.wikipedia.org/wiki/List_of_words_that_may_be_spelled_with_a_ligature
>> (However, that may cause problems with words that have other variants
>> of 'ae' in them. I'll leave that to you to resolve.)
>>
>> http://us.php.net/manual/en/regexp.reference.php
>>
>> > Many thanks again for your effort,
>> >
>> > I'm definitely on the right track
>> >
>> > Giulio
>> >
>> >> I was intrigued by your example, so I played around with it some more
>> >> this morning. My own quick web search yielded a lot of results for
>> >> highlighting search terms, but none that I found did what you're
>> >> after. (I admit I didn't look very deep.) I was up to something like
>> >> this before your reply came in. It's still by no means complete. It
>> >> even handles simple English plurals (words ending in 's' or 'es'), but
>> >> not variations that require changing the word base (like 'daisy' to
>> >> 'daisies').
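To make the plural remark concrete: the suffix handling comes down to an optional `(e?s)?` group placed before the closing word boundary. The following is only a minimal sketch; `plural_pattern()` is a hypothetical helper made up for illustration and is not part of the function posted in this thread.

```php
<?php
// Hypothetical helper (not from the thread): builds a pattern for one
// search word that also accepts a trailing "s" or "es".
function plural_pattern(string $word): string
{
    // \b      word boundary
    // (e?s)?  optional plural suffix: "s" or "es"
    // i, u    case-insensitive, UTF-8 aware
    return '/\b' . preg_quote($word, '/') . '(e?s)?\b/iu';
}

var_dump(preg_match(plural_pattern('box'), 'two boxes'));     // int(1)
var_dump(preg_match(plural_pattern('cat'), 'Cats and dogs')); // int(1)
var_dump(preg_match(plural_pattern('cat'), 'category'));      // int(0)
var_dump(preg_match(plural_pattern('daisy'), 'daisies'));     // int(0)
```

The last call shows the limitation mentioned above: 'daisies' does not contain the literal base 'daisy', so a suffix-only rule cannot find it.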
>> >>
>> >> <?php
>> >> function highlight_search_terms($phrase, $string) {
>> >>     // anything that is not a Unicode letter separates words
>> >>     $non_letter_chars = '/[^\pL]/iu';
>> >>     $words = preg_split($non_letter_chars, $phrase);
>> >>
>> >>     // keep only words longer than two bytes
>> >>     $search_words = array();
>> >>     foreach ($words as $word) {
>> >>         if (strlen($word) > 2 && !preg_match($non_letter_chars, $word)) {
>> >>             $search_words[] = $word;
>> >>         }
>> >>     }
>> >>
>> >>     $search_words = array_unique($search_words);
>> >>
>> >>     foreach ($search_words as $word) {
>> >>         $search = preg_quote($word, '/');
>> >>
>> >>         /* repeat for each possible accented character */
>> >>         $search = preg_replace('/(ae|æ|ǽ)/iu', '(ae|æ|ǽ)', $search);
>> >>         $search = preg_replace('/(oe|œ)/iu', '(oe|œ)', $search);
>> >>         $search = preg_replace('/[aàáâãäåǻāăą](?!e)/iu',
>> >>             '[aàáâãäåǻāăą]', $search);
>> >>         $search = preg_replace('/[cçćĉċč]/iu', '[cçćĉċč]', $search);
>> >>         $search = preg_replace('/[dďđ]/iu', '[dďđ]', $search);
>> >>         $search = preg_replace('/(?<![ao])[eèéêëēĕėęě]/iu',
>> >>             '[eèéêëēĕėęě]', $search);
>> >>         $search = preg_replace('/[gĝğġģ]/iu', '[gĝğġģ]', $search);
>> >>         $search = preg_replace('/[hĥħ]/iu', '[hĥħ]', $search);
>> >>         $search = preg_replace('/[iìíîïĩīĭįı]/iu', '[iìíîïĩīĭįı]',
>> >>             $search);
>> >>         $search = preg_replace('/[jĵ]/iu', '[jĵ]', $search);
>> >>         $search = preg_replace('/[kķĸ]/iu', '[kķĸ]', $search);
>> >>         $search = preg_replace('/[lĺļľŀł]/iu', '[lĺļľŀł]', $search);
>> >>         $search = preg_replace('/[nñńņňŉŋ]/iu', '[nñńņňŉŋ]', $search);
>> >>         $search = preg_replace('/[oòóôõöōŏőǿơ](?!e)/iu',
>> >>             '[oòóôõöōŏőǿơ]', $search);
>> >>         $search = preg_replace('/[rŕŗř]/iu', '[rŕŗř]', $search);
>> >>         $search = preg_replace('/[sśŝşš]/iu', '[sśŝşš]', $search);
>> >>         $search = preg_replace('/[tţťŧ]/iu', '[tţťŧ]', $search);
>> >>         $search = preg_replace('/[uùúûüũūŭůűųǔǖǘǚǜ]/iu',
>> >>             '[uùúûüũūŭůűųǔǖǘǚǜ]', $search);
>> >>         $search = preg_replace('/[wŵ]/iu', '[wŵ]', $search);
>> >>         $search = preg_replace('/[yýÿŷ]/iu', '[yýÿŷ]', $search);
>> >>         $search = preg_replace('/[zźżž]/iu', '[zźżž]', $search);
>> >>
>> >>         // wrap every match, including a trailing 's' or 'es'
>> >>         $string = preg_replace('/\b' . $search . '(e?s)?\b/iu',
>> >>             '<span class="keysearch">$0</span>', $string);
>> >>     }
>> >>
>> >>     return $string;
>> >> }
>> >> ?>
>> >>
>> >> I still can't help feeling there must be some better way, though.
>> >>
>> >>> well, I think I'm on the right track now, unfortunately I have some
>> >>> other urgent work and can't try it immediately, but I'll let you
>> >>> know :)
>> >>>
>> >>> thank you!
>> >>>
>> >>> Giulio
>> >>
>> >> Andrew
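For anyone reading this later with the same questions about `iu` and `(?!e)`, here is a small sketch of what each piece does in isolation. The sample strings and variable names are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// u: pattern and subject are treated as UTF-8, so "é" counts as one character.
$one_char_bytes = preg_match('/^.$/', 'é');   // 0: "é" is two bytes without u
$one_char_utf8  = preg_match('/^.$/u', 'é');  // 1

// i: case-insensitive; together with u this also covers accented letters.
$caseless = preg_match('/[eèé]/iu', 'É');     // 1

// (?!e): negative lookahead, the "a" only matches when no "e" follows it.
$lookahead = preg_replace('/a(?!e)/iu', '[a]', 'aether banana');
// "aether b[a]n[a]n[a]"

// (?<![ao]): negative lookbehind, the "e" only matches when it is not
// preceded by "a" or "o".
$lookbehind = preg_replace('/(?<![ao])e/iu', '[e]', 'aether');
// "aeth[e]r"

echo $one_char_bytes, $one_char_utf8, $caseless, "\n";
echo $lookahead, "\n", $lookbehind, "\n";
```

The two assertions work as a pair in the highlight function: the lookahead keeps the 'a' class away from 'ae', and the lookbehind keeps the 'e' class away from the same 'ae', so the ligature group inserted first is left untouched by the later replacements.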