On Jan 24, 2008 11:02 PM, brian <brian@xxxxxxxxxxxxxxxx> wrote:
> The client for a web application I'm working on wants certain URLs to
> contain the full names of members ("SEO-friendly" links). Scripts would
> search on, say, a member directory entry based on the name of the
> member, rather than the row ID. I can easily join first & last names
> with an underscore (and split on that later) and replace spaces with +,
> etc. But many of the names contain multibyte characters and so the URLs
> would become URL-encoded, eg:
>
> Adelina España -> Adelina_Espa%C3%B1a
>
> The client won't like this (and neither will I).
>
> I can create a conversion array to replace certain characters with
> 'normal' ones:
>
> Adelina_Espana
>
> However, I then run into the problem of trying to match 'Espana' to
> 'España'. Searching online, I found a few ideas (soundex, intuitive
> fuzzy something-or-other) but mostly they seem like overkill for this
> application.
>
> The best I can come up with is to add a 'link_name' column to the table
> that holds the 'normalised' version of the name ('Adelina_Espana', or
> even 'adelina_espana'). The duplication bugs me a little but the table
> currently stands at a whopping ~3500 names, so I'm not too concerned.
>
> My question is: well, does this look like the way to go, considering
> it's just a web app (and isn't likely to ever top 10000 names)? Or is
> there something clever (yet not overkill) that I'm missing?
>
> If I do go this route, I'd add an insert/update trigger to call a
> function (PL/Perl, I'm looking at you) that handles the conversion to
> link_name.

You could create an immutable function to convert characters from
accented to normalized, then index on that function.

    select normalized_name(firstname||'_'||lastname)
    from sometable
    where normalized_name(firstname||'_'||lastname) = 'adelina_espana'

kind of thing.
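To make the conversion step concrete, here is a rough sketch of what a normalized_name() could do, written in Python only for illustration (in the database it would be an immutable function in whatever procedural language you prefer). It assumes that stripping Unicode combining marks and lowercasing is good enough for these names; the two-argument signature is hypothetical, not something from the thread.

```python
import unicodedata


def normalized_name(first: str, last: str) -> str:
    """Join first and last name with '_' and fold accents and case.

    Assumption: NFKD decomposition followed by dropping combining marks
    is an acceptable definition of 'normalized' for this data set.
    """
    joined = f"{first}_{last}"
    # NFKD splits 'ñ' into 'n' plus a combining tilde (U+0303).
    decomposed = unicodedata.normalize("NFKD", joined)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


if __name__ == "__main__":
    print(normalized_name("Adelina", "España"))
```

The result depends only on the input, which is the property an immutable function needs before it can back an index. Letters that have no decomposition (ø or ß, for example) pass through unchanged, so those would still need an explicit mapping table.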