On Thu, Sep 10, 2015 at 12:56 PM, Peter Geoghegan <peter.geoghegan86@xxxxxxxxx> wrote:
On Thu, Sep 10, 2015 at 12:51 PM, Ken Tanzer <ken.tanzer@xxxxxxxxx> wrote:
> OK, can one of you help me out in understanding this? I would have thought that given "CLARK," and "CLARKE" that the comma would get compared against the E and come first. End of story, before we even get to anything farther in the string. What am I missing?
That's only how it works with the C locale. Otherwise, there are
complicated rules to weigh things like space and punctuation (and
accents/diacritics) less prominently than primary alphabetical
ordering. This is often useful. Anyway, based on what you say here, I
think you should actually "ORDER BY name_last, name_first".
--
Regards,
Peter Geoghegan
What I mailed out what a boiled down example. In reality, what I have is a ton of tables with a client_id in them, and a convenience function client_name(client_id) that returns the name_last, name_first string (plus an alias if it exists). client_name is used all over the place in both views and in an app that uses the database. There is a similar, also heavily used, staff_name function. Eliminating the use of these functions is a non-starter for me--I'd much rather live with the existing sort behavior, especially as no one has ever pointed this out despite over a decade of use.
I'm hoping to sort change this behavior with as minimal a change as possible (e.g., minimal potential for unexpected side effects or breakage). I was hoping to just add a COLLATE "C" within the function:
CREATE OR REPLACE FUNCTION client_name( cid int4 ) RETURNS text AS $$
-- client is a view that draws from tbl_client. name_full is the field with the name_last, name_first data in it
SELECT name_full COLLATE "C" FROM client WHERE client_id=$1;
$$
but that seems to have no effect. And sure enough the documentation seems to back that up. ("The collation assigned to a function or operator's combined input expressions is also considered to apply to the function or operator's result, if the function or operator delivers a result of a collatable data type.") So this may be wishful thinking, but is there any other way to specify the collation of a function result? Specifying the collation every time the function is used is likely a no-go for me too.
Alternatively, it seems I could create new databases with a C collation and then move my data into them. This seems a bit drastic, although possible. I'd again be worried about the breakage/side effects. And actually, will this work? (i.e., can you use pg_dump to populate a new database with a different locale?)
Are there any other potential solutions, pitfalls or considerations that come to mind? Any thoughts welcome. And as I said, if there's not a good way to do this I'll probably leave it alone. Thanks.
Ken
AGENCY Software
A Free Software data system
By and for non-profits
(253) 245-3801
learn more about AGENCY or
follow the discussion.