I was just reading over a reply from David Fetter from a couple of days ago; the thread is archived[1] but this question doesn't really relate to it much. The a question about how to arrange tables and David make the following comments: On Tue, Feb 17, 2009 at 09:53:00AM -0800, David Fetter wrote: > On Tue, Feb 17, 2009 at 04:40:58PM +0000, Sam Mason wrote: > > > user_name varchar(50) NOT NULL, > > > > As a general design question; should user_name have a UNIQUE > > constraint on it? i.e. > > > > user_name VARCHAR(50) NOT NULL UNIQUE, > > Yes, it's good to have a UNIQUE constraint, but not this one. To have > a sane one, it needs further constraints, and in 8.4, case-insensitive > text (citext) type. Here's one that is reasonably sane until citext > is available. > > user_name TEXT, -- unless length is an integrity constraint, use TEXT instead of VARCHAR. > > then later: > > CREATE UNIQUE INDEX unique_user_name_your_table > ON your_table(LOWER(TRIM(user_name))) > > You might also require that whitespace be treated in some consistent > way, one example of which is simply forbidding whitespace in user_name > at all. This you can do via CHECK constraints or a DOMAIN. The reason behind this appears to be moving some of the checks into the database and away from the application. When I've solved similar problems before, I've tended to make the application more aware of what's going on by having something like: user_name VARCHAR(50) NOT NULL UNIQUE CHECK (user_name ~ '^[a-z][a-z0-9_]*$') I explicitly don't want enormous long usernames, 15 characters should be enough but lets give people something to play with if they want. It's partly in case I want to ever get the code interacting with something like C and want to reduce my exposure to buffer overflows as much as possible (yes, I'll still be writing code that should be safe from buffer overflows but I'm human and bugs occur) and there's also the fact that a 1MB username is going to probably be copied around the place pretty freely because the code isn't expecting it to be big. Then there's layout issues, displaying a bit of text that long is awkward. Next thing, just a plain UNIQUE constraint. It allows me to then use the table as the target of a REFERENCES constraint if I want. If/when the optimizer knows that UNIQUE constraints mean that only a single row is returned then it'll be able to optimize things better as well. Finally the CHECK constraint is filtering out "bad" usernames, I don't want people embedding HTML or whatever else in their username to break my systems so plain text only here and no spaces at the end. I don't think that either my nor David's is better in general, they apply to different situations. It's just interesting to see how different people solve problems so I was wondering if other people do things differently. -- Sam http://samason.me.uk/ [1] http://archives.postgresql.org/pgsql-general/2009-02/msg00770.php -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general