Re: Trigger Performance

Alban Hertroys <dalroi@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> · Sun, 16 Jan 2011 10:43:20 +0100

On 16 Jan 2011, at 5:38, Randall Smith wrote:

>> But you already do have an index on that id-field, so what's the problem with using a unique constraint? Its unique index could just replace the existing one.
> 
> I'm OK with indexing the 8 byte integer, but I don't want to index the
> text field that could possibly be up to 1k.  In some simple tests,
> indexing both fields, the index was larger than the table.

Ah, it was about a different field - that makes much more sense :)

You probably don't need the full width of that field to be fairly certain that it's going to be unique from that point on anyway. The chance that a unique constraint kicks in when it shouldn't gets lower the wider your index on that field gets, but it's probably sufficiently low with relatively few characters, while it's probably not that big a deal if it kicks in early in a few cases.

For example, if you only index the first 100 bytes of each of these fields, you are very unlikely to run into a situation where your constraint claims the field is not unique while it actually is.

That shrinks your index down by a factor of 10, and IMO 100 bytes is still on the wide side for a unique text column. I don't think many people successfully enter 100 characters in a row without ever making an error.

If instead the field contains computer-generated data, then there surely is an easier way to detect that this text is not going to be unique. You could store the values of the parameters used to generate that data for example, and put a unique constraint on those.

I'd also wonder what the value is of two distinct records with the same 1000 characters of text up to the last one. Are those distinct on purpose or is it more likely that someone made an error entering that field and accidentally managed to create a unique entry while in fact the unique constraint was supposed to kick in?

Besides that, you say there will be billions of records, but that text field needs to be unique? I really do hope the contents of that field will be computer-generated, or your users will get quite desperate coming up with new values to use after a while, especially if the contents need to make any sense.

But of course much depends on the nature of the data in your field. You haven't said anything about that.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

!DSPAM:737,4d32bdf511764853411139!

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general