Matthew Wilson wrote: > I have a table MESSAGE with a text column and a timestamp column. I > want to make sure that I have no rows with duplicates for both values. > I have two ideas right now for how to do this. > > IDEA 1: > > CREATE UNIQUE INDEX nodups on MESSAGE (my_timestamp_col, my_text_col); > > IDEA 2: > > CREATE UNIQUE INDEX nodups on MESSAGE (my_timestamp_col, md5(my_text_col)); > > I am speculating that an index on the md5 is cheaper than on a text > column. I'm willing to risk the chance of a hash collision. > > I don't want to use this index to allow searching inside the text > column. I just want to protect against duplication. > > Are either of these approaches any good? What are other ways to > guarantee uniqueness for the pair of my timestamp column and my text > column? Idea 1 will only work if your text columns never grow too long - I can't recall what the limit is off the top of my head but PG will issue warnings when you reach it. Idea 2 works. You can even use it for exact match searching if you like if you do WHERE my_timestamp_col='...' AND md5(my_text_col)=md5('...') Depending on your app, you might want to round your timestamps to avoid getting two identical messages in the system only a millisecond apart. -- Stuart Bishop <stuart@xxxxxxxxxxxxxxxx> http://www.stuartbishop.net/
Attachment:
signature.asc
Description: OpenPGP digital signature