Search Postgresql Archives

Re: How to generate unique hash-type id?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29/01/2010 4:20 PM, Joe Kramer wrote:
Hello,

I need to generate unique id which is not guessable unlike
serial(integer) type. I need an id in format like md5 hash of random
number.
On top of that I need this id to be unique across multiple tables.

Anyone had to solve this problem before? Can you post any recipes or
best practices please?

My questions:

1. Avoiding collisions.
If I make an UNIQUE constraint and do generation of id triggered on
INSERT. What if collision happens? DO I nee d to check if unique hash
already exists and if not- regenerate.
This looks too primitive. Is there a readily available function or
methodology to do that?

2. Generating global unique id across multiple tables.
How to do that? My only idea is to have separate table to keep all
hashes and compare for collision against that table.
Is there a better way? Maybe by creating some special serial type that
is not integer but varchar?

3. what function to use to generate 64-bit random hash without much
overhead to CPU?

When I ran into something somewhat akin to this I asked the list about a non-repeating pseudo-random mapping function. Daniel Verite (on this list) enlightened me about Feistel networks/cyphers, and even posted a PL/PgSQL implementation!

It's documented here:

http://wiki.postgresql.org/wiki/Pseudo_encrypt

and has been extremely handy. It should fit you needs - just define a sequence that you pull new input values from, and use that same sequence across all tables that need unique values.

It's not trivial to assert, across multiple tables, that a value is unique. You could do it with a trigger that checks each table that uses such values (slow-ish but effective) or maintains a side table of values in use. In either case it'd have to be added to every table that used the pseudo_encrypt values.

By the way, if you intend to expose these to users you might also want some kind of data entry error checking so that a typo can't accidentally transform id `n' to id `m' with transposition of a single digit or the like. The luhn algorithm provides a good way to do that with a simple check digit - it's not cryptographically strong in that if you know the values are checked with the luhn algorithm it's trivial to re-generate the check digit, but it helps a log against casual scanning of the number space and against accidental user data entry error. Check out:

http://wiki.postgresql.org/wiki/Luhn_algorithm

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux