Search Postgresql Archives

Re: Sequence vs UUID

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Then it's not a Type 4 UUID, which is perfectly fine; just not random.

Also, should now() be replaced by clock_timestamp(), so that it can be called multiple times in the same transaction?

On 1/28/23 21:28, Miles Elam wrote:
On Sat, Jan 28, 2023 at 6:02 PM Ron <ronljohnsonjr@xxxxxxxxx> wrote:
Type 4 UUIDs are sub-optimal for big table because cache hit rates drop through the floor.

This lesson was burned into my psyche waaaay back in the Clinton administration.  It was my task to speed up a five hour batch job which read input records from a flat file, did some validations and then inserted them.  Simply sorting the input file on the primary key fields -- we used natural keys, not synthetics -- dropped the run time to two hours.  (VMS SORT saved the day, because you could tell it the sort order you wanted; thus, I could preserve the header record at the top of the file, and the trailer record at the end of the file without jumping through a bunch of hoops.)
This can be mitigated with judicious use of a sequence at the front of
the uuidv4.

     https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/

More effort than just calling the built-in gen_random_uuid() or
equivalent in app code, but a substantial performance gain for your
effort.

     https://github.com/tvondra/sequential-uuids

And in a managed environment where you can't install custom
extensions, a fairly simple function with divide on unix epoch seconds
combined with a call to overlay(...) should suffice performance-wise.
At 60 seconds, this will loop every 45 days or so, and you can choose
how much "sequentialness" works for you, from 1 to 4 bytes at the
expense of pseudo-randomness.

-----------------------------

-- Generate time interval UUID
CREATE OR REPLACE FUNCTION gen_interval_uuid(interval_seconds int4 =
60, block_num_bytes int2 = 2)
RETURNS uuid LANGUAGE sql VOLATILE PARALLEL SAFE AS $$
   SELECT encode(
     overlay(
       -- convert the uuid to byte array
       uuid_send(gen_random_uuid())
       -- use only the bottom bytes
       PLACING substring(
           int4send((extract(epoch FROM now()) / interval_seconds)::int4)
           FROM (5 - block_num_bytes)
       )
       -- place at the front two bytes of the uuid
       FROM 1
     )
     -- convert the resulting byte array to hex for conversion to uuid
     , 'hex')::uuid
   WHERE interval_seconds > 0 AND block_num_bytes BETWEEN 1 AND 4
$$;

-----------------------------

Technically affecting the v4 spec. You could always convert to a
UUIDv8, which is the intention behind that new version even though the
standard hasn't been ratified yet.


Cheers,

Miles Elam



--
Born in Arizona, moved to Babylonia.





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux