I ran a quick test on dbfiddle with uuid as the data type, and in each of the cases below UUID performance is drastically worse than sequence performance. Please let me know if anything is wrong with my testing.
1) Sequence generation vs UUID generation: execution time increased from ~291 ms to ~5655 ms.
2) Insert performance, sequence vs UUID: execution time increased from ~2031 ms to ~10600 ms.
3) Index lookup performance, sequence vs UUID: execution time increased from ~0.3 ms to ~0.5 ms.
PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
CREATE UNLOGGED TABLE test_bigint ( id bigint PRIMARY KEY);
CREATE UNLOGGED TABLE test_uuid ( id uuid PRIMARY KEY);
create sequence myseq cache 32767;
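For completeness, the varchar(36) storage discussed further down the thread could be added as a third case (hypothetical table name test_uuid_text; I have not run this variant here):

```sql
-- Hypothetical third table storing the UUID's 36-character text form,
-- as in the original test1_UUID definition quoted below. With a non-C
-- locale, comparisons on this column go through strcoll/ICU rather
-- than memcmp, which is the point Tom raises in the quoted mail.
CREATE UNLOGGED TABLE test_uuid_text ( id varchar(36) PRIMARY KEY);
```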
************ Sequence generation vs UUID generation test ************
explain analyze select count(nextval('myseq') ) from generate_series(1,100000);
QUERY PLAN
Aggregate (cost=1500.00..1500.01 rows=1 width=8) (actual time=291.030..291.030 rows=1 loops=1)
-> Function Scan on generate_series (cost=0.00..1000.00 rows=100000 width=0) (actual time=53.332..63.941 rows=100000 loops=1)
Planning Time: 0.155 ms
Execution Time: 291.719 ms
explain analyze select count(gen_random_uuid()) from generate_series(1,100000);
QUERY PLAN
Aggregate (cost=1500.00..1500.01 rows=1 width=8) (actual time=5654.453..5654.454 rows=1 loops=1)
-> Function Scan on generate_series (cost=0.00..1000.00 rows=100000 width=0) (actual time=84.328..514.214 rows=100000 loops=1)
Planning Time: 0.082 ms
Execution Time: 5655.158 ms
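Note these numbers come from a shared dbfiddle host, and EXPLAIN ANALYZE itself adds per-row instrumentation overhead; a rough cross-check (not run here) would be psql's \timing on the bare queries:

```sql
-- Rough cross-check without EXPLAIN ANALYZE's per-row timing overhead;
-- \timing reports wall-clock time for each statement in psql.
\timing on
SELECT count(nextval('myseq')) FROM generate_series(1,100000);
SELECT count(gen_random_uuid()) FROM generate_series(1,100000);
```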
************ Insert test ************
explain analyze INSERT INTO test_bigint select nextval('myseq') from generate_series(1,100000);
QUERY PLAN
Insert on test_bigint (cost=0.00..2250.00 rows=0 width=0) (actual time=2030.960..2030.961 rows=0 loops=1)
-> Function Scan on generate_series (cost=0.00..1250.00 rows=100000 width=8) (actual time=48.102..636.311 rows=100000 loops=1)
Planning Time: 0.065 ms
Execution Time: 2031.469 ms
explain analyze INSERT INTO test_uuid select gen_random_uuid() from generate_series(1,100000);
QUERY PLAN
Insert on test_uuid (cost=0.00..2250.00 rows=0 width=0) (actual time=10599.230..10599.230 rows=0 loops=1)
-> Function Scan on generate_series (cost=0.00..1250.00 rows=100000 width=16) (actual time=36.975..6289.811 rows=100000 loops=1)
Planning Time: 0.056 ms
Execution Time: 10599.805 ms
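Part of the insert-time gap is presumably gen_random_uuid() itself rather than the index. One way to separate the two effects (a sketch I have not run, using a hypothetical staging table) is to pre-generate the UUIDs and then time only the insert:

```sql
-- Pre-generate the UUIDs so the timed INSERT measures only
-- index/heap insertion, not UUID generation.
CREATE UNLOGGED TABLE staging AS
  SELECT gen_random_uuid() AS id FROM generate_series(1,100000);
EXPLAIN ANALYZE INSERT INTO test_uuid SELECT id FROM staging;
```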
************ Index performance test ************
Explain analyze select * from test_bigint where id in (select id from test_bigint limit 100);
QUERY PLAN
Nested Loop (cost=2.98..734.71 rows=100 width=8) (actual time=0.083..0.269 rows=100 loops=1)
-> HashAggregate (cost=2.69..3.69 rows=100 width=8) (actual time=0.046..0.062 rows=100 loops=1)
Group Key: test_bigint_1.id
Batches: 1 Memory Usage: 24kB
-> Limit (cost=0.00..1.44 rows=100 width=8) (actual time=0.011..0.025 rows=100 loops=1)
-> Seq Scan on test_bigint test_bigint_1 (cost=0.00..1444.18 rows=100118 width=8) (actual time=0.011..0.017 rows=100 loops=1)
-> Index Only Scan using test_bigint_pkey on test_bigint (cost=0.29..7.31 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=100)
Index Cond: (id = test_bigint_1.id)
Heap Fetches: 100
Planning Time: 0.279 ms
Execution Time: 0.302 ms
Explain analyze select * from test_uuid where id in (select id from test_uuid limit 100);
QUERY PLAN
Nested Loop (cost=3.21..783.31 rows=100 width=16) (actual time=0.080..0.474 rows=100 loops=1)
-> HashAggregate (cost=2.79..3.79 rows=100 width=16) (actual time=0.046..0.066 rows=100 loops=1)
Group Key: test_uuid_1.id
Batches: 1 Memory Usage: 24kB
-> Limit (cost=0.00..1.54 rows=100 width=16) (actual time=0.010..0.025 rows=100 loops=1)
-> Seq Scan on test_uuid test_uuid_1 (cost=0.00..1541.85 rows=100085 width=16) (actual time=0.009..0.016 rows=100 loops=1)
-> Index Only Scan using test_uuid_pkey on test_uuid (cost=0.42..7.79 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=100)
Index Cond: (id = test_uuid_1.id)
Heap Fetches: 100
Planning Time: 0.180 ms
Execution Time: 0.510 ms
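A possible follow-up I did not run: comparing the on-disk size of the two primary-key indexes after the inserts, since randomly ordered UUID keys tend to leave a less densely packed B-tree than monotonically increasing bigint keys:

```sql
-- Compare primary-key index sizes (index names assume the
-- default _pkey naming from the CREATE TABLE statements above).
SELECT pg_size_pretty(pg_relation_size('test_bigint_pkey')) AS bigint_pk_size,
       pg_size_pretty(pg_relation_size('test_uuid_pkey'))   AS uuid_pk_size;
```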
On Tue, 31 Jan 2023 at 03:28, Ron <ronljohnsonjr@xxxxxxxxx> wrote:
And populate that column with UUIDs generated by the gen_random_uuid() function.
(Requires v13.)
On 1/30/23 13:46, Adrian Klaver wrote:
> On 1/30/23 11:43, veem v wrote:
>> Thank You So much for the details. I am a bit new to postgres. And these
>> test results I picked were from a dev system. If I understand it
>> correctly, do you mean these settings(usage of C locale or "native"
>> 16-byte uuid) which you mentioned should be there in a production system
>> and thus we should test the performance of the UUID vs sequence on a
>> similar setup? Or say if this sort of degradation of UUID performance is
>> not expected then , how to get these settings tweaked ON, so as to see
>> the best string type or UUID performance, can you please guide me here?
>
> No what is being said is change:
>
> source_id varchar(36)
>
> to
>
> source_id uuid
>
> as in:
>
> https://www.postgresql.org/docs/current/datatype-uuid.html
>
>>
>> On Mon, 30 Jan 2023 at 22:18, Tom Lane <tgl@xxxxxxxxxxxxx
>> <mailto:tgl@xxxxxxxxxxxxx>> wrote:
>>
>> Dominique Devienne <ddevienne@xxxxxxxxx
>> <mailto:ddevienne@xxxxxxxxx>> writes:
>> > On Mon, Jan 30, 2023 at 5:11 PM veem v <veema0000@xxxxxxxxx
>> <mailto:veema0000@xxxxxxxxx>> wrote:
>> >> CREATE TABLE test1_UUID ( id bigint,source_id varchar(36)
>> PRIMARY KEY, Name varchar(20) );
>>
>> > Maybe if you used a "native" 16-byte uuid, instead of its textual
>> > serialization with dashes (36 bytes + length overhead), the gap would
>> > narrow.
>>
>> Yeah, especially if your database is not using C locale. The
>> strcoll or ICU-based comparisons done on string types can be
>> enormously more expensive than the memcmp() used for binary
>> types like native uuid.
>>
>> regards, tom lane
>>
>
--
Born in Arizona, moved to Babylonia.