Standard uuid vs. custom data type uuid_v1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I have finally found some time to implement a custom data type optimized
for version 1 UUID's (timestamp, clock sequence, node):
https://github.com/ancoron/pg-uuid-v1

Some tests (using a few millions of rows) have shown the following
results (when used as a primary key):

COPY ... FROM: ~7.8x faster (from file - SSD)
COPY ... TO  : ~1.5x faster (no where clause, sequential output)

The best thing is that for INSERT's there is a very high chance of
hitting the B-Tree "fastpath" because of the timestamp being the most
significant part of the data type, which tends to be increasing.

This also results in much lower "bloat", where the standard "uuid" type
easily goes beyond 30%, the "uuid_v1" should be between 10 and 20%.

Additionally, it also reveals the massive performance degrade I saw in
my tests for standard UUID's:

Initial 200 million rows: ~ 80k rows / second
Additional 17 million rows: ~26k rows / second

...and the new data type:
Initial 200 million rows: ~ 623k rows / second
Additional 17 million rows: ~618k rows / second


The data type also has functions to extract the three parts and has an
additional set of operators to compare it to timestamps for time-series
queries.

Other test results which are of interest:

ANALYZE: essentially equal with uuid_v1 to be just slightly faster
VACUUM: ~4-5x faster
REINDEX: only slightly faster (~10-20%)


I think there's also something from it for a faster standard UUID
implementation as a char array is not very compute friendly and
conversion from or to strings (in/out) can be optimized.


Cheers,

	Ancoron

Ref:
Latest test results:
https://gist.github.com/ancoron/d5114b0907e8974b6808077e02f8d109





[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux