On 2/4/20 11:36 UTC, Rob Wilton (rwilton) wrote:
> What you describe does sound to me like it could be a new form of
> UUID (if limited to a 128 bit format), and it could potentially also
> be useful. E.g. a 128 bit UUID that has good database locality
> properties and minimizes the leakage of private information sounds
> useful if it can be reasonably specified and implemented.
>
> I also note that RFC 4122 is 15 years old, and as Martin previously
> indicated there are security and privacy considerations that have
> evolved over time, hence updating RFC 4122 to make readers aware of
> those considerations also seems like it could potentially be useful.
>
> Writing this up as a draft sounds like a good next step to see if
> there is enough wider interest.

FYI, Brad has submitted his first draft for review. You can see it
here:

https://datatracker.ietf.org/doc/draft-peabody-dispatch-new-uuid-format/

I've been following this for a while, and as the author of a popular
userland UUID library for PHP <https://github.com/ramsey/uuid>, I'd
like to throw my support behind this proposal and describe a few of
the pain points that have led application developers down the path of
modifying the existing UUID structure to better suit their needs.

As a standard, the UUID format is ubiquitous and portable. Despite
some of its shortcomings, and the desire (as some have raised on this
list) to create a new standard other than UUID, it's a desirable
format for many reasons. There is one primary shortcoming that results
in a frequent need to modify the format, and this is the shortcoming
that Brad's version 6 UUID attempts to overcome.

When developers begin storing UUIDs in relational databases, they
inevitably arrive at one or all of these articles (which I'm surprised
haven't yet been mentioned in this thread):

* http://www.informit.com/articles/printerfriendly/25862
* https://blog.codinghorror.com/primary-keys-ids-versus-guids/
* https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/

As a result, in my PHP library, I have implemented alternate _codecs_
to encode/decode UUIDs in more optimal ways for database fields,
especially for use as primary keys. Two of these codecs are:

* Timestamp-first COMB
* Ordered Time UUID

The timestamp-first COMB is a version 4 UUID combined with a Unix
timestamp as the first 48 bits, resulting in a monotonically
increasing UUID. For all intents and purposes, the resulting value
always looks like a version 4 UUID (the version and variant bits
remain in the same places as defined by RFC 4122).

The ordered time UUID is similar but retains the semantics of the
version 1 UUID. That is, the UUID can be deconstructed to produce a
node value, clock sequence value, and timestamp with 100-nanosecond
resolution. The difference is that the timestamp is rearranged so that
the UUID is monotonically increasing. The problem with this approach,
though, is that the first 2 bytes are the same as the
time_hi_and_version field, which means the UUID version now occupies
the first 4 bits of the UUID. Unless you know how the bits of this
UUID were rearranged, there's no way to reliably tell that it was
originally a version 1 UUID.

Therein lies the problem. The use case is a version 1 UUID from which
an application can retrieve the 100-nanosecond timestamp and node
values, and which is also monotonically increasing so that it does not
scatter the records in my database engine. But, by rearranging the
bits to achieve this, I'm placing a dependency on my application to
know how to deconstruct the bits when retrieving from the database.
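
To make the two layouts concrete, here is a minimal sketch in Python
using only the standard library. It is an illustration, not the code
from the PHP library: the function names are hypothetical, and the
COMB timestamp is assumed to be in milliseconds (the description above
only says "Unix timestamp").

```python
import os
import time
import uuid


def timestamp_first_comb(unix_ms=None):
    """Build a version 4 style UUID whose first 48 bits are a timestamp.

    Assumption: the timestamp unit is milliseconds since the Unix epoch.
    """
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    ts = (unix_ms & ((1 << 48) - 1)).to_bytes(6, "big")
    raw = bytearray(ts + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x40  # version 4 nibble in its usual place
    raw[8] = (raw[8] & 0x3F) | 0x80  # RFC 4122 variant bits in their usual place
    return uuid.UUID(bytes=bytes(raw))


def to_ordered_time(u):
    """Rearrange a version 1 UUID for storage.

    Stored order: time_hi_and_version, time_mid, time_low, then the
    clock sequence and node bytes unchanged.
    """
    b = u.bytes
    return b[6:8] + b[4:6] + b[0:4] + b[8:16]


def from_ordered_time(stored):
    """Undo to_ordered_time; the caller has to know the stored layout."""
    return uuid.UUID(bytes=stored[4:8] + stored[2:4] + stored[0:2] + stored[8:16])


if __name__ == "__main__":
    original = uuid.uuid1()
    stored = to_ordered_time(original)
    print(original, stored.hex(), from_ordered_time(stored) == original)
    print(timestamp_first_comb())
```

In the stored form the first hex digit is the version nibble (1), and
nothing in the bytes themselves says they were rearranged, so the
application has to carry that knowledge of the layout.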
This approach is not very portable, it is error-prone, and it can lead
to developer confusion. Brad's version 6 UUID solves this problem.

There are two primary issues I have with the current draft (I have
many other comments, but I want to start with these two, and I'm also
unsure how IETF discussion on drafts proceeds, so I'm eager to learn
from others):

1. The draft doesn't appear to go into detail about the arrangement of
   the bits and how the timestamp should be split to accommodate the
   version field, while the earlier version (posted here:
   <http://gh.peabody.io/uuidv6/>) does go into this detail.

2. IMO, the alternate text formats do not belong in this document. I
   think this document should focus on the version 6 UUID, and the
   alternate text formats can be defined in a separate document. The
   ULID spec seems like a good specification to draw inspiration from,
   since it's compatible with any 128-bit number and already has a
   number of implementations. <https://github.com/ulid/spec>

Cheers,
Ben

P.S. Yes, I am aware of privacy concerns with the use of the node
field in version 1 UUIDs. I'm happy to discuss potential use cases of
the node field that can be used to track where a UUID was minted
without revealing potentially private information, but I don't think
the mechanism for creating the node field should be part of this
draft.