On 05/15/2017 12:48 PM, Xavier Hernandez wrote:
>> [ snip ]
>>
>> Also, I have a question, What are the chances of uuid collision if we
>> take just 3 bits from the first byte ?
>>
>> 000 - Unspecified (can be anything).
>> 001 - Directory
>> 010 - Regular File
>> 011 - Special files (symlink, Block and Char devices, socket files etc).
>> {100 - 111} - Reserved.
>
> This cannot be done. Since we are currently using random UUIDs, on
> average, one of every eight randomly generated ids will start with each
> one of the combinations.
>
> Already existing GFIDs will be a problem when updating. The only thing
> that can avoid the problem is to create new GFIDs in a format that won't
> collide with existing ones, and this can only be done safely if we use
> the special fields of the UUID itself.
>
>>
>> As a side-effect, it reduces the number of directories created at as
>> the metadata, inside of .glusterfs directory. (Will be 50% of current
>> load).
>
> Maybe we can find a better way to store the GFIDs using the standard
> fields instead of relying on the first bits, which is not a valid solution.
>
> We can think more about this.

How about using a variation of Version 5 UUIDs? Or define our own
Version 6?

Strictly speaking, Version 5 hashes a NamespaceUUID + Name. That won't
work as we'd have too many collisions in the Name part.

Instead we could hash NamespaceUUID + Time + Name; or we could just use
Time, like a Version 1 UUID; or random bits, like a Version 4 UUID.

And store the bits described above in the clock-seq-low part of the GFID.

E.g.:

   74738ff5-5367-5958-91ee-98fffdcd1876
                 ^
                 5 indicates Version 5
                      ^
                      required for Type 5 first two bits set to 1 and 0
                       ^
                       0001 for directory

--
Kaleb

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel
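
To make the layout in the example concrete, here is a minimal sketch in Python. It is not GlusterFS code. It takes the "random bits, like a Version 4 UUID" option, stamps the version nibble as 5, sets the two variant bits to 1 and 0, and puts the type code in the low nibble of the byte that carries the variant bits, which is where the 0001 sits in the example string above. That placement is an assumption read off the example. The names make_typed_gfid and gfid_type and the TYPE_* constants are hypothetical; the type values come from the 3-bit table quoted in the thread.

```python
import uuid

# Hypothetical type codes, taken from the table quoted in the thread.
TYPE_UNSPECIFIED = 0b000
TYPE_DIRECTORY = 0b001
TYPE_REGULAR = 0b010
TYPE_SPECIAL = 0b011


def make_typed_gfid(file_type: int, version: int = 5) -> uuid.UUID:
    """Build a random UUID with a chosen version nibble and a type code.

    Assumption: the type code lives in the low nibble of byte 8, the byte
    whose top two bits are the variant, as in the example string.
    """
    raw = bytearray(uuid.uuid4().bytes)
    # Byte 6, high nibble: UUID version.
    raw[6] = ((version & 0x0F) << 4) | (raw[6] & 0x0F)
    # Byte 8: variant bits '10', two random bits kept, type in the low nibble.
    raw[8] = 0x80 | (raw[8] & 0x30) | (file_type & 0x0F)
    return uuid.UUID(bytes=bytes(raw))


def gfid_type(gfid: uuid.UUID) -> int:
    """Read the type code back out of byte 8 (same assumption as above)."""
    return gfid.bytes[8] & 0x0F


if __name__ == "__main__":
    example = uuid.UUID("74738ff5-5367-5958-91ee-98fffdcd1876")
    print(example.version, gfid_type(example))
    print(make_typed_gfid(TYPE_DIRECTORY))
```

If the existing GFIDs were generated as standard Version 4 UUIDs (an assumption here), they carry 4 in the version nibble, so checking the version before reading the type code would keep old and new GFIDs apart.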